* [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
From: Eric Auger @ 2021-10-27 10:44 UTC
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

This series brings the IOMMU part of HW nested paging support
in the SMMUv3.

The SMMUv3 driver is adapted to support 2 nested stages.

The IOMMU API is extended to convey the guest stage 1
configuration and the hook is implemented in the SMMUv3 driver.

This allows the guest to own the stage 1 tables and context
descriptors (so-called PASID table) while the host owns the
stage 2 tables and main configuration structures (STE).
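
The ownership split described above can be pictured with a toy model.
This is only a sketch under simplifying assumptions: the single-level
tables, the toy_* names and the TOY_* constants are all hypothetical and
have nothing to do with the real SMMUv3 table formats. The guest-owned
stage 1 table maps an IOVA to an intermediate physical address (IPA),
and the host-owned stage 2 table maps that IPA to a physical address.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  ((1ULL << TOY_PAGE_SHIFT) - 1)
#define TOY_NR_PAGES   4
#define TOY_INVALID    UINT64_MAX

/* Toy single-level table: index = input page, value = output page. */
struct toy_table {
	uint64_t out_page[TOY_NR_PAGES];
};

/* Translate one address through one table, keeping the page offset. */
static uint64_t toy_walk(const struct toy_table *t, uint64_t addr)
{
	uint64_t page = addr >> TOY_PAGE_SHIFT;

	if (page >= TOY_NR_PAGES || t->out_page[page] == TOY_INVALID)
		return TOY_INVALID;

	return (t->out_page[page] << TOY_PAGE_SHIFT) | (addr & TOY_PAGE_MASK);
}

/*
 * Nested translation: stage 1 (guest-owned) turns the IOVA into an IPA,
 * stage 2 (host-owned) turns the IPA into a PA. A miss at either stage
 * makes the whole translation fail.
 */
static uint64_t toy_nested_translate(const struct toy_table *s1,
				     const struct toy_table *s2,
				     uint64_t iova)
{
	uint64_t ipa;

	assert(s1 && s2);

	ipa = toy_walk(s1, iova);
	if (ipa == TOY_INVALID)
		return TOY_INVALID;

	return toy_walk(s2, ipa);
}
```

With a stage 1 table mapping page 0 to page 2 and a stage 2 table
mapping page 2 to page 1, toy_nested_translate() turns IOVA 0x0abc into
PA 0x1abc. The guest can change the first mapping freely, but it can
never reach a page the host has not mapped at stage 2.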

This work is provided mainly for testing purposes, as the upper
layer integration is being reworked and is bound to be based on
/dev/iommu instead of VFIO tunneling. This version also drops
the MSI BINDING ioctl, assuming the guest enforces a
flat mapping of the host IOVAs used to bind physical MSI doorbells.
In the current QEMU integration this is achieved by exposing
RMRs to the guest, using Shameer's series [1]. This approach
is an RFC, as the IORT spec is not really meant to do that
(single mapping flag limitation).

Best Regards

Eric

This series (Host) can be found at:
https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
This includes a rebased VFIO integration (although not meant
to be upstreamed)

Guest kernel branch can be found at:
https://github.com/eauger/linux/tree/shameer_rmrr_v7
featuring [1]

QEMU integration (still based on VFIO and exposing RMRs)
can be found at:
https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
(use iommu=nested-smmuv3 ARM virt option)

Guest dependency:
[1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node

History:

v15 -> v16:
- guest RIL must support RIL
- additional checks in the cache invalidation hook
- removal of the MSI BINDING ioctl (tentative replacement
  by RMRs)


Eric Auger (9):
  iommu: Introduce attach/detach_pasid_table API
  iommu: Introduce iommu_get_nesting
  iommu/smmuv3: Allow s1 and s2 configs to coexist
  iommu/smmuv3: Get prepared for nested stage support
  iommu/smmuv3: Implement attach/detach_pasid_table
  iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
  iommu/smmuv3: Implement cache_invalidate
  iommu/smmuv3: report additional recoverable faults
  iommu/smmuv3: Disallow nested mode in presence of HW MSI regions

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 383 ++++++++++++++++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  14 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |   8 +
 drivers/iommu/intel/iommu.c                 |  13 +
 drivers/iommu/iommu.c                       |  79 ++++
 include/linux/iommu.h                       |  35 ++
 include/uapi/linux/iommu.h                  |  54 +++
 7 files changed, 548 insertions(+), 38 deletions(-)

-- 
2.26.3


* [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
From: Eric Auger @ 2021-10-27 10:44 UTC
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

In a virtualization use case, when a guest is assigned
a PCI host device protected by a virtual IOMMU on the guest,
the physical IOMMU must be programmed to be consistent with
the guest mappings. If the physical IOMMU supports two
translation stages, it makes sense to program the guest mappings
onto the first stage/level (ARM/Intel terminology) while the host
owns stage/level 2.

In that case, guest configuration settings must be trapped
and passed to the physical IOMMU driver.

This patch adds a new API to the IOMMU subsystem that allows
setting/unsetting the PASID table information.

A generic iommu_pasid_table_config struct is introduced in
the iommu.h uapi header. This is going to be used by the VFIO
user API.
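
As an illustration of how a caller might populate the new structure,
here is a minimal user-space sketch. The two structs and the constants
are copied locally from the UAPI additions in this patch, since they are
not in any released header. The helper name example_fill_smmuv3_cfg()
and its parameters are hypothetical, and how the buffer reaches the
kernel (the VFIO user API) is outside the scope of this patch and not
shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the layout proposed in this patch (assumption: unchanged). */
struct iommu_pasid_smmuv3 {
	uint32_t version;
	uint8_t  s1fmt;
	uint8_t  s1dss;
	uint8_t  padding[2];
};

struct iommu_pasid_table_config {
	uint32_t argsz;
	uint32_t version;
	uint64_t base_ptr;
	uint32_t format;
	uint8_t  pasid_bits;
	uint8_t  config;
	uint8_t  padding[2];
	union {
		struct iommu_pasid_smmuv3 smmuv3;
	} vendor_data;
};

#define PASID_TABLE_CFG_VERSION_1		1
#define PASID_TABLE_SMMUV3_CFG_VERSION_1	1
#define IOMMU_PASID_FORMAT_SMMUV3		1
#define IOMMU_PASID_CONFIG_TRANSLATE		1

/*
 * Hypothetical helper: describe a guest CD table located at guest
 * physical address cd_table_gpa. s1fmt and s1dss are passed through
 * as read from the guest STE.
 */
static void example_fill_smmuv3_cfg(struct iommu_pasid_table_config *cfg,
				    uint64_t cd_table_gpa, uint8_t pasid_bits,
				    uint8_t s1fmt, uint8_t s1dss)
{
	assert(cfg);

	/* Zero everything first so that padding fields are zero. */
	memset(cfg, 0, sizeof(*cfg));

	cfg->argsz = sizeof(*cfg);
	cfg->version = PASID_TABLE_CFG_VERSION_1;
	cfg->base_ptr = cd_table_gpa;
	cfg->format = IOMMU_PASID_FORMAT_SMMUV3;
	cfg->pasid_bits = pasid_bits;
	cfg->config = IOMMU_PASID_CONFIG_TRANSLATE;

	cfg->vendor_data.smmuv3.version = PASID_TABLE_SMMUV3_CFG_VERSION_1;
	cfg->vendor_data.smmuv3.s1fmt = s1fmt;
	cfg->vendor_data.smmuv3.s1dss = s1dss;
}
```

Setting argsz to the full structure size matters here: for the SMMUv3
format, iommu_uapi_attach_pasid_table() rejects any argsz that does not
cover vendor_data.smmuv3.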

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v13 -> v14:
- export iommu_attach_pasid_table
- add dummy iommu_uapi_attach_pasid_table
- swap base_ptr and format in iommu_pasid_table_config

v12 -> v13:
- Fix config check

v11 -> v12:
- add argsz, name the union
---
 drivers/iommu/iommu.c      | 69 ++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h      | 27 +++++++++++++++
 include/uapi/linux/iommu.h | 54 +++++++++++++++++++++++++++++
 3 files changed, 150 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3303d707bab4..6033c263c6e6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2236,6 +2236,75 @@ int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev
 }
 EXPORT_SYMBOL_GPL(iommu_uapi_sva_unbind_gpasid);
 
+int iommu_attach_pasid_table(struct iommu_domain *domain,
+			     struct iommu_pasid_table_config *cfg)
+{
+	if (unlikely(!domain->ops->attach_pasid_table))
+		return -ENODEV;
+
+	return domain->ops->attach_pasid_table(domain, cfg);
+}
+EXPORT_SYMBOL_GPL(iommu_attach_pasid_table);
+
+int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
+				  void __user *uinfo)
+{
+	struct iommu_pasid_table_config pasid_table_data = { 0 };
+	u32 minsz;
+
+	if (unlikely(!domain->ops->attach_pasid_table))
+		return -ENODEV;
+
+	/*
+	 * No new spaces can be added before the variable sized union, the
+	 * minimum size is the offset to the union.
+	 */
+	minsz = offsetof(struct iommu_pasid_table_config, vendor_data);
+
+	/* Copy minsz from user to get argsz and the other mandatory fields */
+	if (copy_from_user(&pasid_table_data, uinfo, minsz))
+		return -EFAULT;
+
+	/* Fields before the variable size union are mandatory */
+	if (pasid_table_data.argsz < minsz)
+		return -EINVAL;
+
+	/* Only version 1 is known; SMMUv3 format needs its vendor data too */
+	if (pasid_table_data.version != PASID_TABLE_CFG_VERSION_1)
+		return -EINVAL;
+	if (pasid_table_data.format == IOMMU_PASID_FORMAT_SMMUV3 &&
+	    pasid_table_data.argsz <
+		offsetofend(struct iommu_pasid_table_config, vendor_data.smmuv3))
+		return -EINVAL;
+
+	/*
+	 * User might be using a newer UAPI header which has a larger data
+	 * size, we shall support the existing flags within the current
+	 * size. Copy the remaining user data _after_ minsz but not more
+	 * than the current kernel supported size.
+	 */
+	if (copy_from_user((void *)&pasid_table_data + minsz, uinfo + minsz,
+			   min_t(u32, pasid_table_data.argsz, sizeof(pasid_table_data)) - minsz))
+		return -EFAULT;
+
+	/* Now the argsz is validated, check the content */
+	if (pasid_table_data.config < IOMMU_PASID_CONFIG_TRANSLATE ||
+	    pasid_table_data.config > IOMMU_PASID_CONFIG_ABORT)
+		return -EINVAL;
+
+	return domain->ops->attach_pasid_table(domain, &pasid_table_data);
+}
+EXPORT_SYMBOL_GPL(iommu_uapi_attach_pasid_table);
+
+void iommu_detach_pasid_table(struct iommu_domain *domain)
+{
+	if (unlikely(!domain->ops->detach_pasid_table))
+		return;
+
+	domain->ops->detach_pasid_table(domain);
+}
+EXPORT_SYMBOL_GPL(iommu_detach_pasid_table);
+
 static void __iommu_detach_device(struct iommu_domain *domain,
 				  struct device *dev)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d2f3435e7d17..e34a1b1c805b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -232,6 +232,8 @@ struct iommu_iotlb_gather {
  * @cache_invalidate: invalidate translation caches
  * @sva_bind_gpasid: bind guest pasid and mm
  * @sva_unbind_gpasid: unbind guest pasid and mm
+ * @attach_pasid_table: attach a pasid table
+ * @detach_pasid_table: detach the pasid table
  * @def_domain_type: device default domain type, return value:
  *		- IOMMU_DOMAIN_IDENTITY: must use an identity domain
  *		- IOMMU_DOMAIN_DMA: must use a dma domain
@@ -297,6 +299,9 @@ struct iommu_ops {
 				      void *drvdata);
 	void (*sva_unbind)(struct iommu_sva *handle);
 	u32 (*sva_get_pasid)(struct iommu_sva *handle);
+	int (*attach_pasid_table)(struct iommu_domain *domain,
+				  struct iommu_pasid_table_config *cfg);
+	void (*detach_pasid_table)(struct iommu_domain *domain);
 
 	int (*page_response)(struct device *dev,
 			     struct iommu_fault_event *evt,
@@ -430,6 +435,11 @@ extern int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
 					struct device *dev, void __user *udata);
 extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
 				   struct device *dev, ioasid_t pasid);
+extern int iommu_attach_pasid_table(struct iommu_domain *domain,
+				    struct iommu_pasid_table_config *cfg);
+extern int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
+					 void __user *udata);
+extern void iommu_detach_pasid_table(struct iommu_domain *domain);
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
@@ -1035,6 +1045,23 @@ iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
 	return -ENODEV;
 }
 
+static inline
+int iommu_attach_pasid_table(struct iommu_domain *domain,
+			     struct iommu_pasid_table_config *cfg)
+{
+	return -ENODEV;
+}
+
+static inline
+int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
+				  void __user *uinfo)
+{
+	return -ENODEV;
+}
+
+static inline
+void iommu_detach_pasid_table(struct iommu_domain *domain) {}
+
 static inline struct iommu_sva *
 iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
 {
diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 59178fc229ca..8c079a78dfec 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -339,4 +339,58 @@ struct iommu_gpasid_bind_data {
 	} vendor;
 };
 
+/**
+ * struct iommu_pasid_smmuv3 - ARM SMMUv3 Stream Table Entry stage 1 related
+ *     information
+ * @version: API version of this structure
+ * @s1fmt: STE s1fmt (format of the CD table: single CD, linear table
+ *         or 2-level table)
+ * @s1dss: STE s1dss (specifies the behavior when @pasid_bits != 0
+ *         and no PASID is passed along with the incoming transaction)
+ * @padding: reserved for future use (should be zero)
+ *
+ * The PASID table is referred to as the Context Descriptor (CD) table on ARM
+ * SMMUv3. Please refer to the ARM SMMU 3.x spec (ARM IHI 0070A) for full
+ * details.
+ */
+struct iommu_pasid_smmuv3 {
+#define PASID_TABLE_SMMUV3_CFG_VERSION_1 1
+	__u32	version;
+	__u8	s1fmt;
+	__u8	s1dss;
+	__u8	padding[2];
+};
+
+/**
+ * struct iommu_pasid_table_config - PASID table data used to bind guest PASID
+ *     table to the host IOMMU
+ * @argsz: User filled size of this data
+ * @version: API version to prepare for future extensions
+ * @base_ptr: guest physical address of the PASID table
+ * @format: format of the PASID table
+ * @pasid_bits: number of PASID bits used in the PASID table
+ * @config: indicates whether the guest translation stage must
+ *          be translated, bypassed or aborted.
+ * @padding: reserved for future use (should be zero)
+ * @vendor_data.smmuv3: table information when @format is
+ * %IOMMU_PASID_FORMAT_SMMUV3
+ */
+struct iommu_pasid_table_config {
+	__u32	argsz;
+#define PASID_TABLE_CFG_VERSION_1 1
+	__u32	version;
+	__u64	base_ptr;
+#define IOMMU_PASID_FORMAT_SMMUV3	1
+	__u32	format;
+	__u8	pasid_bits;
+#define IOMMU_PASID_CONFIG_TRANSLATE	1
+#define IOMMU_PASID_CONFIG_BYPASS	2
+#define IOMMU_PASID_CONFIG_ABORT	3
+	__u8	config;
+	__u8    padding[2];
+	union {
+		struct iommu_pasid_smmuv3 smmuv3;
+	} vendor_data;
+};
+
 #endif /* _UAPI_IOMMU_H */
-- 
2.26.3
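
For readers who want to experiment with the argsz handling outside the
kernel, the checks in iommu_uapi_attach_pasid_table() can be modelled in
user space roughly as follows. This is a sketch, not the kernel code:
memcpy() stands in for copy_from_user(), so the caller must guarantee
that the buffer is readable up to min(argsz, sizeof(struct model_cfg)),
and every model_* / MODEL_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct model_smmuv3 {
	uint32_t version;
	uint8_t  s1fmt, s1dss, padding[2];
};

/* Same field order as struct iommu_pasid_table_config in the patch. */
struct model_cfg {
	uint32_t argsz, version;
	uint64_t base_ptr;
	uint32_t format;
	uint8_t  pasid_bits, config, padding[2];
	union {
		struct model_smmuv3 smmuv3;
	} vendor_data;
};

#define MODEL_VERSION_1		1
#define MODEL_FORMAT_SMMUV3	1
#define MODEL_CONFIG_TRANSLATE	1
#define MODEL_CONFIG_ABORT	3

/* Returns 0 and fills *out on success, -EINVAL on a malformed request. */
static int model_parse_cfg(const void *ubuf, struct model_cfg *out)
{
	const size_t minsz = offsetof(struct model_cfg, vendor_data);
	size_t copysz;

	assert(ubuf && out);
	memset(out, 0, sizeof(*out));

	/* Step 1: read only the mandatory fixed part to learn argsz. */
	memcpy(out, ubuf, minsz);
	if (out->argsz < minsz)
		return -EINVAL;

	/* Step 2: version check, then format-specific size check. */
	if (out->version != MODEL_VERSION_1)
		return -EINVAL;
	if (out->format == MODEL_FORMAT_SMMUV3 &&
	    out->argsz < minsz + sizeof(out->vendor_data.smmuv3))
		return -EINVAL;

	/* Step 3: copy the rest, clamped to what this "kernel" knows about. */
	copysz = out->argsz < sizeof(*out) ? out->argsz : sizeof(*out);
	memcpy((char *)out + minsz, (const char *)ubuf + minsz, copysz - minsz);

	/* Step 4: content check on the config field. */
	if (out->config < MODEL_CONFIG_TRANSLATE ||
	    out->config > MODEL_CONFIG_ABORT)
		return -EINVAL;

	return 0;
}
```

The clamp in step 3 is what lets a caller built against a newer, larger
header keep working: an argsz bigger than the known structure is
accepted, but only the known bytes are copied.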


+				  struct iommu_pasid_table_config *cfg);
+	void (*detach_pasid_table)(struct iommu_domain *domain);
 
 	int (*page_response)(struct device *dev,
 			     struct iommu_fault_event *evt,
@@ -430,6 +435,11 @@ extern int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
 					struct device *dev, void __user *udata);
 extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
 				   struct device *dev, ioasid_t pasid);
+extern int iommu_attach_pasid_table(struct iommu_domain *domain,
+				    struct iommu_pasid_table_config *cfg);
+extern int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
+					 void __user *udata);
+extern void iommu_detach_pasid_table(struct iommu_domain *domain);
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
@@ -1035,6 +1045,23 @@ iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
 	return -ENODEV;
 }
 
+static inline
+int iommu_attach_pasid_table(struct iommu_domain *domain,
+			     struct iommu_pasid_table_config *cfg)
+{
+	return -ENODEV;
+}
+
+static inline
+int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
+				  void __user *uinfo)
+{
+	return -ENODEV;
+}
+
+static inline
+void iommu_detach_pasid_table(struct iommu_domain *domain) {}
+
 static inline struct iommu_sva *
 iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
 {
diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 59178fc229ca..8c079a78dfec 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -339,4 +339,58 @@ struct iommu_gpasid_bind_data {
 	} vendor;
 };
 
+/**
+ * struct iommu_pasid_smmuv3 - ARM SMMUv3 Stream Table Entry stage 1 related
+ *     information
+ * @version: API version of this structure
+ * @s1fmt: STE s1fmt (format of the CD table: single CD, linear table
+ *         or 2-level table)
+ * @s1dss: STE s1dss (specifies the behavior when @pasid_bits != 0
+ *         and no PASID is passed along with the incoming transaction)
+ * @padding: reserved for future use (should be zero)
+ *
+ * The PASID table is referred to as the Context Descriptor (CD) table on ARM
+ * SMMUv3. Please refer to the ARM SMMU 3.x spec (ARM IHI 0070A) for full
+ * details.
+ */
+struct iommu_pasid_smmuv3 {
+#define PASID_TABLE_SMMUV3_CFG_VERSION_1 1
+	__u32	version;
+	__u8	s1fmt;
+	__u8	s1dss;
+	__u8	padding[2];
+};
+
+/**
+ * struct iommu_pasid_table_config - PASID table data used to bind guest PASID
+ *     table to the host IOMMU
+ * @argsz: User filled size of this data
+ * @version: API version to prepare for future extensions
+ * @base_ptr: guest physical address of the PASID table
+ * @format: format of the PASID table
+ * @pasid_bits: number of PASID bits used in the PASID table
+ * @config: indicates whether the guest translation stage must
+ *          be translated, bypassed or aborted.
+ * @padding: reserved for future use (should be zero)
+ * @vendor_data.smmuv3: table information when @format is
+ * %IOMMU_PASID_FORMAT_SMMUV3
+ */
+struct iommu_pasid_table_config {
+	__u32	argsz;
+#define PASID_TABLE_CFG_VERSION_1 1
+	__u32	version;
+	__u64	base_ptr;
+#define IOMMU_PASID_FORMAT_SMMUV3	1
+	__u32	format;
+	__u8	pasid_bits;
+#define IOMMU_PASID_CONFIG_TRANSLATE	1
+#define IOMMU_PASID_CONFIG_BYPASS	2
+#define IOMMU_PASID_CONFIG_ABORT	3
+	__u8	config;
+	__u8    padding[2];
+	union {
+		struct iommu_pasid_smmuv3 smmuv3;
+	} vendor_data;
+};
+
 #endif /* _UAPI_IOMMU_H */
-- 
2.26.3

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 2/9] iommu: Introduce iommu_get_nesting
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

Add iommu_get_nesting(), which reports whether a domain uses
nested stages.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
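
Illustration only, not part of the patch: a self-contained mock of the
core dispatch pattern behind iommu_get_nesting(). All type and function
names below (struct domain, struct domain_ops, domain_get_nesting(),
mock_get_nesting()) are hypothetical stand-ins for the kernel ones. The
sketch guards on the same optional callback that it goes on to call, so
a driver that does not implement it simply reports "not nested".

```c
#include <stdbool.h>
#include <stddef.h>

enum domain_type { DOMAIN_DMA, DOMAIN_UNMANAGED };

struct domain;

/* Hypothetical ops table; get_nesting is optional and may be NULL. */
struct domain_ops {
	bool (*get_nesting)(struct domain *d);
};

struct domain {
	enum domain_type type;
	const struct domain_ops *ops;
	bool nested;	/* stands in for driver-private stage state */
};

/* Mock driver callback, analogous to arm_smmu_get_nesting(). */
static bool mock_get_nesting(struct domain *d)
{
	return d->nested;
}

/* Core helper: only unmanaged domains with the callback can be nested. */
static bool domain_get_nesting(struct domain *d)
{
	if (d->type != DOMAIN_UNMANAGED)
		return false;
	if (!d->ops->get_nesting)
		return false;
	return d->ops->get_nesting(d);
}
```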
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  8 ++++++++
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  8 ++++++++
 drivers/iommu/intel/iommu.c                 | 14 ++++++++++++++
 drivers/iommu/iommu.c                       | 10 ++++++++++
 include/linux/iommu.h                       |  8 ++++++++
 5 files changed, 48 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a388e318f86e..61477853a536 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2731,6 +2731,13 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
 	return ret;
 }
 
+static bool arm_smmu_get_nesting(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	return smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED;
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2845,6 +2852,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.release_device		= arm_smmu_release_device,
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
+	.get_nesting		= arm_smmu_get_nesting,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 4bc75c4ce402..167cf1d51279 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1522,6 +1522,13 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
 	return ret;
 }
 
+static bool arm_smmu_get_nesting(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	return smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED;
+}
+
 static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
 		unsigned long quirks)
 {
@@ -1595,6 +1602,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.probe_finalize		= arm_smmu_probe_finalize,
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
+	.get_nesting		= arm_smmu_get_nesting,
 	.set_pgtable_quirks	= arm_smmu_set_pgtable_quirks,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d75f59ae28e6..e42767bd47f9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5524,6 +5524,19 @@ intel_iommu_enable_nesting(struct iommu_domain *domain)
 	return ret;
 }
 
+static bool intel_iommu_get_nesting(struct iommu_domain *domain)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	unsigned long flags;
+	bool nesting;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	nesting = (dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE) &&
+		  !(dmar_domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL);
+	spin_unlock_irqrestore(&device_domain_lock, flags);
+	return nesting;
+}
+
 /*
  * Check that the device does not live on an external facing PCI port that is
  * marked as untrusted. Such devices should not be able to apply quirks and
@@ -5561,6 +5574,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.domain_alloc		= intel_iommu_domain_alloc,
 	.domain_free		= intel_iommu_domain_free,
 	.enable_nesting		= intel_iommu_enable_nesting,
+	.get_nesting		= intel_iommu_get_nesting,
 	.attach_dev		= intel_iommu_attach_device,
 	.detach_dev		= intel_iommu_detach_device,
 	.aux_attach_dev		= intel_iommu_aux_attach_device,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6033c263c6e6..3e639c4e8015 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2843,6 +2843,16 @@ int iommu_enable_nesting(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL_GPL(iommu_enable_nesting);
 
+bool iommu_get_nesting(struct iommu_domain *domain)
+{
+	if (domain->type != IOMMU_DOMAIN_UNMANAGED)
+		return false;
+	if (!domain->ops->get_nesting)
+		return false;
+	return domain->ops->get_nesting(domain);
+}
+EXPORT_SYMBOL_GPL(iommu_get_nesting);
+
 int iommu_set_pgtable_quirks(struct iommu_domain *domain,
 		unsigned long quirk)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e34a1b1c805b..846e19151f40 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -213,6 +213,7 @@ struct iommu_iotlb_gather {
  *                  group and attached to the groups domain
  * @device_group: find iommu group for a particular device
  * @enable_nesting: Enable nesting
+ * @get_nesting: get whether the domain uses nested stages
  * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
  * @get_resv_regions: Request list of reserved regions for a device
  * @put_resv_regions: Free list of reserved regions for a device
@@ -271,6 +272,7 @@ struct iommu_ops {
 	void (*probe_finalize)(struct device *dev);
 	struct iommu_group *(*device_group)(struct device *dev);
 	int (*enable_nesting)(struct iommu_domain *domain);
+	bool (*get_nesting)(struct iommu_domain *domain);
 	int (*set_pgtable_quirks)(struct iommu_domain *domain,
 				  unsigned long quirks);
 
@@ -690,6 +692,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 					void *drvdata);
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+bool iommu_get_nesting(struct iommu_domain *domain);
 
 #else /* CONFIG_IOMMU_API */
 
@@ -1108,6 +1111,11 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
 }
+
+static inline bool iommu_get_nesting(struct iommu_domain *domain)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 3/9] iommu/smmuv3: Allow s1 and s2 configs to coexist
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

In true nested mode, both s1_cfg and s2_cfg will coexist.
Let's remove the union and add a "set" field in each
config structure telling whether the config is set and needs
to be applied when writing the STE. In legacy nested mode,
only the second stage is used. In true nested mode, both stages
are used and the S1 config is "set" when the guest passes
its pasid table.

No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v13 -> v14:
- slight reword of the commit message

v12 -> v13:
- do not dynamically allocate s1_cfg and s2_cfg anymore; add
  the set field
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 43 +++++++++++++--------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  8 ++--
 2 files changed, 31 insertions(+), 20 deletions(-)
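
Illustration only, not part of the patch: a self-contained sketch of
the "set" flag model described in the commit message. The types and
helpers below (struct smmu_domain_model, mark_stages(), ste_config()
and the STE_* enum) are hypothetical simplifications, not the driver's
definitions. In the nested case only stage 2 is marked here; stage 1
becomes "set" later, when the guest passes its PASID table.

```c
#include <stdbool.h>

enum stage { STAGE_S1, STAGE_S2, STAGE_NESTED };

/* Symbolic result only; real STE encodings are left to the SMMUv3 spec. */
enum ste_cfg { STE_BYPASS, STE_S1_TRANS, STE_S2_TRANS, STE_NESTED };

struct s1_cfg_model { bool set; };
struct s2_cfg_model { bool set; };

/* Both configs live side by side instead of sharing a union. */
struct smmu_domain_model {
	enum stage stage;
	struct s1_cfg_model s1_cfg;
	struct s2_cfg_model s2_cfg;
};

/* Mirrors the switch added to arm_smmu_write_strtab_ent(). */
static void mark_stages(struct smmu_domain_model *d)
{
	switch (d->stage) {
	case STAGE_S1:
		d->s1_cfg.set = true;
		d->s2_cfg.set = false;
		break;
	case STAGE_S2:
		d->s1_cfg.set = false;
		d->s2_cfg.set = true;
		break;
	case STAGE_NESTED:
		/* Stage 1 is left as is: legacy vs true nested mode. */
		d->s2_cfg.set = true;
		break;
	}
}

/* Decide which translation stages the STE would enable. */
static enum ste_cfg ste_config(const struct smmu_domain_model *d)
{
	if (d->s1_cfg.set && d->s2_cfg.set)
		return STE_NESTED;
	if (d->s1_cfg.set)
		return STE_S1_TRANS;
	if (d->s2_cfg.set)
		return STE_S2_TRANS;
	return STE_BYPASS;
}
```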

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 61477853a536..b8384a834552 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1254,8 +1254,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	u64 val = le64_to_cpu(dst[0]);
 	bool ste_live = false;
 	struct arm_smmu_device *smmu = NULL;
-	struct arm_smmu_s1_cfg *s1_cfg = NULL;
-	struct arm_smmu_s2_cfg *s2_cfg = NULL;
+	struct arm_smmu_s1_cfg *s1_cfg;
+	struct arm_smmu_s2_cfg *s2_cfg;
 	struct arm_smmu_domain *smmu_domain = NULL;
 	struct arm_smmu_cmdq_ent prefetch_cmd = {
 		.opcode		= CMDQ_OP_PREFETCH_CFG,
@@ -1270,13 +1270,24 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	}
 
 	if (smmu_domain) {
+		s1_cfg = &smmu_domain->s1_cfg;
+		s2_cfg = &smmu_domain->s2_cfg;
+
 		switch (smmu_domain->stage) {
 		case ARM_SMMU_DOMAIN_S1:
-			s1_cfg = &smmu_domain->s1_cfg;
+			s1_cfg->set = true;
+			s2_cfg->set = false;
 			break;
 		case ARM_SMMU_DOMAIN_S2:
+			s1_cfg->set = false;
+			s2_cfg->set = true;
+			break;
 		case ARM_SMMU_DOMAIN_NESTED:
-			s2_cfg = &smmu_domain->s2_cfg;
+			/*
+			 * Actual usage of stage 1 depends on nested mode:
+			 * legacy (2nd stage only) or true nested mode
+			 */
+			s2_cfg->set = true;
 			break;
 		default:
 			break;
@@ -1303,7 +1314,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	val = STRTAB_STE_0_V;
 
 	/* Bypass/fault */
-	if (!smmu_domain || !(s1_cfg || s2_cfg)) {
+	if (!smmu_domain || !(s1_cfg->set || s2_cfg->set)) {
 		if (!smmu_domain && disable_bypass)
 			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
 		else
@@ -1322,7 +1333,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		return;
 	}
 
-	if (s1_cfg) {
+	if (s1_cfg->set) {
 		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
 			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
 
@@ -1344,7 +1355,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			FIELD_PREP(STRTAB_STE_0_S1FMT, s1_cfg->s1fmt);
 	}
 
-	if (s2_cfg) {
+	if (s2_cfg->set) {
 		BUG_ON(ste_live);
 		dst[2] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
@@ -2036,23 +2047,23 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_s1_cfg *s1_cfg = &smmu_domain->s1_cfg;
+	struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
 
 	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
 
 	/* Free the CD and ASID, if we allocated them */
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-		struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
-
+	if (s1_cfg->set) {
 		/* Prevent SVA from touching the CD while we're freeing it */
 		mutex_lock(&arm_smmu_asid_lock);
-		if (cfg->cdcfg.cdtab)
+		if (s1_cfg->cdcfg.cdtab)
 			arm_smmu_free_cd_tables(smmu_domain);
-		arm_smmu_free_asid(&cfg->cd);
+		arm_smmu_free_asid(&s1_cfg->cd);
 		mutex_unlock(&arm_smmu_asid_lock);
-	} else {
-		struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
-		if (cfg->vmid)
-			arm_smmu_bitmap_free(smmu->vmid_map, cfg->vmid);
+	}
+	if (s2_cfg->set) {
+		if (s2_cfg->vmid)
+			arm_smmu_bitmap_free(smmu->vmid_map, s2_cfg->vmid);
 	}
 
 	kfree(smmu_domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 4cb136f07914..db1a84d24e30 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -597,12 +597,14 @@ struct arm_smmu_s1_cfg {
 	struct arm_smmu_ctx_desc	cd;
 	u8				s1fmt;
 	u8				s1cdmax;
+	bool				set;
 };
 
 struct arm_smmu_s2_cfg {
 	u16				vmid;
 	u64				vttbr;
 	u64				vtcr;
+	bool				set;
 };
 
 struct arm_smmu_strtab_cfg {
@@ -716,10 +718,8 @@ struct arm_smmu_domain {
 	atomic_t			nr_ats_masters;
 
 	enum arm_smmu_domain_stage	stage;
-	union {
-		struct arm_smmu_s1_cfg	s1_cfg;
-		struct arm_smmu_s2_cfg	s2_cfg;
-	};
+	struct arm_smmu_s1_cfg	s1_cfg;
+	struct arm_smmu_s2_cfg	s2_cfg;
 
 	struct iommu_domain		domain;
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 4/9] iommu/smmuv3: Get prepared for nested stage support
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

When nested stage translation is set up, both s1_cfg and
s2_cfg are set.

We introduce a new smmu_domain abort field that is set when the
guest passes its stage 1 configuration. If no guest stage 1
config has been attached, it is ignored when writing the STE.

arm_smmu_write_strtab_ent() is modified to write both stage
fields in the STE and deal with the abort field.

In nested mode, only stage 2 is "finalized" as the host does
not own/configure the stage 1 context descriptor; the guest does.
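
As a rough illustration of the decision arm_smmu_write_strtab_ent()
makes after this patch, here is a minimal user-space sketch. It is not
the driver code: struct domain_model and ste_cfg_for() are hypothetical
names, the enum mirrors the STRTAB_STE_0_CFG_* values from
arm-smmu-v3.h, and everything except the Config field is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* STE.Config encodings, mirroring STRTAB_STE_0_CFG_* in arm-smmu-v3.h */
enum ste_cfg {
	STE_CFG_ABORT    = 0,
	STE_CFG_BYPASS   = 4,
	STE_CFG_S1_TRANS = 5,
	STE_CFG_S2_TRANS = 6,
	STE_CFG_NESTED   = 7,
};

/* Hypothetical, simplified stand-in for struct arm_smmu_domain */
struct domain_model {
	bool s1_set;	/* models s1_cfg.set */
	bool s2_set;	/* models s2_cfg.set */
	bool abort;	/* models the new smmu_domain->abort field */
};

/*
 * Pick the Config value for an STE. A NULL domain models a device that
 * is not attached to any domain, in which case disable_bypass decides
 * between abort and bypass.
 */
static enum ste_cfg ste_cfg_for(const struct domain_model *d,
				bool disable_bypass)
{
	bool translate = d && (d->s1_set || d->s2_set);
	bool abort = d ? d->abort : disable_bypass;
	unsigned int cfg = 0;

	if (abort)
		return STE_CFG_ABORT;
	if (!translate)
		return STE_CFG_BYPASS;

	/* Each enabled stage ORs in its own encoding: 5 | 6 == 7 */
	if (d->s1_set)
		cfg |= STE_CFG_S1_TRANS;
	if (d->s2_set)
		cfg |= STE_CFG_S2_TRANS;

	return (enum ste_cfg)cfg;
}
```

With both stages set, OR-ing the S1 (5) and S2 (6) encodings yields 7,
i.e. STRTAB_STE_0_CFG_NESTED. The abort flag takes precedence over any
translation config.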

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v13 -> v14:
- removed BUG_ON(ste_live && !nested) as this should never happen
- restored the old comment as there is always an abort in between
  S2 -> S1 + S2 and S1 + S2 -> S2
- remove sparse warning

v10 -> v11:
- Fix an issue reported by Shameer when switching from a setup with
  vSMMU to one without vSMMU. Although the spec does not seem to
  mention it, the two upper 64-bit words apparently need to be reset
  when switching from S1+S2 cfg to S1 only. In particular dst[3]
  (S2TTB) needs to be reset. On some implementations, if the S2TTB
  is not reset, this causes a C_BAD_STE error
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 55 ++++++++++++++++++---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
 2 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b8384a834552..5e0917e1226b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1252,7 +1252,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	 * 3. Update Config, sync
 	 */
 	u64 val = le64_to_cpu(dst[0]);
-	bool ste_live = false;
+	bool s1_live = false, s2_live = false, ste_live;
+	bool abort, translate = false;
 	struct arm_smmu_device *smmu = NULL;
 	struct arm_smmu_s1_cfg *s1_cfg;
 	struct arm_smmu_s2_cfg *s2_cfg;
@@ -1292,6 +1293,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		default:
 			break;
 		}
+		translate = s1_cfg->set || s2_cfg->set;
 	}
 
 	if (val & STRTAB_STE_0_V) {
@@ -1299,23 +1301,36 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		case STRTAB_STE_0_CFG_BYPASS:
 			break;
 		case STRTAB_STE_0_CFG_S1_TRANS:
+			s1_live = true;
+			break;
 		case STRTAB_STE_0_CFG_S2_TRANS:
-			ste_live = true;
+			s2_live = true;
+			break;
+		case STRTAB_STE_0_CFG_NESTED:
+			s1_live = true;
+			s2_live = true;
 			break;
 		case STRTAB_STE_0_CFG_ABORT:
-			BUG_ON(!disable_bypass);
 			break;
 		default:
 			BUG(); /* STE corruption */
 		}
 	}
 
+	ste_live = s1_live || s2_live;
+
 	/* Nuke the existing STE_0 value, as we're going to rewrite it */
 	val = STRTAB_STE_0_V;
 
 	/* Bypass/fault */
-	if (!smmu_domain || !(s1_cfg->set || s2_cfg->set)) {
-		if (!smmu_domain && disable_bypass)
+
+	if (!smmu_domain)
+		abort = disable_bypass;
+	else
+		abort = smmu_domain->abort;
+
+	if (abort || !translate) {
+		if (abort)
 			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
 		else
 			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
@@ -1333,11 +1348,17 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		return;
 	}
 
+	if (ste_live) {
+		/* First invalidate the live STE */
+		dst[0] = cpu_to_le64(STRTAB_STE_0_CFG_ABORT);
+		arm_smmu_sync_ste_for_sid(smmu, sid);
+	}
+
 	if (s1_cfg->set) {
 		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
 			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
 
-		BUG_ON(ste_live);
+		BUG_ON(s1_live);
 		dst[1] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1356,7 +1377,14 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	}
 
 	if (s2_cfg->set) {
-		BUG_ON(ste_live);
+		u64 vttbr = s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK;
+
+		if (s2_live) {
+			u64 s2ttb = le64_to_cpu(dst[3]) & STRTAB_STE_3_S2TTB_MASK;
+
+			BUG_ON(s2ttb != vttbr);
+		}
+
 		dst[2] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
 			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
@@ -1366,9 +1394,12 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
 			 STRTAB_STE_2_S2R);
 
-		dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+		dst[3] = cpu_to_le64(vttbr);
 
 		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
+	} else {
+		dst[2] = 0;
+		dst[3] = 0;
 	}
 
 	if (master->ats_enabled)
@@ -2173,6 +2204,14 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
 		return 0;
 	}
 
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED &&
+	    (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1) ||
+	     !(smmu->features & ARM_SMMU_FEAT_TRANS_S2))) {
+		dev_info(smmu_domain->smmu->dev,
+			 "does not implement two stages\n");
+		return -EINVAL;
+	}
+
 	/* Restrict the stage to what we can actually support */
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index db1a84d24e30..05959df01618 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -207,6 +207,7 @@
 #define STRTAB_STE_0_CFG_BYPASS		4
 #define STRTAB_STE_0_CFG_S1_TRANS	5
 #define STRTAB_STE_0_CFG_S2_TRANS	6
+#define STRTAB_STE_0_CFG_NESTED		7
 
 #define STRTAB_STE_0_S1FMT		GENMASK_ULL(5, 4)
 #define STRTAB_STE_0_S1FMT_LINEAR	0
@@ -720,6 +721,7 @@ struct arm_smmu_domain {
 	enum arm_smmu_domain_stage	stage;
 	struct arm_smmu_s1_cfg	s1_cfg;
 	struct arm_smmu_s2_cfg	s2_cfg;
+	bool				abort;
 
 	struct iommu_domain		domain;
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 5/9] iommu/smmuv3: Implement attach/detach_pasid_table
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

On attach_pasid_table() we program STE S1 related info set
by the guest into the actual physical STEs. At minimum
we need to program the context descriptor GPA and compute
whether the stage1 is translated/bypassed or aborted.

On detach, the stage 1 config is unset and the abort flag is
unset.
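
A minimal sketch of the state transitions described above, assuming a
simplified model of the domain. All names here (nested_domain_model,
pasid_cfg_model, model_attach_pasid_table(), model_detach_pasid_table())
are hypothetical. The TRANSLATE branch is an assumption derived from
this description: it only records the CD table GPA, whereas the real
callback also validates the config and rewrites the STEs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the IOMMU_PASID_CONFIG_* values */
enum pasid_cfg_model {
	PASID_CFG_TRANSLATE,
	PASID_CFG_BYPASS,
	PASID_CFG_ABORT,
};

/* Hypothetical, simplified view of a nested smmu_domain */
struct nested_domain_model {
	bool s1_set;		/* models s1_cfg.set */
	bool abort;		/* models smmu_domain->abort */
	uint64_t cdtab_gpa;	/* guest CD table address (a GPA) */
};

/* Returns 0 on success, -1 if the requested transition is rejected */
static int model_attach_pasid_table(struct nested_domain_model *d,
				    enum pasid_cfg_model cfg,
				    uint64_t cdtab_gpa)
{
	switch (cfg) {
	case PASID_CFG_ABORT:
		d->s1_set = false;
		d->abort = true;
		return 0;
	case PASID_CFG_BYPASS:
		d->s1_set = false;
		d->abort = false;
		return 0;
	case PASID_CFG_TRANSLATE:
		/* S1 <-> S1 transitions are not supported */
		if (d->s1_set)
			return -1;
		d->cdtab_gpa = cdtab_gpa;
		d->s1_set = true;
		d->abort = false;
		return 0;
	}
	return -1;
}

/* On detach, both the stage 1 config and the abort flag are unset */
static void model_detach_pasid_table(struct nested_domain_model *d)
{
	d->s1_set = false;
	d->abort = false;
	d->cdtab_gpa = 0;
}
```

Attaching a second TRANSLATE config without detaching first is
rejected, which matches the "no S1 <-> S1 transitions" rule in the
callback.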

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v14 -> v15:
- add a comment before arm_smmu_get_cd_ptr to warn the
  developer this function must not be used in case of nested
  (Keqian)

v13 -> v14:
- on PASID table detach, reset the abort flag (Keqian)

v7 -> v8:
- remove smmu->features check, now done on domain finalize

v6 -> v7:
- check versions and comment the fact we don't need to take
  into account s1dss and s1fmt

v3 -> v4:
- adapt to changes in iommu_pasid_table_config
- different programming convention at s1_cfg/s2_cfg/ste.abort

v2 -> v3:
- callback now is named set_pasid_table and struct fields
  are laid out differently.

v1 -> v2:
- invalidate the STE before changing them
- hold init_mutex
- handle new fields
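
Illustration only, not part of the patch: the attach/detach
behaviour described in the commit message can be modelled in plain
userspace C as below. This is a sketch under assumptions. The
model_* names and the MODEL_CONFIG_* values are hypothetical
stand-ins for the driver structures and the IOMMU_PASID_CONFIG_*
constants, and the STE rewrite that follows a successful attach is
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IOMMU_PASID_CONFIG_* values. */
enum model_pasid_config {
	MODEL_CONFIG_TRANSLATE,
	MODEL_CONFIG_BYPASS,
	MODEL_CONFIG_ABORT,
};

/* Only the fields the attach/detach paths touch. */
struct model_domain {
	bool nested;	/* stage == ARM_SMMU_DOMAIN_NESTED */
	bool s1_set;	/* s1_cfg.set */
	bool abort;	/* abort */
	uint64_t cdtab;	/* s1_cfg.cdcfg.cdtab_dma */
};

int model_attach(struct model_domain *d, enum model_pasid_config config,
		 uint64_t base_ptr, unsigned int pasid_bits)
{
	assert(d);

	if (!d->nested)
		return -EINVAL;

	switch (config) {
	case MODEL_CONFIG_ABORT:
		d->s1_set = false;
		d->abort = true;
		break;
	case MODEL_CONFIG_BYPASS:
		d->s1_set = false;
		d->abort = false;
		break;
	case MODEL_CONFIG_TRANSLATE:
		/* no S1 -> S1 transition: detach first */
		if (d->s1_set)
			return -EINVAL;
		/* a single context descriptor only */
		if (pasid_bits)
			return -EINVAL;
		d->cdtab = base_ptr;
		d->s1_set = true;
		d->abort = false;
		break;
	default:
		return -EINVAL;
	}
	return 0;
}

void model_detach(struct model_domain *d)
{
	assert(d);

	if (!d->nested)
		return;
	d->s1_set = false;
	d->abort = false;
}
```

In this model a second TRANSLATE attach without an intervening
detach fails with -EINVAL and leaves the recorded table pointer
untouched, which is the "no S1 <-> S1 transition" rule in the patch.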
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 93 +++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 5e0917e1226b..bb2681581283 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1004,6 +1004,10 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
 	WRITE_ONCE(*dst, cpu_to_le64(val));
 }
 
+/*
+ * Must not be used in case of nested mode where the CD table is owned
+ * by the guest
+ */
 static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
 				   u32 ssid)
 {
@@ -2809,6 +2813,93 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	iommu_dma_get_resv_regions(dev, head);
 }
 
+static int arm_smmu_attach_pasid_table(struct iommu_domain *domain,
+				       struct iommu_pasid_table_config *cfg)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_master *master;
+	struct arm_smmu_device *smmu;
+	unsigned long flags;
+	int ret = -EINVAL;
+
+	if (cfg->format != IOMMU_PASID_FORMAT_SMMUV3)
+		return -EINVAL;
+
+	if (cfg->version != PASID_TABLE_CFG_VERSION_1 ||
+	    cfg->vendor_data.smmuv3.version != PASID_TABLE_SMMUV3_CFG_VERSION_1)
+		return -EINVAL;
+
+	mutex_lock(&smmu_domain->init_mutex);
+
+	smmu = smmu_domain->smmu;
+
+	if (!smmu)
+		goto out;
+
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+		goto out;
+
+	switch (cfg->config) {
+	case IOMMU_PASID_CONFIG_ABORT:
+		smmu_domain->s1_cfg.set = false;
+		smmu_domain->abort = true;
+		break;
+	case IOMMU_PASID_CONFIG_BYPASS:
+		smmu_domain->s1_cfg.set = false;
+		smmu_domain->abort = false;
+		break;
+	case IOMMU_PASID_CONFIG_TRANSLATE:
+		/* we do not support S1 <-> S1 transitions */
+		if (smmu_domain->s1_cfg.set)
+			goto out;
+
+		/*
+		 * we currently support a single CD so s1fmt and s1dss
+		 * fields are also ignored
+		 */
+		if (cfg->pasid_bits)
+			goto out;
+
+		smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr;
+		smmu_domain->s1_cfg.set = true;
+		smmu_domain->abort = false;
+		break;
+	default:
+		goto out;
+	}
+	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+	list_for_each_entry(master, &smmu_domain->devices, domain_head)
+		arm_smmu_install_ste_for_dev(master);
+	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+	ret = 0;
+out:
+	mutex_unlock(&smmu_domain->init_mutex);
+	return ret;
+}
+
+static void arm_smmu_detach_pasid_table(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_master *master;
+	unsigned long flags;
+
+	mutex_lock(&smmu_domain->init_mutex);
+
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+		goto unlock;
+
+	smmu_domain->s1_cfg.set = false;
+	smmu_domain->abort = false;
+
+	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+	list_for_each_entry(master, &smmu_domain->devices, domain_head)
+		arm_smmu_install_ste_for_dev(master);
+	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
+unlock:
+	mutex_unlock(&smmu_domain->init_mutex);
+}
+
 static bool arm_smmu_dev_has_feature(struct device *dev,
 				     enum iommu_dev_features feat)
 {
@@ -2906,6 +2997,8 @@ static struct iommu_ops arm_smmu_ops = {
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
+	.attach_pasid_table	= arm_smmu_attach_pasid_table,
+	.detach_pasid_table	= arm_smmu_detach_pasid_table,
 	.dev_has_feat		= arm_smmu_dev_has_feature,
 	.dev_feat_enabled	= arm_smmu_dev_feature_enabled,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 6/9] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

With nested stage support, we will soon need to invalidate
S1 contexts and ranges tagged with an unmanaged ASID, the
latter being managed by the guest. So let's introduce two helpers
that allow invalidation with externally managed ASIDs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v15 -> v16:
- Use arm_smmu_cmdq_issue_cmd_with_sync()

v14 -> v15:
- Always send CMDQ_OP_TLBI_NH_VA and do not test
  smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H as the guest does
  not run in hyp mode atm (Zenghui).

v13 -> v14:
- Actually send the NH_ASID command (reported by Xingang Wang)
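
Illustration only, not part of the patch: the opcode selection that
arm_smmu_tlb_inv_range_domain() performs once ext_asid is added can
be modelled as below. This is a sketch under assumptions. The
model_* types and MODEL_* opcodes are hypothetical stand-ins for the
driver ones, and where the driver leaves an unused ASID or VMID
field zero-initialised, the model stores -1 so the difference is
visible.

```c
#include <assert.h>
#include <stdbool.h>

enum model_stage { MODEL_STAGE_S1, MODEL_STAGE_S2, MODEL_STAGE_NESTED };

enum model_opcode {
	MODEL_TLBI_NH_VA,	/* stands in for CMDQ_OP_TLBI_NH_VA */
	MODEL_TLBI_EL2_VA,	/* stands in for CMDQ_OP_TLBI_EL2_VA */
	MODEL_TLBI_S2_IPA,	/* stands in for CMDQ_OP_TLBI_S2_IPA */
};

struct model_domain {
	enum model_stage stage;
	bool e2h;	/* host SMMU has ARM_SMMU_FEAT_E2H */
	int asid;	/* host-managed ASID (s1_cfg.cd.asid) */
	int vmid;	/* s2_cfg.vmid */
};

struct model_cmd {
	enum model_opcode opcode;
	int asid;	/* -1: not carried by this command */
	int vmid;	/* -1: not carried by this command */
};

struct model_cmd model_range_cmd(const struct model_domain *d, int ext_asid)
{
	struct model_cmd cmd = { .asid = -1, .vmid = -1 };

	assert(d);

	if (ext_asid >= 0) {
		/* guest stage 1: always NH_VA, tagged with guest ASID + VMID */
		cmd.opcode = MODEL_TLBI_NH_VA;
		cmd.asid = ext_asid;
		cmd.vmid = d->vmid;
	} else if (d->stage == MODEL_STAGE_S1) {
		/* host stage 1: opcode depends on E2H, host-managed ASID */
		cmd.opcode = d->e2h ? MODEL_TLBI_EL2_VA : MODEL_TLBI_NH_VA;
		cmd.asid = d->asid;
	} else {
		/* stage 2 (including nested without ext_asid): by IPA */
		cmd.opcode = MODEL_TLBI_S2_IPA;
		cmd.vmid = d->vmid;
	}
	return cmd;
}
```

Existing callers pass -1 and keep their previous behaviour; only a
non-negative ext_asid takes the new guest branch, and in that branch
the host E2H feature is not consulted.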
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++++++++++-----
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bb2681581283..d5e722105624 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1871,9 +1871,9 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
 }
 
 /* IO_PGTABLE API */
-static void arm_smmu_tlb_inv_context(void *cookie)
+static void __arm_smmu_tlb_inv_context(struct arm_smmu_domain *smmu_domain,
+				       int ext_asid)
 {
-	struct arm_smmu_domain *smmu_domain = cookie;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cmdq_ent cmd;
 
@@ -1884,7 +1884,12 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 	 * insertion to guarantee those are observed before the TLBI. Do be
 	 * careful, 007.
 	 */
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+	if (ext_asid >= 0) { /* guest stage 1 invalidation */
+		cmd.opcode	= CMDQ_OP_TLBI_NH_ASID;
+		cmd.tlbi.asid	= ext_asid;
+		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
+		arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
+	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
 		arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid);
 	} else {
 		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
@@ -1894,6 +1899,13 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 	arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0);
 }
 
+static void arm_smmu_tlb_inv_context(void *cookie)
+{
+	struct arm_smmu_domain *smmu_domain = cookie;
+
+	__arm_smmu_tlb_inv_context(smmu_domain, -1);
+}
+
 static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
 				     unsigned long iova, size_t size,
 				     size_t granule,
@@ -1955,9 +1967,10 @@ static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
 	arm_smmu_cmdq_batch_submit(smmu, &cmds);
 }
 
-static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
-					  size_t granule, bool leaf,
-					  struct arm_smmu_domain *smmu_domain)
+static void
+arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
+			      size_t granule, bool leaf, int ext_asid,
+			      struct arm_smmu_domain *smmu_domain)
 {
 	struct arm_smmu_cmdq_ent cmd = {
 		.tlbi = {
@@ -1965,7 +1978,16 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
 		},
 	};
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+	if (ext_asid >= 0) {  /* guest stage 1 invalidation */
+		/*
+		 * At the moment the guest only uses NS-EL1, to be
+		 * revisited when nested virt gets supported with E2H
+		 * exposed.
+		 */
+		cmd.opcode	= CMDQ_OP_TLBI_NH_VA;
+		cmd.tlbi.asid	= ext_asid;
+		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
+	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
 		cmd.opcode	= smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
 				  CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
 		cmd.tlbi.asid	= smmu_domain->s1_cfg.cd.asid;
@@ -1973,6 +1995,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
 		cmd.opcode	= CMDQ_OP_TLBI_S2_IPA;
 		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
 	}
+
 	__arm_smmu_tlb_inv_range(&cmd, iova, size, granule, smmu_domain);
 
 	/*
@@ -2011,7 +2034,7 @@ static void arm_smmu_tlb_inv_page_nosync(struct iommu_iotlb_gather *gather,
 static void arm_smmu_tlb_inv_walk(unsigned long iova, size_t size,
 				  size_t granule, void *cookie)
 {
-	arm_smmu_tlb_inv_range_domain(iova, size, granule, false, cookie);
+	arm_smmu_tlb_inv_range_domain(iova, size, granule, false, -1, cookie);
 }
 
 static const struct iommu_flush_ops arm_smmu_flush_ops = {
@@ -2548,7 +2571,7 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain,
 
 	arm_smmu_tlb_inv_range_domain(gather->start,
 				      gather->end - gather->start + 1,
-				      gather->pgsize, true, smmu_domain);
+				      gather->pgsize, true, -1, smmu_domain);
 }
 
 static phys_addr_t
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 7/9] iommu/smmuv3: Implement cache_invalidate
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

Implement domain-selective, PASID-selective and page-selective
IOTLB invalidations.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v15 -> v16:
- make sure the range is set (RIL guest) and check the granule
  size is supported by the physical IOMMU
- use cmd_with_sync

v14 -> v15:
- remove the redundant arm_smmu_cmdq_issue_sync(smmu)
  in IOMMU_INV_GRANU_ADDR case (Zenghui)
- if RIL is not supported by the host, make sure the granule_size
  that is passed by the userspace is supported or fix it
  (Chenxiang)

v13 -> v14:
- Add domain invalidation
- do global inval when asid is not provided with addr
  granularity

v7 -> v8:
- ASID based invalidation using iommu_inv_pasid_info
- check ARCHID/PASID flags in addr based invalidation
- use __arm_smmu_tlb_inv_context and __arm_smmu_tlb_inv_range_nosync

v6 -> v7:
- check the uapi version

v3 -> v4:
- adapt to changes in the uapi
- add support for leaf parameter
- do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context
  anymore

v2 -> v3:
- replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync

v1 -> v2:
- properly pass the asid
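
As a reading aid for the v16 checks listed above, here is a minimal
user-space sketch of the validation applied to an address-selective
invalidation. It is not kernel code: inv_range_is_valid() is a
hypothetical name, and its pgsize_bitmap parameter only stands in for
smmu->pgsize_bitmap. Unlike the hunk in this patch, the sketch rejects a
zero granule before doing any bit arithmetic on it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the checks done for
 * IOMMU_INV_GRANU_ADDR. pgsize_bitmap has one bit set per page size
 * supported by the physical SMMU (assumption: same encoding as
 * smmu->pgsize_bitmap).
 */
static bool inv_range_is_valid(uint64_t granule_size, uint64_t nb_granules,
			       uint64_t pgsize_bitmap)
{
	/* Reject zero first so the power-of-two test below is meaningful. */
	if (!granule_size)
		return false;

	/* The granule must be a single power of two ... */
	if (granule_size & (granule_size - 1))
		return false;

	/* ... and a page size the physical SMMU supports. */
	if (!(granule_size & pgsize_bitmap))
		return false;

	/* A range must be provided: a zero size is rejected. */
	return nb_granules * granule_size != 0;
}
```

With an assumed host bitmap of 4K, 2M and 1G pages, a 4K granule with
16 granules passes, while a 64K granule, a granule of 0x3000 or
nb_granules == 0 is rejected.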
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 82 +++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d5e722105624..e84a7c3e8730 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2923,6 +2923,87 @@ static void arm_smmu_detach_pasid_table(struct iommu_domain *domain)
 	mutex_unlock(&smmu_domain->init_mutex);
 }
 
+static int
+arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
+			  struct iommu_cache_invalidate_info *inv_info)
+{
+	struct arm_smmu_cmdq_ent cmd = {.opcode = CMDQ_OP_TLBI_NSNH_ALL};
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+		return -EINVAL;
+
+	if (!smmu)
+		return -EINVAL;
+
+	if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+		return -EINVAL;
+
+	if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID ||
+	    inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) {
+		return -ENOENT;
+	}
+
+	if (!(inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB))
+		return -EINVAL;
+
+	/* IOTLB invalidation */
+
+	switch (inv_info->granularity) {
+	case IOMMU_INV_GRANU_PASID:
+	{
+		struct iommu_inv_pasid_info *info =
+			&inv_info->granu.pasid_info;
+
+		if (info->flags & IOMMU_INV_PASID_FLAGS_PASID)
+			return -ENOENT;
+		if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID))
+			return -EINVAL;
+
+		__arm_smmu_tlb_inv_context(smmu_domain, info->archid);
+		return 0;
+	}
+	case IOMMU_INV_GRANU_ADDR:
+	{
+		struct iommu_inv_addr_info *info = &inv_info->granu.addr_info;
+		uint64_t granule_size  = info->granule_size;
+		uint64_t size = info->nb_granules * info->granule_size;
+		bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF;
+		int tg;
+
+		if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
+			return -ENOENT;
+
+		if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID))
+			break;
+
+		tg = __ffs(granule_size);
+		if (!granule_size || granule_size & ~(1ULL << tg) ||
+		    !(granule_size & smmu->pgsize_bitmap))
+			return -EINVAL;
+
+		/* range invalidation must be used */
+		if (!size)
+			return -EINVAL;
+
+		arm_smmu_tlb_inv_range_domain(info->addr, size,
+					      granule_size, leaf,
+					      info->archid, smmu_domain);
+		return 0;
+	}
+	case IOMMU_INV_GRANU_DOMAIN:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* Global S1 invalidation */
+	cmd.tlbi.vmid   = smmu_domain->s2_cfg.vmid;
+	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
+	return 0;
+}
+
 static bool arm_smmu_dev_has_feature(struct device *dev,
 				     enum iommu_dev_features feat)
 {
@@ -3022,6 +3103,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.put_resv_regions	= generic_iommu_put_resv_regions,
 	.attach_pasid_table	= arm_smmu_attach_pasid_table,
 	.detach_pasid_table	= arm_smmu_detach_pasid_table,
+	.cache_invalidate	= arm_smmu_cache_invalidate,
 	.dev_has_feat		= arm_smmu_dev_has_feature,
 	.dev_feat_enabled	= arm_smmu_dev_feature_enabled,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 8/9] iommu/smmuv3: report additional recoverable faults
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

Up to now we have only reported translation faults. Now that
the guest can induce some configuration faults, let's report them
too. Add propagation for BAD_SUBSTREAMID, CD_FETCH, BAD_CD, WALK_EABT.
We also fix the transcoding for some existing translation faults.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v14 -> v15:
- adapt to removal of IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID
  in [PATCH v13 10/10] iommu/arm-smmu-v3: Add stall support for
  platform devices
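
For readers who want the event-to-reason mapping at a glance, the switch
added by this patch can be modelled in user space as follows. This is
only a sketch: evt_to_fault(), struct fault_desc, the REASON_* values
and the FLAG_* bits are hypothetical stand-ins for the uAPI definitions,
and their numeric values are not the uAPI ones. The event IDs are taken
from the header hunk, except EVT_ID_PERMISSION_FAULT (0x13), which comes
from the existing driver header. The PASID_VALID flag set earlier in the
handler is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* SMMUv3 event IDs (see the arm-smmu-v3.h hunk of this patch). */
enum {
	EVT_ID_BAD_SUBSTREAMID		= 0x08,
	EVT_ID_CD_FETCH			= 0x09,
	EVT_ID_BAD_CD			= 0x0a,
	EVT_ID_WALK_EABT		= 0x0b,
	EVT_ID_TRANSLATION_FAULT	= 0x10,
	EVT_ID_ADDR_SIZE_FAULT		= 0x11,
	EVT_ID_ACCESS_FAULT		= 0x12,
	EVT_ID_PERMISSION_FAULT		= 0x13,
};

/* Placeholder reason codes; values are NOT the uAPI ones. */
enum fault_reason {
	REASON_PASID_FETCH, REASON_BAD_PASID_ENTRY, REASON_PASID_INVALID,
	REASON_WALK_EABT, REASON_PTE_FETCH, REASON_PERMISSION,
	REASON_ACCESS, REASON_OOR_ADDRESS,
};

/* Placeholder flag bits mirroring the two *_ADDR_VALID flags. */
#define FLAG_ADDR_VALID		(1u << 0)
#define FLAG_FETCH_ADDR_VALID	(1u << 1)

struct fault_desc {
	enum fault_reason reason;
	uint32_t flags;
};

/* Returns false for events the patch does not report yet. */
static bool evt_to_fault(uint8_t type, struct fault_desc *out)
{
	out->flags = 0;

	switch (type) {
	case EVT_ID_TRANSLATION_FAULT:
		out->reason = REASON_PTE_FETCH;
		out->flags |= FLAG_ADDR_VALID;
		return true;
	case EVT_ID_ADDR_SIZE_FAULT:
		out->reason = REASON_OOR_ADDRESS;
		out->flags |= FLAG_ADDR_VALID;
		return true;
	case EVT_ID_ACCESS_FAULT:
		out->reason = REASON_ACCESS;
		out->flags |= FLAG_ADDR_VALID;
		return true;
	case EVT_ID_PERMISSION_FAULT:
		out->reason = REASON_PERMISSION;
		out->flags |= FLAG_ADDR_VALID;
		return true;
	case EVT_ID_BAD_SUBSTREAMID:
		out->reason = REASON_PASID_INVALID;
		return true;
	case EVT_ID_CD_FETCH:
		out->reason = REASON_PASID_FETCH;
		out->flags |= FLAG_FETCH_ADDR_VALID;
		return true;
	case EVT_ID_BAD_CD:
		out->reason = REASON_BAD_PASID_ENTRY;
		return true;
	case EVT_ID_WALK_EABT:
		out->reason = REASON_WALK_EABT;
		out->flags |= FLAG_ADDR_VALID | FLAG_FETCH_ADDR_VALID;
		return true;
	default:
		return false;
	}
}
```

Only the four translation-related events and WALK_EABT carry a valid
input address; the configuration faults (BAD_SUBSTREAMID, BAD_CD) carry
no address at all in this model.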
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 40 +++++++++++++++++++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  4 +++
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e84a7c3e8730..ddfc069c10ae 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1488,6 +1488,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
 	u32 perm = 0;
 	struct arm_smmu_master *master;
 	bool ssid_valid = evt[0] & EVTQ_0_SSV;
+	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
 	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
 	struct iommu_fault_event fault_evt = { };
 	struct iommu_fault *flt = &fault_evt.fault;
@@ -1540,8 +1541,6 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
 	} else {
 		flt->type = IOMMU_FAULT_DMA_UNRECOV;
 		flt->event = (struct iommu_fault_unrecoverable) {
-			.reason = reason,
-			.flags = IOMMU_FAULT_UNRECOV_ADDR_VALID,
 			.perm = perm,
 			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
 		};
@@ -1550,6 +1549,43 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
 			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
 			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
 		}
+
+		switch (type) {
+		case EVT_ID_TRANSLATION_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
+			break;
+		case EVT_ID_ADDR_SIZE_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
+			break;
+		case EVT_ID_ACCESS_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_ACCESS;
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
+			break;
+		case EVT_ID_PERMISSION_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
+			break;
+		case EVT_ID_BAD_SUBSTREAMID:
+			flt->event.reason = IOMMU_FAULT_REASON_PASID_INVALID;
+			break;
+		case EVT_ID_CD_FETCH:
+			flt->event.reason = IOMMU_FAULT_REASON_PASID_FETCH;
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
+			break;
+		case EVT_ID_BAD_CD:
+			flt->event.reason = IOMMU_FAULT_REASON_BAD_PASID_ENTRY;
+			break;
+		case EVT_ID_WALK_EABT:
+			flt->event.reason = IOMMU_FAULT_REASON_WALK_EABT;
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID |
+					    IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
+			break;
+		default:
+			/* TODO: report other unrecoverable faults. */
+			return -EFAULT;
+		}
 	}
 
 	mutex_lock(&smmu->streams_mutex);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 05959df01618..b914570ee5ba 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -379,6 +379,10 @@
 
 #define EVTQ_0_ID			GENMASK_ULL(7, 0)
 
+#define EVT_ID_BAD_SUBSTREAMID		0x08
+#define EVT_ID_CD_FETCH			0x09
+#define EVT_ID_BAD_CD			0x0a
+#define EVT_ID_WALK_EABT		0x0b
 #define EVT_ID_TRANSLATION_FAULT	0x10
 #define EVT_ID_ADDR_SIZE_FAULT		0x11
 #define EVT_ID_ACCESS_FAULT		0x12
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 9/9] iommu/smmuv3: Disallow nested mode in presence of HW MSI regions
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-10-27 10:44   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

Nested mode is currently not compatible with HW MSI reserved regions:
MSI transactions targeting those MSI doorbells bypass the SMMU.
The guest would have to bypass those ranges as well, but it
has no information about them.

Check at attach time that nested mode is not attempted in such a
configuration, and fail with -EINVAL otherwise.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
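The attach-time rule described above can be sketched in user space as a
scan over the device's reserved regions. Everything here is a simplified
model, not the kernel API: struct resv_region, the RESV_* enum and both
helpers are hypothetical, a plain singly linked list replaces the
kernel's list_head, and only RESV_MSI plays the role of IOMMU_RESV_MSI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified region types; only RESV_MSI models a HW MSI region. */
enum resv_type { RESV_DIRECT, RESV_RESERVED, RESV_MSI, RESV_SW_MSI };

/* Hypothetical stand-in for struct iommu_resv_region. */
struct resv_region {
	enum resv_type type;
	const struct resv_region *next;
};

/* True if any region in the list is a hardware MSI region. */
static bool has_hw_msi_region(const struct resv_region *head)
{
	for (const struct resv_region *r = head; r; r = r->next)
		if (r->type == RESV_MSI)
			return true;
	return false;
}

/* Nested domains are refused when a HW MSI region is present. */
static bool nested_attach_allowed(bool nested,
				  const struct resv_region *head)
{
	return !(nested && has_hw_msi_region(head));
}
```

In this model a software-managed MSI window (RESV_SW_MSI) does not block
a nested attach, and non-nested domains are never affected.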
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 23 +++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ddfc069c10ae..12e7d7920f27 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2488,6 +2488,23 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 	arm_smmu_install_ste_for_dev(master);
 }
 
+static bool arm_smmu_has_hw_msi_resv_region(struct device *dev)
+{
+	struct iommu_resv_region *region;
+	bool has_msi_resv_region = false;
+	LIST_HEAD(resv_regions);
+
+	iommu_get_resv_regions(dev, &resv_regions);
+	list_for_each_entry(region, &resv_regions, list) {
+		if (region->type == IOMMU_RESV_MSI) {
+			has_msi_resv_region = true;
+			break;
+		}
+	}
+	iommu_put_resv_regions(dev, &resv_regions);
+	return has_msi_resv_region;
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
@@ -2545,6 +2562,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		ret = -EINVAL;
 		goto out_unlock;
 	}
+	/* Nested mode is not compatible with MSI HW reserved regions */
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED &&
+	    arm_smmu_has_hw_msi_resv_region(dev)) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
 
 	master->domain = smmu_domain;
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [RFC v16 9/9] iommu/smmuv3: Disallow nested mode in presence of HW MSI regions
@ 2021-10-27 10:44   ` Eric Auger
  0 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-10-27 10:44 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: kevin.tian, jacob.jun.pan, ashok.raj, chenxiang66, maz, vdumpa,
	nicoleotsuka, vivek.gautam, alex.williamson, yi.l.liu, nicolinc,
	vsethi, zhangfei.gao, sumitg, lushenming, wangxingang5

Nested mode is currently not compatible with HW MSI reserved regions:
MSI transactions targeting those MSI doorbells bypass the SMMU.
The guest would therefore also have to bypass those ranges, but it
has no information about them.

Let's check that nested mode is not attempted in such a configuration.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 23 +++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index ddfc069c10ae..12e7d7920f27 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2488,6 +2488,23 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 	arm_smmu_install_ste_for_dev(master);
 }
 
+static bool arm_smmu_has_hw_msi_resv_region(struct device *dev)
+{
+	struct iommu_resv_region *region;
+	bool has_msi_resv_region = false;
+	LIST_HEAD(resv_regions);
+
+	iommu_get_resv_regions(dev, &resv_regions);
+	list_for_each_entry(region, &resv_regions, list) {
+		if (region->type == IOMMU_RESV_MSI) {
+			has_msi_resv_region = true;
+			break;
+		}
+	}
+	iommu_put_resv_regions(dev, &resv_regions);
+	return has_msi_resv_region;
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
@@ -2545,6 +2562,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		ret = -EINVAL;
 		goto out_unlock;
 	}
+	/* Nested mode is not compatible with MSI HW reserved regions */
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED &&
+	    arm_smmu_has_hw_msi_resv_region(dev)) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
 
 	master->domain = smmu_domain;
 
-- 
2.26.3
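
The check added by this patch can be modelled outside the kernel. What
follows is a minimal user-space sketch, not the driver code: the types
resv_region and domain_stage, the RESV_* and STAGE_* constants and the
helper check_attach() are hypothetical stand-ins for struct
iommu_resv_region, the ARM_SMMU_DOMAIN_* stages and the attach path. It
only illustrates that a nested attach is refused when a hardware MSI
region is present, while a software-managed MSI window does not trigger
the refusal.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's reserved-region types. */
enum resv_type { RESV_DIRECT, RESV_RESERVED, RESV_MSI, RESV_SW_MSI };

/* Simplified singly linked list instead of the kernel's list_head. */
struct resv_region {
	enum resv_type type;
	struct resv_region *next;
};

/* Hypothetical stand-in for the SMMU domain stage. */
enum domain_stage { STAGE_S1, STAGE_S2, STAGE_NESTED };

/* Return true as soon as one hardware MSI region is found. */
static bool has_hw_msi_resv_region(const struct resv_region *head)
{
	const struct resv_region *r;

	for (r = head; r; r = r->next)
		if (r->type == RESV_MSI)
			return true;
	return false;
}

/* Mirror of the attach-time test: only nested domains are rejected. */
static int check_attach(enum domain_stage stage,
			const struct resv_region *head)
{
	if (stage == STAGE_NESTED && has_hw_msi_resv_region(head))
		return -EINVAL;
	return 0;
}
```

In this model, a device whose list holds only RESV_SW_MSI entries still
attaches in nested mode, which matches the patch testing for
IOMMU_RESV_MSI alone.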

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: [RFC v16 8/9] iommu/smmuv3: report additional recoverable faults
  2021-10-27 10:44   ` Eric Auger
  (?)
  (?)
@ 2021-10-27 21:05   ` kernel test robot
  -1 siblings, 0 replies; 116+ messages in thread
From: kernel test robot @ 2021-10-27 21:05 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 18692 bytes --]

Hi Eric,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on joro-iommu/next]
[also build test WARNING on v5.15-rc7 next-20211027]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: arm64-defconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/7ab2730940794371e7d3fdd2b60b5b2639265ce6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
        git checkout 7ab2730940794371e7d3fdd2b60b5b2639265ce6
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=arm64 

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c: In function 'arm_smmu_handle_evt':
>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1487:13: warning: variable 'reason' set but not used [-Wunused-but-set-variable]
    1487 |         u32 reason;
         |             ^~~~~~


vim +/reason +1487 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

cdf315f907d46a drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-04-01  1482  
48ec83bcbcf509 drivers/iommu/arm-smmu-v3.c                 Will Deacon           2015-05-27  1483  /* IRQ and event handlers */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1484  static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1485  {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1486  	int ret;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26 @1487  	u32 reason;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1488  	u32 perm = 0;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1489  	struct arm_smmu_master *master;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1490  	bool ssid_valid = evt[0] & EVTQ_0_SSV;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1491  	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1492  	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1493  	struct iommu_fault_event fault_evt = { };
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1494  	struct iommu_fault *flt = &fault_evt.fault;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1495  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1496  	switch (FIELD_GET(EVTQ_0_ID, evt[0])) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1497  	case EVT_ID_TRANSLATION_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1498  		reason = IOMMU_FAULT_REASON_PTE_FETCH;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1499  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1500  	case EVT_ID_ADDR_SIZE_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1501  		reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1502  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1503  	case EVT_ID_ACCESS_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1504  		reason = IOMMU_FAULT_REASON_ACCESS;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1505  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1506  	case EVT_ID_PERMISSION_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1507  		reason = IOMMU_FAULT_REASON_PERMISSION;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1508  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1509  	default:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1510  		return -EOPNOTSUPP;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1511  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1512  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1513  	/* Stage-2 is always pinned at the moment */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1514  	if (evt[1] & EVTQ_1_S2)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1515  		return -EFAULT;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1516  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1517  	if (evt[1] & EVTQ_1_RnW)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1518  		perm |= IOMMU_FAULT_PERM_READ;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1519  	else
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1520  		perm |= IOMMU_FAULT_PERM_WRITE;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1521  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1522  	if (evt[1] & EVTQ_1_InD)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1523  		perm |= IOMMU_FAULT_PERM_EXEC;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1524  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1525  	if (evt[1] & EVTQ_1_PnU)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1526  		perm |= IOMMU_FAULT_PERM_PRIV;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1527  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1528  	if (evt[1] & EVTQ_1_STALL) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1529  		flt->type = IOMMU_FAULT_PAGE_REQ;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1530  		flt->prm = (struct iommu_fault_page_request) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1531  			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1532  			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1533  			.perm = perm,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1534  			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1535  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1536  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1537  		if (ssid_valid) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1538  			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1539  			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1540  		}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1541  	} else {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1542  		flt->type = IOMMU_FAULT_DMA_UNRECOV;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1543  		flt->event = (struct iommu_fault_unrecoverable) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1544  			.perm = perm,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1545  			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1546  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1547  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1548  		if (ssid_valid) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1549  			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1550  			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1551  		}
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1552  
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1553  		switch (type) {
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1554  		case EVT_ID_TRANSLATION_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1555  			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1556  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1557  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1558  		case EVT_ID_ADDR_SIZE_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1559  			flt->event.reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1560  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1561  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1562  		case EVT_ID_ACCESS_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1563  			flt->event.reason = IOMMU_FAULT_REASON_ACCESS;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1564  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1565  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1566  		case EVT_ID_PERMISSION_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1567  			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1568  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1569  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1570  		case EVT_ID_BAD_SUBSTREAMID:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1571  			flt->event.reason = IOMMU_FAULT_REASON_PASID_INVALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1572  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1573  		case EVT_ID_CD_FETCH:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1574  			flt->event.reason = IOMMU_FAULT_REASON_PASID_FETCH;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1575  			flt->event.flags |= IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1576  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1577  		case EVT_ID_BAD_CD:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1578  			flt->event.reason = IOMMU_FAULT_REASON_BAD_PASID_ENTRY;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1579  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1580  		case EVT_ID_WALK_EABT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1581  			flt->event.reason = IOMMU_FAULT_REASON_WALK_EABT;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1582  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID |
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1583  					    IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1584  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1585  		default:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1586  			/* TODO: report other unrecoverable faults. */
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1587  			return -EFAULT;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1588  		}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1589  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1590  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1591  	mutex_lock(&smmu->streams_mutex);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1592  	master = arm_smmu_find_master(smmu, sid);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1593  	if (!master) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1594  		ret = -EINVAL;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1595  		goto out_unlock;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1596  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1597  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1598  	ret = iommu_report_device_fault(master->dev, &fault_evt);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1599  	if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1600  		/* Nobody cared, abort the access */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1601  		struct iommu_page_response resp = {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1602  			.pasid		= flt->prm.pasid,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1603  			.grpid		= flt->prm.grpid,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1604  			.code		= IOMMU_PAGE_RESP_FAILURE,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1605  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1606  		arm_smmu_page_response(master->dev, &fault_evt, &resp);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1607  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1608  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1609  out_unlock:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1610  	mutex_unlock(&smmu->streams_mutex);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1611  	return ret;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1612  }
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1613  
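
In the listing above, the first switch still assigns 'reason', but the
value is never read: the second switch writes flt->event.reason
directly. As listed, the first switch also returns -EOPNOTSUPP for any
event ID outside its four cases, so the added BAD_SUBSTREAMID, CD_FETCH,
BAD_CD and WALK_EABT cases appear unreachable. One possible way to
address both points is a single switch. The sketch below is a user-space
model under stated assumptions: the enum values, the flag bits, struct
unrecov_fault and classify_unrecov() are placeholders rather than the
kernel's definitions, and only the unrecoverable path is modelled.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder identifiers and values, not the kernel's definitions. */
enum evt_id {
	EVT_TRANSLATION_FAULT = 1,
	EVT_ADDR_SIZE_FAULT,
	EVT_ACCESS_FAULT,
	EVT_PERMISSION_FAULT,
	EVT_BAD_SUBSTREAMID,
	EVT_CD_FETCH,
	EVT_BAD_CD,
	EVT_WALK_EABT,
};

enum fault_reason {
	REASON_PTE_FETCH = 1,
	REASON_OOR_ADDRESS,
	REASON_ACCESS,
	REASON_PERMISSION,
	REASON_PASID_INVALID,
	REASON_PASID_FETCH,
	REASON_BAD_PASID_ENTRY,
	REASON_WALK_EABT,
};

#define UNRECOV_ADDR_VALID		(1u << 0)
#define UNRECOV_FETCH_ADDR_VALID	(1u << 1)

struct unrecov_fault {
	uint32_t reason;
	uint32_t flags;
};

/*
 * Single switch: the reason is stored straight into the fault record,
 * so no intermediate 'reason' local is needed.
 */
static int classify_unrecov(uint8_t type, struct unrecov_fault *flt)
{
	switch (type) {
	case EVT_TRANSLATION_FAULT:
		flt->reason = REASON_PTE_FETCH;
		flt->flags |= UNRECOV_ADDR_VALID;
		break;
	case EVT_ADDR_SIZE_FAULT:
		flt->reason = REASON_OOR_ADDRESS;
		flt->flags |= UNRECOV_ADDR_VALID;
		break;
	case EVT_ACCESS_FAULT:
		flt->reason = REASON_ACCESS;
		flt->flags |= UNRECOV_ADDR_VALID;
		break;
	case EVT_PERMISSION_FAULT:
		flt->reason = REASON_PERMISSION;
		flt->flags |= UNRECOV_ADDR_VALID;
		break;
	case EVT_BAD_SUBSTREAMID:
		flt->reason = REASON_PASID_INVALID;
		break;
	case EVT_CD_FETCH:
		flt->reason = REASON_PASID_FETCH;
		flt->flags |= UNRECOV_FETCH_ADDR_VALID;
		break;
	case EVT_BAD_CD:
		flt->reason = REASON_BAD_PASID_ENTRY;
		break;
	case EVT_WALK_EABT:
		flt->reason = REASON_WALK_EABT;
		flt->flags |= UNRECOV_ADDR_VALID | UNRECOV_FETCH_ADDR_VALID;
		break;
	default:
		/* Unknown event: leave the record untouched. */
		return -EFAULT;
	}
	return 0;
}
```

Whether stalled (page request) events still need their own filter on
the event ID is a separate decision that this sketch does not cover.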

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 56389 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 2/9] iommu: Introduce iommu_get_nesting
  2021-10-27 10:44   ` Eric Auger
  (?)
  (?)
@ 2021-10-27 22:15   ` kernel test robot
  -1 siblings, 0 replies; 116+ messages in thread
From: kernel test robot @ 2021-10-27 22:15 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4742 bytes --]

Hi Eric,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on joro-iommu/next]
[also build test ERROR on v5.15-rc7 next-20211027]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: ia64-defconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/bd101c49e9c42dc3299df01f6aabc15433f147a8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
        git checkout bd101c49e9c42dc3299df01f6aabc15433f147a8
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=ia64 SHELL=/bin/bash

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/bitops.h:7,
                    from include/linux/bitmap.h:8,
                    from drivers/iommu/intel/iommu.c:17:
   drivers/iommu/intel/iommu.c: In function 'intel_iommu_get_nesting':
>> drivers/iommu/intel/iommu.c:5610:48: error: 'flags' undeclared (first use in this function)
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |                                                ^~~~~
   include/linux/typecheck.h:11:16: note: in definition of macro 'typecheck'
      11 |         typeof(x) __dummy2; \
         |                ^
   include/linux/spinlock.h:393:9: note: in expansion of macro 'raw_spin_lock_irqsave'
     393 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |         ^~~~~~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:9: note: in expansion of macro 'spin_lock_irqsave'
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |         ^~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:48: note: each undeclared identifier is reported only once for each function it appears in
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |                                                ^~~~~
   include/linux/typecheck.h:11:16: note: in definition of macro 'typecheck'
      11 |         typeof(x) __dummy2; \
         |                ^
   include/linux/spinlock.h:393:9: note: in expansion of macro 'raw_spin_lock_irqsave'
     393 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |         ^~~~~~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:9: note: in expansion of macro 'spin_lock_irqsave'
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |         ^~~~~~~~~~~~~~~~~
   include/linux/typecheck.h:12:25: warning: comparison of distinct pointer types lacks a cast
      12 |         (void)(&__dummy == &__dummy2); \
         |                         ^~
   include/linux/spinlock.h:255:17: note: in expansion of macro 'typecheck'
     255 |                 typecheck(unsigned long, flags);        \
         |                 ^~~~~~~~~
   include/linux/spinlock.h:393:9: note: in expansion of macro 'raw_spin_lock_irqsave'
     393 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |         ^~~~~~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:9: note: in expansion of macro 'spin_lock_irqsave'
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |         ^~~~~~~~~~~~~~~~~


vim +/flags +5610 drivers/iommu/intel/iommu.c

  5604	
  5605	static bool intel_iommu_get_nesting(struct iommu_domain *domain)
  5606	{
  5607		struct dmar_domain *dmar_domain = to_dmar_domain(domain);
  5608		bool nesting;
  5609	
> 5610		spin_lock_irqsave(&device_domain_lock, flags);
  5611		nesting =  dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE &&
  5612			   !(dmar_domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL);
  5613		spin_unlock_irqrestore(&device_domain_lock, flags);
  5614		return nesting;
  5615	}
  5616	
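
The error arises because spin_lock_irqsave() stores the saved interrupt
state in its second argument, which must be a local 'unsigned long' that
the function declares itself. The sketch below is a user-space model of
the likely fix, assuming nothing beyond the listing: spinlock_t, the two
lock macros, the flag bit values and the function name get_nesting() are
hypothetical stand-ins so that the fragment compiles on its own.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel primitives. */
typedef struct { int held; } spinlock_t;

#define spin_lock_irqsave(lock, flags) \
	do { (flags) = 1UL; (lock)->held = 1; } while (0)
#define spin_unlock_irqrestore(lock, flags) \
	do { (void)(flags); (lock)->held = 0; } while (0)

/* Placeholder bit values, not the kernel's definitions. */
#define DOMAIN_FLAG_USE_FIRST_LEVEL	(1 << 1)
#define DOMAIN_FLAG_NESTING_MODE	(1 << 2)

struct dmar_domain { int flags; };

static spinlock_t device_domain_lock;

static bool get_nesting(struct dmar_domain *dmar_domain)
{
	unsigned long flags;	/* the declaration missing at line 5610 */
	bool nesting;

	spin_lock_irqsave(&device_domain_lock, flags);
	/* Nested only if nesting is requested and first level is not used. */
	nesting = (dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE) &&
		  !(dmar_domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL);
	spin_unlock_irqrestore(&device_domain_lock, flags);
	return nesting;
}
```

The local 'flags' is distinct from the dmar_domain->flags member; the
two only share a name.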

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 19957 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 8/9] iommu/smmuv3: report additional recoverable faults
  2021-10-27 10:44   ` Eric Auger
@ 2021-10-27 22:41     ` kernel test robot
  -1 siblings, 0 replies; 116+ messages in thread
From: kernel test robot @ 2021-10-27 22:41 UTC (permalink / raw)
  To: Eric Auger; +Cc: llvm, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 18635 bytes --]

Hi Eric,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on joro-iommu/next]
[also build test WARNING on v5.15-rc7 next-20211027]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: arm64-randconfig-r033-20211027 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 5db7568a6a1fcb408eb8988abdaff2a225a8eb72)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/7ab2730940794371e7d3fdd2b60b5b2639265ce6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
        git checkout 7ab2730940794371e7d3fdd2b60b5b2639265ce6
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=arm64 

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1487:6: warning: variable 'reason' set but not used [-Wunused-but-set-variable]
           u32 reason;
               ^
   1 warning generated.


vim +/reason +1487 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

cdf315f907d46a drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-04-01  1482  
48ec83bcbcf509 drivers/iommu/arm-smmu-v3.c                 Will Deacon           2015-05-27  1483  /* IRQ and event handlers */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1484  static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1485  {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1486  	int ret;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26 @1487  	u32 reason;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1488  	u32 perm = 0;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1489  	struct arm_smmu_master *master;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1490  	bool ssid_valid = evt[0] & EVTQ_0_SSV;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1491  	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1492  	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1493  	struct iommu_fault_event fault_evt = { };
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1494  	struct iommu_fault *flt = &fault_evt.fault;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1495  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1496  	switch (FIELD_GET(EVTQ_0_ID, evt[0])) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1497  	case EVT_ID_TRANSLATION_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1498  		reason = IOMMU_FAULT_REASON_PTE_FETCH;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1499  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1500  	case EVT_ID_ADDR_SIZE_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1501  		reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1502  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1503  	case EVT_ID_ACCESS_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1504  		reason = IOMMU_FAULT_REASON_ACCESS;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1505  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1506  	case EVT_ID_PERMISSION_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1507  		reason = IOMMU_FAULT_REASON_PERMISSION;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1508  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1509  	default:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1510  		return -EOPNOTSUPP;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1511  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1512  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1513  	/* Stage-2 is always pinned at the moment */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1514  	if (evt[1] & EVTQ_1_S2)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1515  		return -EFAULT;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1516  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1517  	if (evt[1] & EVTQ_1_RnW)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1518  		perm |= IOMMU_FAULT_PERM_READ;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1519  	else
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1520  		perm |= IOMMU_FAULT_PERM_WRITE;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1521  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1522  	if (evt[1] & EVTQ_1_InD)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1523  		perm |= IOMMU_FAULT_PERM_EXEC;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1524  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1525  	if (evt[1] & EVTQ_1_PnU)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1526  		perm |= IOMMU_FAULT_PERM_PRIV;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1527  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1528  	if (evt[1] & EVTQ_1_STALL) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1529  		flt->type = IOMMU_FAULT_PAGE_REQ;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1530  		flt->prm = (struct iommu_fault_page_request) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1531  			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1532  			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1533  			.perm = perm,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1534  			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1535  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1536  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1537  		if (ssid_valid) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1538  			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1539  			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1540  		}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1541  	} else {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1542  		flt->type = IOMMU_FAULT_DMA_UNRECOV;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1543  		flt->event = (struct iommu_fault_unrecoverable) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1544  			.perm = perm,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1545  			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1546  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1547  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1548  		if (ssid_valid) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1549  			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1550  			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1551  		}
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1552  
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1553  		switch (type) {
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1554  		case EVT_ID_TRANSLATION_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1555  			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1556  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1557  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1558  		case EVT_ID_ADDR_SIZE_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1559  			flt->event.reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1560  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1561  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1562  		case EVT_ID_ACCESS_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1563  			flt->event.reason = IOMMU_FAULT_REASON_ACCESS;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1564  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1565  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1566  		case EVT_ID_PERMISSION_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1567  			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1568  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1569  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1570  		case EVT_ID_BAD_SUBSTREAMID:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1571  			flt->event.reason = IOMMU_FAULT_REASON_PASID_INVALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1572  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1573  		case EVT_ID_CD_FETCH:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1574  			flt->event.reason = IOMMU_FAULT_REASON_PASID_FETCH;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1575  			flt->event.flags |= IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1576  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1577  		case EVT_ID_BAD_CD:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1578  			flt->event.reason = IOMMU_FAULT_REASON_BAD_PASID_ENTRY;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1579  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1580  		case EVT_ID_WALK_EABT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1581  			flt->event.reason = IOMMU_FAULT_REASON_WALK_EABT;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1582  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID |
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1583  					    IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1584  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1585  		default:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1586  			/* TODO: report other unrecoverable faults. */
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1587  			return -EFAULT;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1588  		}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1589  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1590  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1591  	mutex_lock(&smmu->streams_mutex);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1592  	master = arm_smmu_find_master(smmu, sid);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1593  	if (!master) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1594  		ret = -EINVAL;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1595  		goto out_unlock;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1596  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1597  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1598  	ret = iommu_report_device_fault(master->dev, &fault_evt);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1599  	if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1600  		/* Nobody cared, abort the access */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1601  		struct iommu_page_response resp = {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1602  			.pasid		= flt->prm.pasid,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1603  			.grpid		= flt->prm.grpid,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1604  			.code		= IOMMU_PAGE_RESP_FAILURE,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1605  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1606  		arm_smmu_page_response(master->dev, &fault_evt, &resp);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1607  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1608  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1609  out_unlock:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1610  	mutex_unlock(&smmu->streams_mutex);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1611  	return ret;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1612  }
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1613  
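
The listing above reads every event field through FIELD_GET() with an EVTQ_* mask. For readers unfamiliar with that idiom, here is a minimal userspace sketch of mask-based field extraction. It is only a runtime model: the kernel's FIELD_GET() in <linux/bitfield.h> is a compile-time macro, and model_field_get() and model_genmask() are hypothetical names made up for this illustration, not driver code. No real EVTQ bit layout is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a mask covering bits hi..lo inclusive. */
uint64_t model_genmask(unsigned int hi, unsigned int lo)
{
	assert(lo <= hi && hi <= 63);
	return (~UINT64_C(0) >> (63 - hi)) & (~UINT64_C(0) << lo);
}

/*
 * Hypothetical runtime model of field extraction: keep only the bits
 * selected by a contiguous, non-zero mask, then shift them down so the
 * lowest bit of the field lands at bit 0.
 */
uint64_t model_field_get(uint64_t mask, uint64_t val)
{
	unsigned int shift = 0;

	assert(mask != 0);
	/* Locate the lowest set bit of the mask. */
	while (!((mask >> shift) & 1))
		shift++;
	return (val & mask) >> shift;
}
```

With these helpers, model_field_get(model_genmask(15, 8), 0x1234) yields 0x12, which is the same shape of operation as the sid, ssid and type reads in the listing.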

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 45135 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 8/9] iommu/smmuv3: report additional recoverable faults
@ 2021-10-27 22:41     ` kernel test robot
  0 siblings, 0 replies; 116+ messages in thread
From: kernel test robot @ 2021-10-27 22:41 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 18811 bytes --]

Hi Eric,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on joro-iommu/next]
[also build test WARNING on v5.15-rc7 next-20211027]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: arm64-randconfig-r033-20211027 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 5db7568a6a1fcb408eb8988abdaff2a225a8eb72)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/7ab2730940794371e7d3fdd2b60b5b2639265ce6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
        git checkout 7ab2730940794371e7d3fdd2b60b5b2639265ce6
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 ARCH=arm64 

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:1487:6: warning: variable 'reason' set but not used [-Wunused-but-set-variable]
           u32 reason;
               ^
   1 warning generated.
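
As far as the listing in this report shows, 'reason' is assigned only in the first switch and never read, because the second switch (type) writes flt->event.reason directly. The first switch also returns -EOPNOTSUPP for any ID outside its four cases, so the additional cases in the second switch look unreachable. One possible direction is to decode the reason in a single place. The standalone sketch below models that idea in userspace C. All MODEL_* names, their numeric values and model_decode_unrecov() are hypothetical stand-ins chosen for this illustration. They are not the driver's identifiers and this is not a proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EVT_ID_* values; the numbers are arbitrary. */
enum model_evt_id {
	MODEL_EVT_TRANSLATION_FAULT = 1,
	MODEL_EVT_ADDR_SIZE_FAULT,
	MODEL_EVT_ACCESS_FAULT,
	MODEL_EVT_PERMISSION_FAULT,
	MODEL_EVT_BAD_SUBSTREAMID,
	MODEL_EVT_CD_FETCH,
	MODEL_EVT_BAD_CD,
	MODEL_EVT_WALK_EABT,
	MODEL_EVT_OTHER,
};

/* Hypothetical stand-ins for the IOMMU_FAULT_REASON_* values. */
enum model_reason {
	MODEL_REASON_PTE_FETCH = 1,
	MODEL_REASON_OOR_ADDRESS,
	MODEL_REASON_ACCESS,
	MODEL_REASON_PERMISSION,
	MODEL_REASON_PASID_INVALID,
	MODEL_REASON_PASID_FETCH,
	MODEL_REASON_BAD_PASID_ENTRY,
	MODEL_REASON_WALK_EABT,
};

#define MODEL_ADDR_VALID	(1u << 0)
#define MODEL_FETCH_ADDR_VALID	(1u << 1)

struct model_unrecov {
	uint32_t reason;
	uint32_t flags;
};

/*
 * Single decode step: the reason is written straight into the event,
 * so no intermediate 'reason' local is left set but unused.
 * Returns 0 on success, -EFAULT for event types it does not report.
 */
int model_decode_unrecov(enum model_evt_id type, struct model_unrecov *ev)
{
	assert(ev != NULL);

	switch (type) {
	case MODEL_EVT_TRANSLATION_FAULT:
		ev->reason = MODEL_REASON_PTE_FETCH;
		ev->flags |= MODEL_ADDR_VALID;
		return 0;
	case MODEL_EVT_ADDR_SIZE_FAULT:
		ev->reason = MODEL_REASON_OOR_ADDRESS;
		ev->flags |= MODEL_ADDR_VALID;
		return 0;
	case MODEL_EVT_ACCESS_FAULT:
		ev->reason = MODEL_REASON_ACCESS;
		ev->flags |= MODEL_ADDR_VALID;
		return 0;
	case MODEL_EVT_PERMISSION_FAULT:
		ev->reason = MODEL_REASON_PERMISSION;
		ev->flags |= MODEL_ADDR_VALID;
		return 0;
	case MODEL_EVT_BAD_SUBSTREAMID:
		ev->reason = MODEL_REASON_PASID_INVALID;
		return 0;
	case MODEL_EVT_CD_FETCH:
		ev->reason = MODEL_REASON_PASID_FETCH;
		ev->flags |= MODEL_FETCH_ADDR_VALID;
		return 0;
	case MODEL_EVT_BAD_CD:
		ev->reason = MODEL_REASON_BAD_PASID_ENTRY;
		return 0;
	case MODEL_EVT_WALK_EABT:
		ev->reason = MODEL_REASON_WALK_EABT;
		ev->flags |= MODEL_ADDR_VALID | MODEL_FETCH_ADDR_VALID;
		return 0;
	default:
		return -EFAULT;
	}
}
```

In the real handler the stalled (page request) path would still need its own check of which event types it accepts. That part is left out of this model.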


vim +/reason +1487 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

cdf315f907d46a drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-04-01  1482  
48ec83bcbcf509 drivers/iommu/arm-smmu-v3.c                 Will Deacon           2015-05-27  1483  /* IRQ and event handlers */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1484  static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1485  {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1486  	int ret;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26 @1487  	u32 reason;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1488  	u32 perm = 0;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1489  	struct arm_smmu_master *master;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1490  	bool ssid_valid = evt[0] & EVTQ_0_SSV;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1491  	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1492  	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1493  	struct iommu_fault_event fault_evt = { };
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1494  	struct iommu_fault *flt = &fault_evt.fault;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1495  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1496  	switch (FIELD_GET(EVTQ_0_ID, evt[0])) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1497  	case EVT_ID_TRANSLATION_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1498  		reason = IOMMU_FAULT_REASON_PTE_FETCH;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1499  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1500  	case EVT_ID_ADDR_SIZE_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1501  		reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1502  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1503  	case EVT_ID_ACCESS_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1504  		reason = IOMMU_FAULT_REASON_ACCESS;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1505  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1506  	case EVT_ID_PERMISSION_FAULT:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1507  		reason = IOMMU_FAULT_REASON_PERMISSION;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1508  		break;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1509  	default:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1510  		return -EOPNOTSUPP;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1511  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1512  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1513  	/* Stage-2 is always pinned at the moment */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1514  	if (evt[1] & EVTQ_1_S2)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1515  		return -EFAULT;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1516  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1517  	if (evt[1] & EVTQ_1_RnW)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1518  		perm |= IOMMU_FAULT_PERM_READ;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1519  	else
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1520  		perm |= IOMMU_FAULT_PERM_WRITE;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1521  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1522  	if (evt[1] & EVTQ_1_InD)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1523  		perm |= IOMMU_FAULT_PERM_EXEC;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1524  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1525  	if (evt[1] & EVTQ_1_PnU)
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1526  		perm |= IOMMU_FAULT_PERM_PRIV;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1527  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1528  	if (evt[1] & EVTQ_1_STALL) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1529  		flt->type = IOMMU_FAULT_PAGE_REQ;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1530  		flt->prm = (struct iommu_fault_page_request) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1531  			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1532  			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1533  			.perm = perm,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1534  			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1535  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1536  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1537  		if (ssid_valid) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1538  			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1539  			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1540  		}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1541  	} else {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1542  		flt->type = IOMMU_FAULT_DMA_UNRECOV;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1543  		flt->event = (struct iommu_fault_unrecoverable) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1544  			.perm = perm,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1545  			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1546  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1547  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1548  		if (ssid_valid) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1549  			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1550  			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1551  		}
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1552  
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1553  		switch (type) {
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1554  		case EVT_ID_TRANSLATION_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1555  			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1556  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1557  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1558  		case EVT_ID_ADDR_SIZE_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1559  			flt->event.reason = IOMMU_FAULT_REASON_OOR_ADDRESS;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1560  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1561  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1562  		case EVT_ID_ACCESS_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1563  			flt->event.reason = IOMMU_FAULT_REASON_ACCESS;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1564  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1565  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1566  		case EVT_ID_PERMISSION_FAULT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1567  			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1568  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1569  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1570  		case EVT_ID_BAD_SUBSTREAMID:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1571  			flt->event.reason = IOMMU_FAULT_REASON_PASID_INVALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1572  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1573  		case EVT_ID_CD_FETCH:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1574  			flt->event.reason = IOMMU_FAULT_REASON_PASID_FETCH;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1575  			flt->event.flags |= IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1576  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1577  		case EVT_ID_BAD_CD:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1578  			flt->event.reason = IOMMU_FAULT_REASON_BAD_PASID_ENTRY;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1579  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1580  		case EVT_ID_WALK_EABT:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1581  			flt->event.reason = IOMMU_FAULT_REASON_WALK_EABT;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1582  			flt->event.flags |= IOMMU_FAULT_UNRECOV_ADDR_VALID |
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1583  					    IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1584  			break;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1585  		default:
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1586  			/* TODO: report other unrecoverable faults. */
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1587  			return -EFAULT;
7ab27309407943 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Eric Auger            2021-10-27  1588  		}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1589  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1590  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1591  	mutex_lock(&smmu->streams_mutex);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1592  	master = arm_smmu_find_master(smmu, sid);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1593  	if (!master) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1594  		ret = -EINVAL;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1595  		goto out_unlock;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1596  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1597  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1598  	ret = iommu_report_device_fault(master->dev, &fault_evt);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1599  	if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1600  		/* Nobody cared, abort the access */
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1601  		struct iommu_page_response resp = {
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1602  			.pasid		= flt->prm.pasid,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1603  			.grpid		= flt->prm.grpid,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1604  			.code		= IOMMU_PAGE_RESP_FAILURE,
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1605  		};
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1606  		arm_smmu_page_response(master->dev, &fault_evt, &resp);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1607  	}
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1608  
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1609  out_unlock:
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1610  	mutex_unlock(&smmu->streams_mutex);
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1611  	return ret;
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1612  }
395ad89d11fd93 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c Jean-Philippe Brucker 2021-05-26  1613  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 45135 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 2/9] iommu: Introduce iommu_get_nesting
  2021-10-27 10:44   ` Eric Auger
                     ` (2 preceding siblings ...)
  (?)
@ 2021-10-28  3:22   ` kernel test robot
  -1 siblings, 0 replies; 116+ messages in thread
From: kernel test robot @ 2021-10-28  3:22 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5732 bytes --]

Hi Eric,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on joro-iommu/next]
[also build test ERROR on v5.15-rc7 next-20211027]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: ia64-randconfig-r033-20211027 (attached as .config)
compiler: ia64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/bd101c49e9c42dc3299df01f6aabc15433f147a8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Eric-Auger/SMMUv3-Nested-Stage-Setup-IOMMU-part/20211027-184900
        git checkout bd101c49e9c42dc3299df01f6aabc15433f147a8
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=ia64 SHELL=/bin/bash drivers/iommu/intel/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from arch/ia64/include/asm/pgtable.h:153,
                    from include/linux/pgtable.h:6,
                    from arch/ia64/include/asm/uaccess.h:40,
                    from include/linux/uaccess.h:11,
                    from include/linux/sched/task.h:11,
                    from include/linux/sched/signal.h:9,
                    from include/linux/rcuwait.h:6,
                    from include/linux/percpu-rwsem.h:7,
                    from include/linux/fs.h:33,
                    from include/linux/debugfs.h:15,
                    from drivers/iommu/intel/iommu.c:18:
   arch/ia64/include/asm/mmu_context.h: In function 'reload_context':
   arch/ia64/include/asm/mmu_context.h:127:48: warning: variable 'old_rr4' set but not used [-Wunused-but-set-variable]
     127 |         unsigned long rr0, rr1, rr2, rr3, rr4, old_rr4;
         |                                                ^~~~~~~
   In file included from include/linux/bitops.h:7,
                    from include/linux/bitmap.h:8,
                    from drivers/iommu/intel/iommu.c:17:
   drivers/iommu/intel/iommu.c: In function 'intel_iommu_get_nesting':
>> drivers/iommu/intel/iommu.c:5610:48: error: 'flags' undeclared (first use in this function)
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |                                                ^~~~~
   include/linux/typecheck.h:11:16: note: in definition of macro 'typecheck'
      11 |         typeof(x) __dummy2; \
         |                ^
   include/linux/spinlock.h:393:9: note: in expansion of macro 'raw_spin_lock_irqsave'
     393 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |         ^~~~~~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:9: note: in expansion of macro 'spin_lock_irqsave'
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |         ^~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:48: note: each undeclared identifier is reported only once for each function it appears in
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |                                                ^~~~~
   include/linux/typecheck.h:11:16: note: in definition of macro 'typecheck'
      11 |         typeof(x) __dummy2; \
         |                ^
   include/linux/spinlock.h:393:9: note: in expansion of macro 'raw_spin_lock_irqsave'
     393 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |         ^~~~~~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:9: note: in expansion of macro 'spin_lock_irqsave'
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |         ^~~~~~~~~~~~~~~~~
>> include/linux/typecheck.h:12:25: warning: comparison of distinct pointer types lacks a cast
      12 |         (void)(&__dummy == &__dummy2); \
         |                         ^~
   include/linux/spinlock.h:255:17: note: in expansion of macro 'typecheck'
     255 |                 typecheck(unsigned long, flags);        \
         |                 ^~~~~~~~~
   include/linux/spinlock.h:393:9: note: in expansion of macro 'raw_spin_lock_irqsave'
     393 |         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
         |         ^~~~~~~~~~~~~~~~~~~~~
   drivers/iommu/intel/iommu.c:5610:9: note: in expansion of macro 'spin_lock_irqsave'
    5610 |         spin_lock_irqsave(&device_domain_lock, flags);
         |         ^~~~~~~~~~~~~~~~~


vim +/flags +5610 drivers/iommu/intel/iommu.c

  5604	
  5605	static bool intel_iommu_get_nesting(struct iommu_domain *domain)
  5606	{
  5607		struct dmar_domain *dmar_domain = to_dmar_domain(domain);
  5608		bool nesting;
  5609	
> 5610		spin_lock_irqsave(&device_domain_lock, flags);
  5611		nesting =  dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE &&
  5612			   !(dmar_domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL);
  5613		spin_unlock_irqrestore(&device_domain_lock, flags);
  5614		return nesting;
  5615	}
  5616	
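
The error above comes from `flags` being used without a declaration. A minimal sketch of the likely fix, assuming the intent was the usual irqsave pattern, is to add an `unsigned long flags` local; the typecheck warning on the same line should then disappear too. The lock type, the two lock macros and the flag values below are userspace stand-ins (assumptions, not the kernel definitions) so that the fragment builds outside the tree, and `get_nesting` is a hypothetical name used only for illustration.

```c
#include <stdbool.h>

/* Userspace stand-ins (assumptions) for the kernel primitives. */
typedef struct { int locked; } spinlock_t;
#define spin_lock_irqsave(lock, flags) \
	do { (flags) = 0UL; (lock)->locked = 1; } while (0)
#define spin_unlock_irqrestore(lock, flags) \
	do { (void)(flags); (lock)->locked = 0; } while (0)

/* Illustrative bit values; the real ones live in the Intel IOMMU headers. */
#define DOMAIN_FLAG_USE_FIRST_LEVEL	(1 << 0)
#define DOMAIN_FLAG_NESTING_MODE	(1 << 1)

struct dmar_domain { int flags; };

static spinlock_t device_domain_lock;

static bool get_nesting(struct dmar_domain *dmar_domain)
{
	unsigned long flags;	/* the declaration missing at line 5610 */
	bool nesting;

	spin_lock_irqsave(&device_domain_lock, flags);
	/* Nested mode is reported only when first-level translation is off. */
	nesting = (dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE) &&
		  !(dmar_domain->flags & DOMAIN_FLAG_USE_FIRST_LEVEL);
	spin_unlock_irqrestore(&device_domain_lock, flags);
	return nesting;
}
```

The local `flags` holds the saved interrupt state and is unrelated to `dmar_domain->flags`; the two only share a name.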

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 32390 bytes --]

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-12-03 12:27   ` Zhangfei Gao
  -1 siblings, 0 replies; 116+ messages in thread
From: Zhangfei Gao @ 2021-12-03 12:27 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, lushenming, vsethi


Hi, Eric

On 2021/10/27 6:44 PM, Eric Auger wrote:
> This series brings the IOMMU part of HW nested paging support
> in the SMMUv3.
>
> The SMMUv3 driver is adapted to support 2 nested stages.
>
> The IOMMU API is extended to convey the guest stage 1
> configuration and the hook is implemented in the SMMUv3 driver.
>
> This allows the guest to own the stage 1 tables and context
> descriptors (so-called PASID table) while the host owns the
> stage 2 tables and main configuration structures (STE).
>
> This work mainly is provided for test purpose as the upper
> layer integration is under rework and bound to be based on
> /dev/iommu instead of VFIO tunneling. In this version we also get
> rid of the MSI BINDING ioctl, assuming the guest enforces
> flat mapping of host IOVAs used to bind physical MSI doorbells.
> In the current QEMU integration this is achieved by exposing
> RMRs to the guest, using Shameer's series [1]. This approach
> is RFC as the IORT spec is not really meant to do that
> (single mapping flag limitation).
>
> Best Regards
>
> Eric
>
> This series (Host) can be found at:
> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
> This includes a rebased VFIO integration (although not meant
> to be upstreamed)
>
> Guest kernel branch can be found at:
> https://github.com/eauger/linux/tree/shameer_rmrr_v7
> featuring [1]
>
> QEMU integration (still based on VFIO and exposing RMRs)
> can be found at:
> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
> (use iommu=nested-smmuv3 ARM virt option)
>
> Guest dependency:
> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node

Thanks a lot for upgrading these patches.

I have run basic verification of these patches on HiSilicon Kunpeng920
and integrated them into these branches:
https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.16
https://github.com/Linaro/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10

Although they are provided for test purposes,

Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>

Thanks

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-10-27 10:44 ` Eric Auger
  (?)
@ 2021-12-03 13:13   ` Sumit Gupta via iommu
  -1 siblings, 0 replies; 116+ messages in thread
From: Sumit Gupta @ 2021-12-03 13:13 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming, vsethi,
	Sachin Nikam, Sumit Gupta, Pritesh Raithatha

Hi Eric,

> This series brings the IOMMU part of HW nested paging support
> in the SMMUv3.
>
> The SMMUv3 driver is adapted to support 2 nested stages.
>
> The IOMMU API is extended to convey the guest stage 1
> configuration and the hook is implemented in the SMMUv3 driver.
>
> This allows the guest to own the stage 1 tables and context
> descriptors (so-called PASID table) while the host owns the
> stage 2 tables and main configuration structures (STE).
>
> This work mainly is provided for test purpose as the upper
> layer integration is under rework and bound to be based on
> /dev/iommu instead of VFIO tunneling. In this version we also get
> rid of the MSI BINDING ioctl, assuming the guest enforces
> flat mapping of host IOVAs used to bind physical MSI doorbells.
> In the current QEMU integration this is achieved by exposing
> RMRs to the guest, using Shameer's series [1]. This approach
> is RFC as the IORT spec is not really meant to do that
> (single mapping flag limitation).
>
> Best Regards
>
> Eric
>
> This series (Host) can be found at:
> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
> This includes a rebased VFIO integration (although not meant
> to be upstreamed)
>
> Guest kernel branch can be found at:
> https://github.com/eauger/linux/tree/shameer_rmrr_v7
> featuring [1]
>
> QEMU integration (still based on VFIO and exposing RMRs)
> can be found at:
> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
> (use iommu=nested-smmuv3 ARM virt option)
>
> Guest dependency:
> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
>
> History:
>
> v15 -> v16:
> - guest RIL must support RIL
> - additional checks in the cache invalidation hook
> - removal of the MSI BINDING ioctl (tentative replacement
>    by RMRs)
>
>
> Eric Auger (9):
>    iommu: Introduce attach/detach_pasid_table API
>    iommu: Introduce iommu_get_nesting
>    iommu/smmuv3: Allow s1 and s2 configs to coexist
>    iommu/smmuv3: Get prepared for nested stage support
>    iommu/smmuv3: Implement attach/detach_pasid_table
>    iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
>    iommu/smmuv3: Implement cache_invalidate
>    iommu/smmuv3: report additional recoverable faults
>    iommu/smmuv3: Disallow nested mode in presence of HW MSI regions
Hi Eric,

I validated the reworked test patches in v16 from the given
branches with kernel v5.15 and QEMU v6.2, and verified them with
an NVMe PCI device assigned to a guest VM.
Sorry, I forgot to report this earlier.

Tested-by: Sumit Gupta <sumitg@nvidia.com>

Thanks,
Sumit Gupta


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-10-27 10:44   ` Eric Auger
  (?)
@ 2021-12-06 10:48     ` Joerg Roedel
  -1 siblings, 0 replies; 116+ messages in thread
From: Joerg Roedel @ 2021-12-06 10:48 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, iommu, linux-kernel, kvm, kvmarm, will,
	robin.murphy, jean-philippe, zhukeqian1, alex.williamson,
	jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj, maz,
	peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

This Signed-off-by chain looks dubious, you are the author but the last
one in the chain?

> +int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
> +				  void __user *uinfo)
> +{

[...]

> +	if (pasid_table_data.format == IOMMU_PASID_FORMAT_SMMUV3 &&
> +	    pasid_table_data.argsz <
> +		offsetofend(struct iommu_pasid_table_config, vendor_data.smmuv3))
> +		return -EINVAL;

This check looks like it belongs in driver specific code.

Regards,

	Joerg
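
As an illustration of the second comment, the format-specific size check could live in a helper owned by the SMMUv3 driver rather than in the generic uapi entry point. This is only a sketch under stated assumptions: the structures are simplified userspace mock-ups (not the uapi layout from the series), `smmuv3_validate_pasid_cfg` is a hypothetical name, and `offsetofend` is redefined locally to mirror the kernel helper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local equivalent of the kernel's offsetofend() helper. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

#define IOMMU_PASID_FORMAT_SMMUV3	1

/* Simplified mock-ups (assumptions), not the real uapi structures. */
struct pasid_smmuv3 {
	uint32_t version;
	uint8_t  s1fmt;
	uint8_t  s1dss;
	uint8_t  padding[2];
};

struct pasid_table_config {
	uint32_t argsz;		/* size of the data userspace filled in */
	uint32_t format;
	union {
		struct pasid_smmuv3 smmuv3;
	} vendor_data;
};

/* Hypothetical driver-side check: the driver knows its own vendor data. */
static int smmuv3_validate_pasid_cfg(const struct pasid_table_config *cfg)
{
	if (cfg->format != IOMMU_PASID_FORMAT_SMMUV3)
		return -EINVAL;
	/* Userspace must have supplied at least the SMMUv3 vendor data. */
	if (cfg->argsz < offsetofend(struct pasid_table_config,
				     vendor_data.smmuv3))
		return -EINVAL;
	return 0;
}
```

With this split, the generic code would only need to validate the common header fields and could leave anything under `vendor_data` to the driver callback.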


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-06 10:48     ` Joerg Roedel
  (?)
@ 2021-12-07 10:22       ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-07 10:22 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: eric.auger.pro, iommu, linux-kernel, kvm, kvmarm, will,
	robin.murphy, jean-philippe, zhukeqian1, alex.williamson,
	jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj, maz,
	peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming,
	vsethi

Hi Joerg,

On 12/6/21 11:48 AM, Joerg Roedel wrote:
> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
>> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> This Signed-off-by chain looks dubious, you are the author but the last
> one in the chain?
The 1st RFC in Aug 2018
(https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
said this was a generalization of Jacob's patch


  [PATCH v5 01/23] iommu: introduce bind_pasid_table API function


  https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html

So indeed Jacob should be the author. I guess the authorship got
replaced at some point across the multiple rebases, which is not an
excuse. Please forgive me for that.
Now the original patch already had this list of SoBs, so I don't know
whether I should simplify it.


>
>> +int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
>> +				  void __user *uinfo)
>> +{
> [...]
>
>> +	if (pasid_table_data.format == IOMMU_PASID_FORMAT_SMMUV3 &&
>> +	    pasid_table_data.argsz <
>> +		offsetofend(struct iommu_pasid_table_config, vendor_data.smmuv3))
>> +		return -EINVAL;
> This check looks like it belongs in driver specific code.
Indeed, I will fix that in my next respin :-)

Thanks!

Eric
>
> Regards,
>
> 	Joerg
>


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
@ 2021-12-07 10:22       ` Eric Auger
  0 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-07 10:22 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: kvm, vivek.gautam, vdumpa, kvmarm, eric.auger.pro, jean-philippe,
	yi.l.liu, ashok.raj, maz, vsethi, nicolinc, zhangfei.gao, sumitg,
	kevin.tian, jacob.jun.pan, will, nicoleotsuka, alex.williamson,
	wangxingang5, chenxiang66, linux-kernel, lushenming, iommu,
	robin.murphy

Hi Joerg,

On 12/6/21 11:48 AM, Joerg Roedel wrote:
> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
>> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> This Signed-of-by chain looks dubious, you are the author but the last
> one in the chain?
The 1st RFC in Aug 2018
(https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
said this was a generalization of Jacob's patch


  [PATCH v5 01/23] iommu: introduce bind_pasid_table API function


  https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html

So indeed Jacob should be the author. I guess the multiple rebases got
this eventually replaced at some point, which is not an excuse. Please
forgive me for that.
Now the original patch already had this list of SoB so I don't know if I
shall simplify it.


>
>> +int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
>> +				  void __user *uinfo)
>> +{
> [...]
>
>> +	if (pasid_table_data.format == IOMMU_PASID_FORMAT_SMMUV3 &&
>> +	    pasid_table_data.argsz <
>> +		offsetofend(struct iommu_pasid_table_config, vendor_data.smmuv3))
>> +		return -EINVAL;
> This check looks like it belongs in driver specific code.
Indeed, I will fix that in my next respin :-)

Thanks!

Eric
>
> Regards,
>
> 	Joerg
>

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-12-03 12:27   ` Zhangfei Gao
  (?)
@ 2021-12-07 10:27     ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-07 10:27 UTC (permalink / raw)
  To: Zhangfei Gao, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, lushenming, vsethi

Hi Zhangfei,

On 12/3/21 1:27 PM, Zhangfei Gao wrote:
>
> Hi, Eric
>
> On 2021/10/27 6:44 PM, Eric Auger wrote:
>> This series brings the IOMMU part of HW nested paging support
>> in the SMMUv3.
>>
>> The SMMUv3 driver is adapted to support 2 nested stages.
>>
>> The IOMMU API is extended to convey the guest stage 1
>> configuration and the hook is implemented in the SMMUv3 driver.
>>
>> This allows the guest to own the stage 1 tables and context
>> descriptors (so-called PASID table) while the host owns the
>> stage 2 tables and main configuration structures (STE).
>>
>> This work mainly is provided for test purpose as the upper
>> layer integration is under rework and bound to be based on
>> /dev/iommu instead of VFIO tunneling. In this version we also get
>> rid of the MSI BINDING ioctl, assuming the guest enforces
>> flat mapping of host IOVAs used to bind physical MSI doorbells.
>> In the current QEMU integration this is achieved by exposing
>> RMRs to the guest, using Shameer's series [1]. This approach
>> is RFC as the IORT spec is not really meant to do that
>> (single mapping flag limitation).
>>
>> Best Regards
>>
>> Eric
>>
>> This series (Host) can be found at:
>> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
>> This includes a rebased VFIO integration (although not meant
>> to be upstreamed)
>>
>> Guest kernel branch can be found at:
>> https://github.com/eauger/linux/tree/shameer_rmrr_v7
>> featuring [1]
>>
>> QEMU integration (still based on VFIO and exposing RMRs)
>> can be found at:
>> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>> (use iommu=nested-smmuv3 ARM virt option)
>>
>> Guest dependency:
>> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
>
> Thanks a lot for upgrading these patches.
>
> I have basically verified these patches on HiSilicon Kunpeng920.
> And integrated them to these branches.
> https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.16
> https://github.com/Linaro/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>
> Though they are provided for test purpose,
>
> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>

Thank you very much. As you mentioned, until we have the /dev/iommu
integration, this is maintained for testing purposes only. The SMMU
changes shouldn't be much impacted though.
The added value of this respin was to propose an MSI binding solution
based on RMRRs, which simplifies things at the kernel level.

Thanks!

Eric
>
> Thanks
>


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-12-03 13:13   ` Sumit Gupta via iommu
  (?)
@ 2021-12-07 10:28     ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-07 10:28 UTC (permalink / raw)
  To: Sumit Gupta, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	nicolinc, vdumpa, zhangfei.gao, zhangfei.gao, lushenming, vsethi,
	Sachin Nikam, Pritesh Raithatha

Hi Sumit,

On 12/3/21 2:13 PM, Sumit Gupta wrote:
> Hi Eric,
>
>> This series brings the IOMMU part of HW nested paging support
>> in the SMMUv3.
>>
>> The SMMUv3 driver is adapted to support 2 nested stages.
>>
>> The IOMMU API is extended to convey the guest stage 1
>> configuration and the hook is implemented in the SMMUv3 driver.
>>
>> This allows the guest to own the stage 1 tables and context
>> descriptors (so-called PASID table) while the host owns the
>> stage 2 tables and main configuration structures (STE).
>>
>> This work mainly is provided for test purpose as the upper
>> layer integration is under rework and bound to be based on
>> /dev/iommu instead of VFIO tunneling. In this version we also get
>> rid of the MSI BINDING ioctl, assuming the guest enforces
>> flat mapping of host IOVAs used to bind physical MSI doorbells.
>> In the current QEMU integration this is achieved by exposing
>> RMRs to the guest, using Shameer's series [1]. This approach
>> is RFC as the IORT spec is not really meant to do that
>> (single mapping flag limitation).
>>
>> Best Regards
>>
>> Eric
>>
>> This series (Host) can be found at:
>> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
>> This includes a rebased VFIO integration (although not meant
>> to be upstreamed)
>>
>> Guest kernel branch can be found at:
>> https://github.com/eauger/linux/tree/shameer_rmrr_v7
>> featuring [1]
>>
>> QEMU integration (still based on VFIO and exposing RMRs)
>> can be found at:
>> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>> (use iommu=nested-smmuv3 ARM virt option)
>>
>> Guest dependency:
>> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
>>
>> History:
>>
>> v15 -> v16:
>> - guest RIL must support RIL
>> - additional checks in the cache invalidation hook
>> - removal of the MSI BINDING ioctl (tentative replacement
>>    by RMRs)
>>
>>
>> Eric Auger (9):
>>    iommu: Introduce attach/detach_pasid_table API
>>    iommu: Introduce iommu_get_nesting
>>    iommu/smmuv3: Allow s1 and s2 configs to coexist
>>    iommu/smmuv3: Get prepared for nested stage support
>>    iommu/smmuv3: Implement attach/detach_pasid_table
>>    iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
>>    iommu/smmuv3: Implement cache_invalidate
>>    iommu/smmuv3: report additional recoverable faults
>>    iommu/smmuv3: Disallow nested mode in presence of HW MSI regions
> Hi Eric,
>
> I validated the reworked test patches in v16 from the given
> branches with Kernel v5.15 & Qemu v6.2. Verified them with
> NVMe PCI device assigned to Guest VM.
> Sorry, forgot to update earlier.
>
> Tested-by: Sumit Gupta <sumitg@nvidia.com>

Thank you very much!

Best Regards

Eric
>
> Thanks,
> Sumit Gupta
>


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-12-07 10:27     ` Eric Auger
  (?)
@ 2021-12-07 10:35       ` Zhangfei Gao
  -1 siblings, 0 replies; 116+ messages in thread
From: Zhangfei Gao @ 2021-12-07 10:35 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, lushenming, vsethi



On 2021/12/7 6:27 PM, Eric Auger wrote:
> Hi Zhangfei,
>
> On 12/3/21 1:27 PM, Zhangfei Gao wrote:
>> Hi, Eric
>>
>> On 2021/10/27 6:44 PM, Eric Auger wrote:
>>> This series brings the IOMMU part of HW nested paging support
>>> in the SMMUv3.
>>>
>>> The SMMUv3 driver is adapted to support 2 nested stages.
>>>
>>> The IOMMU API is extended to convey the guest stage 1
>>> configuration and the hook is implemented in the SMMUv3 driver.
>>>
>>> This allows the guest to own the stage 1 tables and context
>>> descriptors (so-called PASID table) while the host owns the
>>> stage 2 tables and main configuration structures (STE).
>>>
>>> This work mainly is provided for test purpose as the upper
>>> layer integration is under rework and bound to be based on
>>> /dev/iommu instead of VFIO tunneling. In this version we also get
>>> rid of the MSI BINDING ioctl, assuming the guest enforces
>>> flat mapping of host IOVAs used to bind physical MSI doorbells.
>>> In the current QEMU integration this is achieved by exposing
>>> RMRs to the guest, using Shameer's series [1]. This approach
>>> is RFC as the IORT spec is not really meant to do that
>>> (single mapping flag limitation).
>>>
>>> Best Regards
>>>
>>> Eric
>>>
>>> This series (Host) can be found at:
>>> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
>>> This includes a rebased VFIO integration (although not meant
>>> to be upstreamed)
>>>
>>> Guest kernel branch can be found at:
>>> https://github.com/eauger/linux/tree/shameer_rmrr_v7
>>> featuring [1]
>>>
>>> QEMU integration (still based on VFIO and exposing RMRs)
>>> can be found at:
>>> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>>> (use iommu=nested-smmuv3 ARM virt option)
>>>
>>> Guest dependency:
>>> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
>> Thanks a lot for upgrading these patches.
>>
>> I have basically verified these patches on HiSilicon Kunpeng920.
>> And integrated them to these branches.
>> https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.16
>> https://github.com/Linaro/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>>
>> Though they are provided for test purpose,
>>
>> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> Thank you very much. As you mentioned, until we do not have the
> /dev/iommu integration this is maintained for testing purpose. The SMMU
> changes shouldn't be much impacted though.
> The added value of this respin was to propose an MSI binding solution
> based on RMRRs which simplify things at kernel level.

The current RMRR solution requires UEFI to be enabled, and
QEMU_EFI.fd has to be provided to start QEMU.

Is there any plan to support DTB as well? That would be simpler,
since QEMU_EFI.fd would no longer be needed.
Thanks



^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
@ 2021-12-07 10:35       ` Zhangfei Gao
  0 siblings, 0 replies; 116+ messages in thread
From: Zhangfei Gao @ 2021-12-07 10:35 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: peter.maydell, kevin.tian, ashok.raj, maz, vivek.gautam,
	alex.williamson, vsethi, lushenming, wangxingang5



On 2021/12/7 下午6:27, Eric Auger wrote:
> Hi Zhangfei,
>
> On 12/3/21 1:27 PM, Zhangfei Gao wrote:
>> Hi, Eric
>>
>> On 2021/10/27 下午6:44, Eric Auger wrote:
>>> This series brings the IOMMU part of HW nested paging support
>>> in the SMMUv3.
>>>
>>> The SMMUv3 driver is adapted to support 2 nested stages.
>>>
>>> The IOMMU API is extended to convey the guest stage 1
>>> configuration and the hook is implemented in the SMMUv3 driver.
>>>
>>> This allows the guest to own the stage 1 tables and context
>>> descriptors (so-called PASID table) while the host owns the
>>> stage 2 tables and main configuration structures (STE).
>>>
>>> This work mainly is provided for test purpose as the upper
>>> layer integration is under rework and bound to be based on
>>> /dev/iommu instead of VFIO tunneling. In this version we also get
>>> rid of the MSI BINDING ioctl, assuming the guest enforces
>>> flat mapping of host IOVAs used to bind physical MSI doorbells.
>>> In the current QEMU integration this is achieved by exposing
>>> RMRs to the guest, using Shameer's series [1]. This approach
>>> is RFC as the IORT spec is not really meant to do that
>>> (single mapping flag limitation).
>>>
>>> Best Regards
>>>
>>> Eric
>>>
>>> This series (Host) can be found at:
>>> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
>>> This includes a rebased VFIO integration (although not meant
>>> to be upstreamed)
>>>
>>> Guest kernel branch can be found at:
>>> https://github.com/eauger/linux/tree/shameer_rmrr_v7
>>> featuring [1]
>>>
>>> QEMU integration (still based on VFIO and exposing RMRs)
>>> can be found at:
>>> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>>> (use iommu=nested-smmuv3 ARM virt option)
>>>
>>> Guest dependency:
>>> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
>> Thanks a lot for upgrading these patches.
>>
>> I have basically verified these patches on HiSilicon Kunpeng920.
>> And integrated them to these branches.
>> https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.16
>> https://github.com/Linaro/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>>
>> Though they are provided for test purpose,
>>
>> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> Thank you very much. As you mentioned, until we do not have the
> /dev/iommu integration this is maintained for testing purpose. The SMMU
> changes shouldn't be much impacted though.
> The added value of this respin was to propose an MSI binding solution
> based on RMRRs which simplify things at kernel level.

Current RMRR solution requires uefi enabled,
and QEMU_EFI.fd  has to be provided to start qemu.

Any plan to support dtb as well, which will be simpler since no need 
QEMU_EFI.fd anymore.

Thanks


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-12-07 10:35       ` Zhangfei Gao
  (?)
@ 2021-12-07 11:06         ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-07 11:06 UTC (permalink / raw)
  To: Zhangfei Gao, eric.auger.pro, iommu, linux-kernel, kvm, kvmarm,
	joro, will, robin.murphy, jean-philippe, zhukeqian1
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, shameerali.kolothum.thodi,
	wangxingang5, jiangkunkun, yuzenghui, nicoleotsuka, chenxiang66,
	sumitg, nicolinc, vdumpa, zhangfei.gao, lushenming, vsethi

Hi Zhangfei,

On 12/7/21 11:35 AM, Zhangfei Gao wrote:
>
>
> On 2021/12/7 6:27 PM, Eric Auger wrote:
>> Hi Zhangfei,
>>
>> On 12/3/21 1:27 PM, Zhangfei Gao wrote:
>>> Hi, Eric
>>>
>>> On 2021/10/27 6:44 PM, Eric Auger wrote:
>>>> This series brings the IOMMU part of HW nested paging support
>>>> in the SMMUv3.
>>>>
>>>> The SMMUv3 driver is adapted to support 2 nested stages.
>>>>
>>>> The IOMMU API is extended to convey the guest stage 1
>>>> configuration and the hook is implemented in the SMMUv3 driver.
>>>>
>>>> This allows the guest to own the stage 1 tables and context
>>>> descriptors (so-called PASID table) while the host owns the
>>>> stage 2 tables and main configuration structures (STE).
>>>>
>>>> This work mainly is provided for test purpose as the upper
>>>> layer integration is under rework and bound to be based on
>>>> /dev/iommu instead of VFIO tunneling. In this version we also get
>>>> rid of the MSI BINDING ioctl, assuming the guest enforces
>>>> flat mapping of host IOVAs used to bind physical MSI doorbells.
>>>> In the current QEMU integration this is achieved by exposing
>>>> RMRs to the guest, using Shameer's series [1]. This approach
>>>> is RFC as the IORT spec is not really meant to do that
>>>> (single mapping flag limitation).
>>>>
>>>> Best Regards
>>>>
>>>> Eric
>>>>
>>>> This series (Host) can be found at:
>>>> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
>>>> This includes a rebased VFIO integration (although not meant
>>>> to be upstreamed)
>>>>
>>>> Guest kernel branch can be found at:
>>>> https://github.com/eauger/linux/tree/shameer_rmrr_v7
>>>> featuring [1]
>>>>
>>>> QEMU integration (still based on VFIO and exposing RMRs)
>>>> can be found at:
>>>> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>>>> (use iommu=nested-smmuv3 ARM virt option)
>>>>
>>>> Guest dependency:
>>>> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
>>> Thanks a lot for upgrading these patches.
>>>
>>> I have basically verified these patches on HiSilicon Kunpeng920.
>>> And integrated them to these branches.
>>> https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.16
>>> https://github.com/Linaro/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
>>>
>>> Though they are provided for test purpose,
>>>
>>> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
>> Thank you very much. As you mentioned, until we do not have the
>> /dev/iommu integration this is maintained for testing purpose. The SMMU
>> changes shouldn't be much impacted though.
>> The added value of this respin was to propose an MSI binding solution
>> based on RMRRs which simplify things at kernel level.
>
> Current RMRR solution requires uefi enabled,
> and QEMU_EFI.fd  has to be provided to start qemu.
>
> Any plan to support dtb as well, which will be simpler since no need
> QEMU_EFI.fd anymore.
Yes, the solution is based on ACPI IORT nodes. I don't know whether any
DT integration is in progress. Shameer?

Thanks

Eric
>
> Thanks
>
>


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-07 10:22       ` Eric Auger
  (?)
@ 2021-12-08  2:44         ` Lu Baolu
  -1 siblings, 0 replies; 116+ messages in thread
From: Lu Baolu @ 2021-12-08  2:44 UTC (permalink / raw)
  To: eric.auger, Joerg Roedel
  Cc: baolu.lu, peter.maydell, kvm, vivek.gautam, kvmarm,
	eric.auger.pro, jean-philippe, ashok.raj, maz, vsethi,
	zhangfei.gao, kevin.tian, will, alex.williamson, wangxingang5,
	linux-kernel, lushenming, iommu, robin.murphy, Jason Gunthorpe

Hi Eric,

On 12/7/21 6:22 PM, Eric Auger wrote:
> On 12/6/21 11:48 AM, Joerg Roedel wrote:
>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
>> This Signed-of-by chain looks dubious, you are the author but the last
>> one in the chain?
> The 1st RFC in Aug 2018
> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
> said this was a generalization of Jacob's patch
> 
> 
>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
> 
> 
>    https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
> 
> So indeed Jacob should be the author. I guess the multiple rebases got
> this eventually replaced at some point, which is not an excuse. Please
> forgive me for that.
> Now the original patch already had this list of SoB so I don't know if I
> shall simplify it.

As we have decided to move the nested mode (dual stages) implementation
onto the developing iommufd framework, what's the value of adding this
into iommu core?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08  2:44         ` Lu Baolu
  (?)
@ 2021-12-08  7:33           ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-08  7:33 UTC (permalink / raw)
  To: Lu Baolu, Joerg Roedel
  Cc: peter.maydell, kvm, vivek.gautam, kvmarm, eric.auger.pro,
	jean-philippe, ashok.raj, maz, vsethi, zhangfei.gao, kevin.tian,
	will, alex.williamson, wangxingang5, linux-kernel, lushenming,
	iommu, robin.murphy, Jason Gunthorpe

Hi Baolu,

On 12/8/21 3:44 AM, Lu Baolu wrote:
> Hi Eric,
>
> On 12/7/21 6:22 PM, Eric Auger wrote:
>> On 12/6/21 11:48 AM, Joerg Roedel wrote:
>>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
>>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
>>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
>>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
>>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
>>> This Signed-of-by chain looks dubious, you are the author but the last
>>> one in the chain?
>> The 1st RFC in Aug 2018
>> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
>> said this was a generalization of Jacob's patch
>>
>>
>>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
>>
>>
>>   
>> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
>>
>> So indeed Jacob should be the author. I guess the multiple rebases got
>> this eventually replaced at some point, which is not an excuse. Please
>> forgive me for that.
>> Now the original patch already had this list of SoB so I don't know if I
>> shall simplify it.
>
> As we have decided to move the nested mode (dual stages) implementation
> onto the developing iommufd framework, what's the value of adding this
> into iommu core?

The iommu_uapi_attach_pasid_table uapi should indeed disappear, as it is
bound to be replaced by its /dev/iommu counterpart.
However, until I can rebase on the /dev/iommu code I am obliged to keep it
to maintain this integration, hence the RFC.

Thanks

Eric
>
> Best regards,
> baolu
>


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08  7:33           ` Eric Auger
  (?)
@ 2021-12-08 12:56             ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2021-12-08 12:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: Lu Baolu, Joerg Roedel, peter.maydell, kvm, vivek.gautam, kvmarm,
	eric.auger.pro, jean-philippe, ashok.raj, maz, vsethi,
	zhangfei.gao, kevin.tian, will, alex.williamson, wangxingang5,
	linux-kernel, lushenming, iommu, robin.murphy

On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
> Hi Baolu,
> 
> On 12/8/21 3:44 AM, Lu Baolu wrote:
> > Hi Eric,
> >
> > On 12/7/21 6:22 PM, Eric Auger wrote:
> >> On 12/6/21 11:48 AM, Joerg Roedel wrote:
> >>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
> >>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
> >>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
> >>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
> >>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
> >>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
> >>> This Signed-of-by chain looks dubious, you are the author but the last
> >>> one in the chain?
> >> The 1st RFC in Aug 2018
> >> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
> >> said this was a generalization of Jacob's patch
> >>
> >>
> >>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
> >>
> >>
> >>   
> >> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
> >>
> >> So indeed Jacob should be the author. I guess the multiple rebases got
> >> this eventually replaced at some point, which is not an excuse. Please
> >> forgive me for that.
> >> Now the original patch already had this list of SoB so I don't know if I
> >> shall simplify it.
> >
> > As we have decided to move the nested mode (dual stages) implementation
> > onto the developing iommufd framework, what's the value of adding this
> > into iommu core?
> 
> The iommu_uapi_attach_pasid_table uapi should disappear indeed as it is
> is bound to be replaced by /dev/iommu fellow API.
> However until I can rebase on /dev/iommu code I am obliged to keep it to
> maintain this integration, hence the RFC.

Indeed, we are getting pretty close to having the base iommufd that we
can start adding stuff like this into. Maybe in January, you can look
at some parts of what is evolving here:

https://github.com/jgunthorpe/linux/commits/iommufd
https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2
https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2

From a progress perspective I would like to start with simple 'page
tables in userspace', ie no PASID in this step.

'page tables in userspace' means an iommufd ioctl to create an
iommu_domain where the IOMMU HW is directly traversing a
device-specific page table structure in user space memory. All the HW
today implements this by using another iommu_domain to allow the IOMMU
HW DMA access to user memory - ie nesting or multi-stage or whatever.

This would come along with some ioctls to invalidate the IOTLB.

I'm imagining this step as an iommu_group->op->create_user_domain()
driver callback which will create a new kind of domain with
domain-unique ops. I.e. the map/unmap related ops should all be NULL, as
those are impossible operations.
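
As a rough illustration of the idea above, here is a standalone
user-space sketch. It is not kernel code from any tree: every name in it
(sketch_domain, sketch_domain_ops, sketch_create_user_domain and so on)
is hypothetical, and it only models the one property described here,
namely that a user-owned domain carries its own ops with map/unmap left
NULL, sits on top of a kernel-owned parent domain, and is maintained
through invalidation only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel iommu API. */
struct sketch_domain;

struct sketch_domain_ops {
	/* NULL for user-owned page tables: the kernel cannot edit them. */
	int (*map)(struct sketch_domain *d, uint64_t iova, uint64_t pa, size_t len);
	int (*unmap)(struct sketch_domain *d, uint64_t iova, size_t len);
	/* User domains are kept coherent by explicit IOTLB invalidation. */
	int (*invalidate)(struct sketch_domain *d, uint64_t iova, size_t len);
};

struct sketch_domain {
	const struct sketch_domain_ops *ops;
	struct sketch_domain *parent; /* domain giving the HW access to user memory */
	uint64_t user_pgtable;        /* user address of the device-specific table */
	unsigned int mappings;        /* bookkeeping for the kernel-owned case */
	unsigned int invalidations;   /* bookkeeping for the user-owned case */
};

static int kernel_map(struct sketch_domain *d, uint64_t iova, uint64_t pa, size_t len)
{
	(void)iova; (void)pa; (void)len;
	d->mappings++;
	return 0;
}

static int kernel_unmap(struct sketch_domain *d, uint64_t iova, size_t len)
{
	(void)iova; (void)len;
	if (d->mappings == 0)
		return -ENOENT;
	d->mappings--;
	return 0;
}

static int user_invalidate(struct sketch_domain *d, uint64_t iova, size_t len)
{
	(void)iova; (void)len;
	d->invalidations++;
	return 0;
}

static const struct sketch_domain_ops kernel_ops = {
	.map = kernel_map, .unmap = kernel_unmap, .invalidate = NULL,
};

/* Domain-unique ops: map/unmap are impossible, so they stay NULL. */
static const struct sketch_domain_ops user_ops = {
	.map = NULL, .unmap = NULL, .invalidate = user_invalidate,
};

void sketch_init_kernel_domain(struct sketch_domain *d)
{
	assert(d != NULL);
	d->ops = &kernel_ops;
	d->parent = NULL;
	d->user_pgtable = 0;
	d->mappings = 0;
	d->invalidations = 0;
}

/* Models the create_user_domain() step: needs a kernel-owned parent. */
int sketch_create_user_domain(struct sketch_domain *out,
			      struct sketch_domain *parent,
			      uint64_t user_pgtable)
{
	assert(out != NULL);
	if (parent == NULL || parent->ops->map == NULL)
		return -EINVAL;
	out->ops = &user_ops;
	out->parent = parent;
	out->user_pgtable = user_pgtable;
	out->mappings = 0;
	out->invalidations = 0;
	return 0;
}

/* Generic entry points reject operations the domain does not provide. */
int sketch_map(struct sketch_domain *d, uint64_t iova, uint64_t pa, size_t len)
{
	if (d->ops->map == NULL)
		return -EOPNOTSUPP;
	return d->ops->map(d, iova, pa, len);
}

int sketch_invalidate(struct sketch_domain *d, uint64_t iova, size_t len)
{
	if (d->ops->invalidate == NULL)
		return -EOPNOTSUPP;
	return d->ops->invalidate(d, iova, len);
}
```

In this sketch, calling sketch_map() on a domain made by
sketch_create_user_domain() returns -EOPNOTSUPP, while
sketch_invalidate() succeeds; a kernel-owned domain behaves the other
way round. Requiring a kernel-owned parent is an assumption of the
sketch that mirrors the "another iommu_domain" nesting described above.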

From there the usual struct device (ie RID) attach/detach stuff needs
to take care of routing DMAs to this iommu_domain.

Step two would be to add the ability for an iommufd using driver to
request that a RID&PASID is connected to an iommu_domain. This
connection can be requested for any kind of iommu_domain, kernel owned
or user owned.

I don't quite have an answer how exactly the SMMUv3 vs Intel
difference in PASID routing should be resolved.

To get answers I'm hoping to start building some sketch RFCs for these
different things on iommufd, hopefully in January. I'm looking at user
page tables, PASID, dirty tracking and userspace IO fault handling as
the main features iommufd must tackle.

The purpose of the sketches would be to validate that the HW features
we want to expose can work well with the choices the base is making.

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
@ 2021-12-08 12:56             ` Jason Gunthorpe via iommu
  0 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe via iommu @ 2021-12-08 12:56 UTC (permalink / raw)
  To: Eric Auger
  Cc: peter.maydell, kevin.tian, lushenming, robin.murphy, ashok.raj,
	kvm, jean-philippe, maz, linux-kernel, iommu, vsethi,
	vivek.gautam, alex.williamson, wangxingang5, zhangfei.gao,
	eric.auger.pro, will, kvmarm

On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
> Hi Baolu,
> 
> On 12/8/21 3:44 AM, Lu Baolu wrote:
> > Hi Eric,
> >
> > On 12/7/21 6:22 PM, Eric Auger wrote:
> >> On 12/6/21 11:48 AM, Joerg Roedel wrote:
> >>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
> >>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
> >>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
> >>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
> >>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
> >>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
> >>> This Signed-of-by chain looks dubious, you are the author but the last
> >>> one in the chain?
> >> The 1st RFC in Aug 2018
> >> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
> >> said this was a generalization of Jacob's patch
> >>
> >>
> >>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
> >>
> >>
> >>   
> >> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
> >>
> >> So indeed Jacob should be the author. I guess the multiple rebases got
> >> this eventually replaced at some point, which is not an excuse. Please
> >> forgive me for that.
> >> Now the original patch already had this list of SoB so I don't know if I
> >> shall simplify it.
> >
> > As we have decided to move the nested mode (dual stages) implementation
> > onto the developing iommufd framework, what's the value of adding this
> > into iommu core?
> 
> The iommu_uapi_attach_pasid_table uapi should disappear indeed as it
> is bound to be replaced by /dev/iommu fellow API.
> However until I can rebase on /dev/iommu code I am obliged to keep it to
> maintain this integration, hence the RFC.

Indeed, we are getting pretty close to having the base iommufd that we
can start adding stuff like this into. Maybe in January, you can look
at some parts of what is evolving here:

https://github.com/jgunthorpe/linux/commits/iommufd
https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2
https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2

From a progress perspective I would like to start with simple 'page
tables in userspace', ie no PASID in this step.

'page tables in userspace' means an iommufd ioctl to create an
iommu_domain where the IOMMU HW is directly traversing a
device-specific page table structure in user space memory. All the HW
today implements this by using another iommu_domain to allow the IOMMU
HW DMA access to user memory - ie nesting or multi-stage or whatever.

This would come along with some ioctls to invalidate the IOTLB.

I'm imagining this step as an iommu_group->op->create_user_domain()
driver callback which will create a new kind of domain with
domain-unique ops. Ie map/unmap related should all be NULL as those
are impossible operations.
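
A minimal sketch of that shape, in plain userspace C rather than kernel
code. Every name below (toy_domain, toy_domain_ops, toy_create_user_domain,
toy_map, toy_invalidate) is hypothetical; none of it is the kernel's
struct iommu_ops or an existing iommufd interface. It only illustrates a
domain type whose ops table leaves map/unmap NULL, so the generic entry
points have to refuse those operations, while invalidation stays available:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_domain;

/* Hypothetical per-domain ops table; NOT the kernel's struct iommu_ops. */
struct toy_domain_ops {
	int (*map)(struct toy_domain *d, uint64_t iova, uint64_t pa, size_t len);
	int (*unmap)(struct toy_domain *d, uint64_t iova, size_t len);
	int (*invalidate)(struct toy_domain *d, uint64_t iova, size_t len);
};

struct toy_domain {
	const struct toy_domain_ops *ops;
	uint64_t user_pgd;          /* root of the user-owned page table */
	unsigned int invalidations; /* number of IOTLB invalidation requests */
};

static int toy_user_invalidate(struct toy_domain *d, uint64_t iova, size_t len)
{
	(void)iova;
	(void)len;
	d->invalidations++;
	return 0;
}

/* User domains: the kernel never edits the table, so map/unmap stay NULL. */
static const struct toy_domain_ops toy_user_domain_ops = {
	.map = NULL,
	.unmap = NULL,
	.invalidate = toy_user_invalidate,
};

/* Stand-in for the imagined create_user_domain() driver callback. */
void toy_create_user_domain(struct toy_domain *d, uint64_t user_pgd)
{
	assert(d != NULL);
	d->ops = &toy_user_domain_ops;
	d->user_pgd = user_pgd;
	d->invalidations = 0;
}

/* Generic entry point: rejects map on domains that do not implement it. */
int toy_map(struct toy_domain *d, uint64_t iova, uint64_t pa, size_t len)
{
	if (!d->ops->map)
		return -EOPNOTSUPP;
	return d->ops->map(d, iova, pa, len);
}

/* Generic entry point for the invalidation path. */
int toy_invalidate(struct toy_domain *d, uint64_t iova, size_t len)
{
	if (!d->ops->invalidate)
		return -EOPNOTSUPP;
	return d->ops->invalidate(d, iova, len);
}
```

With this layout a caller that tries toy_map() on a user domain gets
-EOPNOTSUPP, whereas toy_invalidate() succeeds and is counted.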

From there the usual struct device (ie RID) attach/detach stuff needs
to take care of routing DMAs to this iommu_domain.

Step two would be to add the ability for an iommufd using driver to
request that a RID&PASID is connected to an iommu_domain. This
connection can be requested for any kind of iommu_domain, kernel owned
or user owned.

I don't quite have an answer how exactly the SMMUv3 vs Intel
difference in PASID routing should be resolved.

To get answers I'm hoping to start building some sketch RFCs for these
different things on iommufd, hopefully in January. I'm looking at user
page tables, PASID, dirty tracking and userspace IO fault handling as
the main features iommufd must tackle.

The purpose of the sketches would be to validate that the HW features
we want to expose can work well with the choices the base is making.

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* RE: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
  2021-12-07 11:06         ` Eric Auger
  (?)
@ 2021-12-08 13:33           ` Shameerali Kolothum Thodi via iommu
  -1 siblings, 0 replies; 116+ messages in thread
From: Shameerali Kolothum Thodi @ 2021-12-08 13:33 UTC (permalink / raw)
  To: eric.auger, Zhangfei Gao, eric.auger.pro, iommu, linux-kernel,
	kvm, kvmarm, joro, will, robin.murphy, jean-philippe, zhukeqian
  Cc: alex.williamson, jacob.jun.pan, yi.l.liu, kevin.tian, ashok.raj,
	maz, peter.maydell, vivek.gautam, wangxingang, jiangkunkun,
	yuzenghui, nicoleotsuka, chenxiang (M),
	sumitg, nicolinc, vdumpa, zhangfei.gao, vsethi



> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: 07 December 2021 11:06
> To: Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; joro@8bytes.org;
> will@kernel.org; robin.murphy@arm.com; jean-philippe@linaro.org;
> zhukeqian <zhukeqian1@huawei.com>
> Cc: alex.williamson@redhat.com; jacob.jun.pan@linux.intel.com;
> yi.l.liu@intel.com; kevin.tian@intel.com; ashok.raj@intel.com;
> maz@kernel.org; peter.maydell@linaro.org; vivek.gautam@arm.com;
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> wangxingang <wangxingang5@huawei.com>; jiangkunkun
> <jiangkunkun@huawei.com>; yuzenghui <yuzenghui@huawei.com>;
> nicoleotsuka@gmail.com; chenxiang (M) <chenxiang66@hisilicon.com>;
> sumitg@nvidia.com; nicolinc@nvidia.com; vdumpa@nvidia.com;
> zhangfei.gao@gmail.com; lushenming@huawei.com; vsethi@nvidia.com
> Subject: Re: [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Zhangfei,
> 
> On 12/7/21 11:35 AM, Zhangfei Gao wrote:
> >
> >
> > On 2021/12/7 6:27 PM, Eric Auger wrote:
> >> Hi Zhangfei,
> >>
> >> On 12/3/21 1:27 PM, Zhangfei Gao wrote:
> >>> Hi, Eric
> >>>
> >>> On 2021/10/27 6:44 PM, Eric Auger wrote:
> >>>> This series brings the IOMMU part of HW nested paging support
> >>>> in the SMMUv3.
> >>>>
> >>>> The SMMUv3 driver is adapted to support 2 nested stages.
> >>>>
> >>>> The IOMMU API is extended to convey the guest stage 1
> >>>> configuration and the hook is implemented in the SMMUv3 driver.
> >>>>
> >>>> This allows the guest to own the stage 1 tables and context
> >>>> descriptors (so-called PASID table) while the host owns the
> >>>> stage 2 tables and main configuration structures (STE).
> >>>>
> >>>> This work mainly is provided for test purpose as the upper
> >>>> layer integration is under rework and bound to be based on
> >>>> /dev/iommu instead of VFIO tunneling. In this version we also get
> >>>> rid of the MSI BINDING ioctl, assuming the guest enforces
> >>>> flat mapping of host IOVAs used to bind physical MSI doorbells.
> >>>> In the current QEMU integration this is achieved by exposing
> >>>> RMRs to the guest, using Shameer's series [1]. This approach
> >>>> is RFC as the IORT spec is not really meant to do that
> >>>> (single mapping flag limitation).
> >>>>
> >>>> Best Regards
> >>>>
> >>>> Eric
> >>>>
> >>>> This series (Host) can be found at:
> >>>> https://github.com/eauger/linux/tree/v5.15-rc7-nested-v16
> >>>> This includes a rebased VFIO integration (although not meant
> >>>> to be upstreamed)
> >>>>
> >>>> Guest kernel branch can be found at:
> >>>> https://github.com/eauger/linux/tree/shameer_rmrr_v7
> >>>> featuring [1]
> >>>>
> >>>> QEMU integration (still based on VFIO and exposing RMRs)
> >>>> can be found at:
> >>>> https://github.com/eauger/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
> >>>> (use iommu=nested-smmuv3 ARM virt option)
> >>>>
> >>>> Guest dependency:
> >>>> [1] [PATCH v7 0/9] ACPI/IORT: Support for IORT RMR node
> >>> Thanks a lot for upgrading these patches.
> >>>
> >>> I have basically verified these patches on HiSilicon Kunpeng920.
> >>> And integrated them to these branches.
> >>> https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.16
> >>> https://github.com/Linaro/qemu/tree/v6.1.0-rmr-v2-nested_smmuv3_v10
> >>>
> >>> Though they are provided for test purpose,
> >>>
> >>> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
> >> Thank you very much. As you mentioned, until we do not have the
> >> /dev/iommu integration this is maintained for testing purpose. The SMMU
> >> changes shouldn't be much impacted though.
> >> The added value of this respin was to propose an MSI binding solution
> >> based on RMRRs which simplify things at kernel level.
> >
> > Current RMRR solution requires uefi enabled,
> > and QEMU_EFI.fd  has to be provided to start qemu.
> >
> > Any plan to support dtb as well, which will be simpler since no need
> > QEMU_EFI.fd anymore.
> Yes the solution is based on ACPI IORT nodes. No clue if some DT
> integration is under work. Shameer?

There was some attempt in the past to create identity mappings using DT.
This is the latest I can find on it,
https://lore.kernel.org/linux-iommu/YTelDHx2REIIvV%2FN@orome.fritz.box/T/

Thanks,
Shameer


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08 12:56             ` Jason Gunthorpe via iommu
  (?)
@ 2021-12-08 17:20               ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 116+ messages in thread
From: Jean-Philippe Brucker @ 2021-12-08 17:20 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Eric Auger, Lu Baolu, Joerg Roedel, peter.maydell, kvm,
	vivek.gautam, kvmarm, eric.auger.pro, ashok.raj, maz, vsethi,
	zhangfei.gao, kevin.tian, will, alex.williamson, wangxingang5,
	linux-kernel, lushenming, iommu, robin.murphy

On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
> From a progress perspective I would like to start with simple 'page
> tables in userspace', ie no PASID in this step.
> 
> 'page tables in userspace' means an iommufd ioctl to create an
> iommu_domain where the IOMMU HW is directly traversing a
> device-specific page table structure in user space memory. All the HW
> today implements this by using another iommu_domain to allow the IOMMU
> HW DMA access to user memory - ie nesting or multi-stage or whatever.
> 
> This would come along with some ioctls to invalidate the IOTLB.
> 
> I'm imagining this step as an iommu_group->op->create_user_domain()
> driver callback which will create a new kind of domain with
> domain-unique ops. Ie map/unmap related should all be NULL as those
> are impossible operations.
> 
> From there the usual struct device (ie RID) attach/detach stuff needs
> to take care of routing DMAs to this iommu_domain.
> 
> Step two would be to add the ability for an iommufd using driver to
> request that a RID&PASID is connected to an iommu_domain. This
> connection can be requested for any kind of iommu_domain, kernel owned
> or user owned.
> 
> I don't quite have an answer how exactly the SMMUv3 vs Intel
> difference in PASID routing should be resolved.

In SMMUv3 the user pgd is always stored in the PASID table (actually
called "context descriptor table" but I want to avoid confusion with the
VT-d "context table"). And to access the PASID table, the SMMUv3 first
translates its GPA into a PA using the stage-2 page table. For userspace to
pass individual pgds to the kernel, as opposed to passing whole PASID
tables, the host kernel needs to reserve GPA space and map it in stage-2,
so it can store the PASID table in there. Userspace manages GPA space.
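
To make the ordering concrete, here is a toy userspace C model of that
lookup. It is only a sketch under loud assumptions: memory is a small
array, stage 2 is a flat per-page table, the PASID table is a flat array
of 64-bit pgd values, and all names (toy_s2_map, toy_s2_translate,
toy_guest_write64, toy_fetch_pgd) are invented for illustration. Real
SMMUv3 structures are far richer. The point it shows is that the PASID
table base is a GPA, so no pgd can be fetched until stage 2 maps it:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_NR_PAGES   8u
#define TOY_INVALID    UINT64_MAX
#define TOY_WORDS      (TOY_NR_PAGES * TOY_PAGE_SIZE / sizeof(uint64_t))

/* Toy physical memory, viewed as 64-bit words. */
static uint64_t toy_phys[TOY_WORDS];

/* Toy stage 2, owned by the host: guest page number -> host page number. */
static uint64_t toy_s2[TOY_NR_PAGES];

void toy_reset(void)
{
	memset(toy_phys, 0, sizeof(toy_phys));
	for (unsigned int i = 0; i < TOY_NR_PAGES; i++)
		toy_s2[i] = TOY_INVALID;
}

/* Host side: install one stage-2 page mapping. */
int toy_s2_map(uint64_t gfn, uint64_t hfn)
{
	if (gfn >= TOY_NR_PAGES || hfn >= TOY_NR_PAGES)
		return -1;
	toy_s2[gfn] = hfn;
	return 0;
}

/* Stage-2 walk: GPA -> PA, or TOY_INVALID when nothing is mapped. */
uint64_t toy_s2_translate(uint64_t gpa)
{
	uint64_t gfn = gpa >> TOY_PAGE_SHIFT;

	if (gfn >= TOY_NR_PAGES || toy_s2[gfn] == TOY_INVALID)
		return TOY_INVALID;
	return (toy_s2[gfn] << TOY_PAGE_SHIFT) | (gpa & (TOY_PAGE_SIZE - 1));
}

/* Guest side: store a 64-bit value at a GPA (goes through stage 2). */
int toy_guest_write64(uint64_t gpa, uint64_t val)
{
	uint64_t pa = toy_s2_translate(gpa);

	if (pa == TOY_INVALID)
		return -1;
	assert(pa % sizeof(uint64_t) == 0);
	toy_phys[pa / sizeof(uint64_t)] = val;
	return 0;
}

/* "Hardware" side: the table base is a GPA, so translate before reading. */
uint64_t toy_fetch_pgd(uint64_t pasid_table_gpa, unsigned int pasid)
{
	uint64_t entry_gpa = pasid_table_gpa + (uint64_t)pasid * sizeof(uint64_t);
	uint64_t entry_pa = toy_s2_translate(entry_gpa);

	if (entry_pa == TOY_INVALID)
		return TOY_INVALID;
	return toy_phys[entry_pa / sizeof(uint64_t)];
}
```

In this model, if the kernel rather than the guest wanted to own the
table, it would still have to pick a GPA for it and call toy_s2_map()
itself, which is the "reserve GPA space and map it in stage-2" step.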

This would be easy for a single pgd. In this case the PASID table has a
single entry and userspace could just pass one GPA page during
registration. However it isn't easily generalized to full PASID support,
because managing a multi-level PASID table will require runtime GPA
allocation, and that API is awkward. That's why we opted for "attach PASID
table" operation rather than "attach page table" (back then the choice was
easy since VT-d used the same concept).
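
The runtime-allocation problem can be illustrated with a toy two-level
table. This is a sketch only: the 10-bit split, the 64-byte entry size,
the 16-entry first level and every identifier below are assumptions made
for illustration, and a bump counter over a fixed window stands in for
"reserve more GPA space". A leaf is only allocated when a PASID in its
range is first used, so GPA demand is not known at registration time:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SPLIT        10                       /* assumed PASID bits per leaf */
#define TOY_LEAF_ENTRIES (1u << TOY_SPLIT)
#define TOY_L1_ENTRIES   16u                      /* toy: 14-bit PASID space */
#define TOY_LEAF_BYTES   (TOY_LEAF_ENTRIES * 64u) /* assumes 64-byte entries */

struct toy_pasid_table {
	uint64_t l1[TOY_L1_ENTRIES]; /* 0 = no leaf yet, else GPA of the leaf */
	uint64_t next_gpa;           /* bump allocator over the reserved window */
	uint64_t end_gpa;
	unsigned int leaves;         /* leaves allocated so far */
};

void toy_pasid_table_init(struct toy_pasid_table *t, uint64_t base, uint64_t size)
{
	assert(base != 0); /* 0 is reserved to mean "no leaf" */
	for (unsigned int i = 0; i < TOY_L1_ENTRIES; i++)
		t->l1[i] = 0;
	t->next_gpa = base;
	t->end_gpa = base + size;
	t->leaves = 0;
}

/*
 * Return the GPA of the leaf covering @pasid, allocating it on first use.
 * Returns 0 when the PASID is out of range or the GPA window is exhausted.
 */
uint64_t toy_pasid_leaf(struct toy_pasid_table *t, uint32_t pasid)
{
	uint32_t idx = pasid >> TOY_SPLIT;

	if (idx >= TOY_L1_ENTRIES)
		return 0;
	if (!t->l1[idx]) {
		if (t->end_gpa - t->next_gpa < TOY_LEAF_BYTES)
			return 0; /* would need more GPA space at runtime */
		t->l1[idx] = t->next_gpa;
		t->next_gpa += TOY_LEAF_BYTES;
		t->leaves++;
	}
	return t->l1[idx];
}
```

With a window sized for two leaves, PASIDs 0 and 1023 share the first
leaf, PASID 1024 triggers a second allocation, and any PASID in a third
range fails until more GPA space is made available.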

So I think the simplest way to support nesting is still to have separate
modes of operations depending on the hardware.

Thanks,
Jean

> 
> To get answers I'm hoping to start building some sketch RFCs for these
> different things on iommufd, hopefully in January. I'm looking at user
> page tables, PASID, dirty tracking and userspace IO fault handling as
> the main features iommufd must tackle.
> 
> The purpose of the sketches would be to validate that the HW features
> we want to expose can work well with the choices the base is making.
> 
> Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
@ 2021-12-08 17:20               ` Jean-Philippe Brucker
  0 siblings, 0 replies; 116+ messages in thread
From: Jean-Philippe Brucker @ 2021-12-08 17:20 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: kevin.tian, lushenming, robin.murphy, ashok.raj, kvm,
	vivek.gautam, maz, Joerg Roedel, linux-kernel, iommu, vsethi,
	alex.williamson, wangxingang5, zhangfei.gao, eric.auger.pro,
	will, kvmarm, Lu Baolu

On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
> From a progress perspective I would like to start with simple 'page
> tables in userspace', ie no PASID in this step.
> 
> 'page tables in userspace' means an iommufd ioctl to create an
> iommu_domain where the IOMMU HW is directly travesering a
> device-specific page table structure in user space memory. All the HW
> today implements this by using another iommu_domain to allow the IOMMU
> HW DMA access to user memory - ie nesting or multi-stage or whatever.
> 
> This would come along with some ioctls to invalidate the IOTLB.
> 
> I'm imagining this step as a iommu_group->op->create_user_domain()
> driver callback which will create a new kind of domain with
> domain-unique ops. Ie map/unmap related should all be NULL as those
> are impossible operations.
> 
> From there the usual struct device (ie RID) attach/detatch stuff needs
> to take care of routing DMAs to this iommu_domain.
> 
> Step two would be to add the ability for an iommufd using driver to
> request that a RID&PASID is connected to an iommu_domain. This
> connection can be requested for any kind of iommu_domain, kernel owned
> or user owned.
> 
> I don't quite have an answer how exactly the SMMUv3 vs Intel
> difference in PASID routing should be resolved.

In SMMUv3 the user pgd is always stored in the PASID table (actually
called "context descriptor table" but I want to avoid confusion with the
VT-d "context table"). And to access the PASID table, the SMMUv3 first
translate its GPA into a PA using the stage-2 page table. For userspace to
pass individual pgds to the kernel, as opposed to passing whole PASID
tables, the host kernel needs to reserve GPA space and map it in stage-2,
so it can store the PASID table in there. Userspace manages GPA space.

This would be easy for a single pgd. In this case the PASID table has a
single entry and userspace could just pass one GPA page during
registration. However it isn't easily generalized to full PASID support,
because managing a multi-level PASID table will require runtime GPA
allocation, and that API is awkward. That's why we opted for "attach PASID
table" operation rather than "attach page table" (back then the choice was
easy since VT-d used the same concept).

So I think the simplest way to support nesting is still to have separate
modes of operations depending on the hardware.

Thanks,
Jean

> 
> To get answers I'm hoping to start building some sketch RFCs for these
> different things on iommufd, hopefully in January. I'm looking at user
> page tables, PASID, dirty tracking and userspace IO fault handling as
> the main features iommufd must tackle.
> 
> The purpose of the sketches would be to validate that the HW features
> we want to expose can work well with the choices the base is making.
> 
> Jason
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08 17:20               ` Jean-Philippe Brucker
  (?)
@ 2021-12-08 18:31                 ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2021-12-08 18:31 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Eric Auger, Lu Baolu, Joerg Roedel, peter.maydell, kvm,
	vivek.gautam, kvmarm, eric.auger.pro, ashok.raj, maz, vsethi,
	zhangfei.gao, kevin.tian, will, alex.williamson, wangxingang5,
	linux-kernel, lushenming, iommu, robin.murphy

On Wed, Dec 08, 2021 at 05:20:39PM +0000, Jean-Philippe Brucker wrote:
> On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
> > From a progress perspective I would like to start with simple 'page
> > tables in userspace', ie no PASID in this step.
> > 
> > 'page tables in userspace' means an iommufd ioctl to create an
> > iommu_domain where the IOMMU HW is directly traversing a
> > device-specific page table structure in user space memory. All the HW
> > today implements this by using another iommu_domain to allow the IOMMU
> > HW DMA access to user memory - ie nesting or multi-stage or whatever.
> > 
> > This would come along with some ioctls to invalidate the IOTLB.
> > 
> > I'm imagining this step as an iommu_group->op->create_user_domain()
> > driver callback which will create a new kind of domain with
> > domain-unique ops. Ie map/unmap related should all be NULL as those
> > are impossible operations.
> > 
> > From there the usual struct device (ie RID) attach/detach stuff needs
> > to take care of routing DMAs to this iommu_domain.
> > 
> > Step two would be to add the ability for an iommufd using driver to
> > request that a RID&PASID is connected to an iommu_domain. This
> > connection can be requested for any kind of iommu_domain, kernel owned
> > or user owned.
> > 
> > I don't quite have an answer how exactly the SMMUv3 vs Intel
> > difference in PASID routing should be resolved.
> 
> In SMMUv3 the user pgd is always stored in the PASID table (actually
> called "context descriptor table" but I want to avoid confusion with
> the VT-d "context table"). And to access the PASID table, the SMMUv3 first
> translates its GPA into a PA using the stage-2 page table. For userspace to
> pass individual pgds to the kernel, as opposed to passing whole PASID
> tables, the host kernel needs to reserve GPA space and map it in stage-2,
> so it can store the PASID table in there. Userspace manages GPA space.

It is what I thought.. So in the SMMUv3 spec the STE is completely in
kernel memory, but it points to an S1ContextPtr that must be an IPA if
the "stage 1 translation tables" are IPA. Only via S1ContextPtr can we
decode the substream?

So in SMMUv3 land we don't really ever talk about PASID, we have a
'user page table' that is bound to an entire RID and *all* PASIDs.

While Intel would have a 'user page table' that is only bound to a RID
& PASID.

Certainly it is not a difference we can hide from userspace.
 
> This would be easy for a single pgd. In this case the PASID table has a
> single entry and userspace could just pass one GPA page during
> registration. However it isn't easily generalized to full PASID support,
> because managing a multi-level PASID table will require runtime GPA
> allocation, and that API is awkward. That's why we opted for "attach PASID
> table" operation rather than "attach page table" (back then the choice was
> easy since VT-d used the same concept).

I think the entire context descriptor table should be in userspace,
and filled in by userspace, as part of the userspace page table.

The kernel API should accept the S1ContextPtr IPA and all the parts of
the STE that relate to defining the layout of what the S1ContextPtr
points to, and that's it.
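
As an illustration of what such an interface might carry, here is a minimal sketch. The struct and function names are hypothetical and are not the uAPI from this series or from iommufd. The checks assume that S1ContextPtr holds address bits [51:6], that S1Fmt takes the values 0 to 2, and that S1CDMax is capped at 20 substream bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the guest-owned stage-1 settings of an STE. */
struct user_s1_cfg {
	uint64_t s1ctxptr_ipa; /* IPA of the context descriptor table */
	uint8_t  s1fmt;        /* table layout: 0 linear, 1 or 2 two-level */
	uint8_t  s1cdmax;      /* number of substream ID bits in use */
};

#define S1CTXPTR_ALIGN    64u /* assumption: pointer holds address bits [51:6] */
#define S1CTXPTR_ADDRBITS 52u
#define S1FMT_MAX         2u  /* assumption: three defined layouts */
#define S1CDMAX_LIMIT     20u /* assumption: at most 20 substream bits */

/* Reject anything that could not be copied verbatim into the STE fields. */
static bool user_s1_cfg_valid(const struct user_s1_cfg *cfg)
{
	if (cfg->s1ctxptr_ipa & (S1CTXPTR_ALIGN - 1))
		return false;
	if (cfg->s1ctxptr_ipa >> S1CTXPTR_ADDRBITS)
		return false;
	if (cfg->s1fmt > S1FMT_MAX)
		return false;
	if (cfg->s1cdmax > S1CDMAX_LIMIT)
		return false;
	return true;
}
```

The kernel never looks inside the table in this model; it only validates the pointer and layout fields before installing them next to its own stage-2 configuration.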

We should have another mode where the kernel owns everything, and the
S1ContextPtr is a PA with Stage 2 bypassed.

That part is fine, the more open question is what does the driver
interface look like when userspace tells something like vfio-pci to
connect to this thing. At some level the attaching device needs to
authorize iommufd to take the entire PASID table and RID.

Specifically we cannot use this thing with an mdev, while the Intel
version of a userspace page table can.

Maybe that is just some 'allow whole device' flag in an API

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08 18:31                 ` Jason Gunthorpe via iommu
  (?)
@ 2021-12-09  2:58                   ` Tian, Kevin
  -1 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2021-12-09  2:58 UTC (permalink / raw)
  To: Jason Gunthorpe, Jean-Philippe Brucker
  Cc: Eric Auger, Lu Baolu, Joerg Roedel, peter.maydell, kvm,
	vivek.gautam, kvmarm, eric.auger.pro, Raj, Ashok, maz, vsethi,
	zhangfei.gao, will, alex.williamson, wangxingang5, linux-kernel,
	lushenming, iommu, robin.murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, December 9, 2021 2:31 AM
> 
> On Wed, Dec 08, 2021 at 05:20:39PM +0000, Jean-Philippe Brucker wrote:
> > On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
> > > From a progress perspective I would like to start with simple 'page
> > > tables in userspace', ie no PASID in this step.
> > >
> > > 'page tables in userspace' means an iommufd ioctl to create an
> > > > iommu_domain where the IOMMU HW is directly traversing a
> > > > device-specific page table structure in user space memory. All the HW
> > > > today implements this by using another iommu_domain to allow the IOMMU
> > > > HW DMA access to user memory - ie nesting or multi-stage or whatever.
> > >
> > > This would come along with some ioctls to invalidate the IOTLB.
> > >
> > > > I'm imagining this step as an iommu_group->op->create_user_domain()
> > > > driver callback which will create a new kind of domain with
> > > > domain-unique ops. Ie map/unmap related should all be NULL as those
> > > > are impossible operations.
> > > >
> > > > From there the usual struct device (ie RID) attach/detach stuff needs
> > > to take care of routing DMAs to this iommu_domain.
> > >
> > > Step two would be to add the ability for an iommufd using driver to
> > > request that a RID&PASID is connected to an iommu_domain. This
> > > > connection can be requested for any kind of iommu_domain, kernel owned
> > > > or user owned.
> > >
> > > I don't quite have an answer how exactly the SMMUv3 vs Intel
> > > difference in PASID routing should be resolved.
> >
> > In SMMUv3 the user pgd is always stored in the PASID table (actually
> > called "context descriptor table" but I want to avoid confusion with
> > the VT-d "context table"). And to access the PASID table, the SMMUv3 first
> > > translates its GPA into a PA using the stage-2 page table. For userspace to
> > pass individual pgds to the kernel, as opposed to passing whole PASID
> > tables, the host kernel needs to reserve GPA space and map it in stage-2,
> > so it can store the PASID table in there. Userspace manages GPA space.
> 
> It is what I thought.. So in the SMMUv3 spec the STE is completely in
> kernel memory, but it points to an S1ContextPtr that must be an IPA if
> the "stage 1 translation tables" are IPA. Only via S1ContextPtr can we
> decode the substream?
> 
> So in SMMUv3 land we don't really ever talk about PASID, we have a
> 'user page table' that is bound to an entire RID and *all* PASIDs.
> 
> While Intel would have a 'user page table' that is only bound to a RID
> & PASID.
> 
> Certainly it is not a difference we can hide from userspace.

Concept-wise it is still a 'user page table' with vendor specific format.

Taking your earlier analogy, it's just a single 84-bit address space
(20-bit PASID + 64-bit VA) per RID.
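
A minimal sketch of that analogy: the PASID is treated as the upper 20 bits of one wide per-RID address, with the 64-bit VA as the lower part. The type and helper names are hypothetical and exist only to illustrate the idea.

```c
#include <stdbool.h>
#include <stdint.h>

#define PASID_BITS 20u
#define PASID_MASK ((1u << PASID_BITS) - 1)

/* Hypothetical 84-bit address: 20-bit PASID on top of a 64-bit VA. */
struct wide_addr {
	uint32_t hi; /* PASID, only the low 20 bits are significant */
	uint64_t lo; /* virtual address within that PASID */
};

/* Build the wide address, discarding any bits above the 20-bit PASID. */
static struct wide_addr make_wide_addr(uint32_t pasid, uint64_t va)
{
	struct wide_addr a = { .hi = pasid & PASID_MASK, .lo = va };

	return a;
}

/* Two DMA targets coincide only if both PASID and VA match. */
static bool wide_addr_equal(struct wide_addr a, struct wide_addr b)
{
	return a.hi == b.hi && a.lo == b.lo;
}
```

Seen this way, one per-RID 'user page table' object translates the whole wide address, and the PASID table is simply its top level.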

So what we require is still one unified ioctl in your step-1 to support
a per-RID 'user page table'.

For ARM it's SMMU's PASID table format. There is no step-2 since PASID
is already within the address space covered by the user PASID table.

For Intel it's VT-d's 1st level page table format. Moving to step-2
then allows multiple 'user page tables', each connected to a RID & PASID.

> 
> > This would be easy for a single pgd. In this case the PASID table has a
> > single entry and userspace could just pass one GPA page during
> > registration. However it isn't easily generalized to full PASID support,
> > because managing a multi-level PASID table will require runtime GPA
> > allocation, and that API is awkward. That's why we opted for "attach PASID
> > table" operation rather than "attach page table" (back then the choice was
> > easy since VT-d used the same concept).
> 
> I think the entire context descriptor table should be in userspace,
> and filled in by userspace, as part of the userspace page table.
> 
> The kernel API should accept the S1ContextPtr IPA and all the parts of
> the STE that relate to defining the layout of what the S1ContextPtr
> points to, and that's it.
> 
> We should have another mode where the kernel owns everything, and the
> S1ContextPtr is a PA with Stage 2 bypassed.

I guess this is for usages like DPDK. In that case, yes, we can have a
unified ioctl since the kernel manages everything including the PASID
table.

> 
> That part is fine, the more open question is what does the driver
> interface look like when userspace tells something like vfio-pci to
> connect to this thing. At some level the attaching device needs to
> authorize iommufd to take the entire PASID table and RID.

As long as the SMMU driver advertises support for only the step-1 ioctl,
this authorization should already be implied.

> 
> Specifically we cannot use this thing with an mdev, while the Intel
> version of a userspace page table can.

Yes. Supporting mdev is the whole reason why Intel puts the PASID
table in host physical address space to be managed by the kernel.

> 
> Maybe that is just some 'allow whole device' flag in an API
> 

As said, I feel this special flag is not required as long as the
vendor iommu driver only supports your step-1 interface, which
implies 'allow whole device' for ARM.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 116+ messages in thread

* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08 12:56             ` Jason Gunthorpe via iommu
  (?)
@ 2021-12-09  3:21               ` Tian, Kevin
  -1 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2021-12-09  3:21 UTC (permalink / raw)
  To: Jason Gunthorpe, Eric Auger
  Cc: Lu Baolu, Joerg Roedel, peter.maydell, kvm, vivek.gautam, kvmarm,
	eric.auger.pro, jean-philippe, Raj, Ashok, maz, vsethi,
	zhangfei.gao, will, alex.williamson, wangxingang5, linux-kernel,
	lushenming, iommu, robin.murphy

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, December 8, 2021 8:56 PM
> 
> On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
> > Hi Baolu,
> >
> > On 12/8/21 3:44 AM, Lu Baolu wrote:
> > > Hi Eric,
> > >
> > > On 12/7/21 6:22 PM, Eric Auger wrote:
> > >> On 12/6/21 11:48 AM, Joerg Roedel wrote:
> > >>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
> > >>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
> > >>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
> > >>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
> > >>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
> > >>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
> > >>> This Signed-off-by chain looks dubious, you are the author but the last
> > >>> one in the chain?
> > >> The 1st RFC in Aug 2018
> > >> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
> > >> said this was a generalization of Jacob's patch
> > >>
> > >>
> > >>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
> > >>
> > >>
> > >>
> > >> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
> > >>
> > >> So indeed Jacob should be the author. I guess the multiple rebases got
> > >> this eventually replaced at some point, which is not an excuse. Please
> > >> forgive me for that.
> > >> Now the original patch already had this list of SoB so I don't know if I
> > >> shall simplify it.
> > >
> > > As we have decided to move the nested mode (dual stages) implementation
> > > onto the developing iommufd framework, what's the value of adding this
> > > into iommu core?
> >
> > The iommu_uapi_attach_pasid_table uapi should disappear indeed as it
> > is bound to be replaced by the corresponding /dev/iommu API.
> > However until I can rebase on /dev/iommu code I am obliged to keep it to
> > maintain this integration, hence the RFC.
> 
> Indeed, we are getting pretty close to having the base iommufd that we
> can start adding stuff like this into. Maybe in January, you can look
> at some parts of what is evolving here:
> 
> https://github.com/jgunthorpe/linux/commits/iommufd
> https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2
> https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2
> 
> From a progress perspective I would like to start with simple 'page
> tables in userspace', ie no PASID in this step.
> 
> 'page tables in userspace' means an iommufd ioctl to create an
> iommu_domain where the IOMMU HW is directly traversing a
> device-specific page table structure in user space memory. All the HW
> today implements this by using another iommu_domain to allow the IOMMU
> HW DMA access to user memory - ie nesting or multi-stage or whatever.

One clarification here in case people may get confused based on the
current iommu_domain definition. Jason brainstormed with us on how
to represent 'user page table' in the IOMMU sub-system. One is to
extend iommu_domain as a general representation for any page table
instance. The other option is to create new representations for user
page tables and then link them under existing iommu_domain.

This context is based on the 1st option, and as Jason said at the bottom
we still need to sketch out whether it works as expected. 😊

> 
> This would come along with some ioctls to invalidate the IOTLB.
> 
> I'm imagining this step as an iommu_group->op->create_user_domain()
> driver callback which will create a new kind of domain with
> domain-unique ops. Ie map/unmap related should all be NULL as those
> are impossible operations.
> 
> From there the usual struct device (ie RID) attach/detach stuff needs
> to take care of routing DMAs to this iommu_domain.
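
The callback quoted above could be sketched roughly as follows. This is
only an illustration under stated assumptions: every toy_* name is
hypothetical, the ops table is not the kernel's struct iommu_ops, and
the invalidate hook merely stands in for the IOTLB invalidation ioctls
mentioned in the quote. It shows a domain whose map/unmap ops are left
NULL because the kernel cannot edit a page table that userspace owns.

```c
#include <stddef.h>

struct toy_domain;

/* Hypothetical per-domain ops table; not the kernel's struct iommu_ops. */
struct toy_domain_ops {
	int (*map)(struct toy_domain *d, unsigned long iova,
		   unsigned long pa, size_t size);
	size_t (*unmap)(struct toy_domain *d, unsigned long iova, size_t size);
	int (*invalidate)(struct toy_domain *d, unsigned long iova, size_t size);
};

struct toy_domain {
	const struct toy_domain_ops *ops;
	unsigned long user_pgtable;	/* address of the user-owned table */
	unsigned long nr_invalidations;	/* bookkeeping for this sketch only */
};

/* Stand-in for an IOTLB invalidation request coming from userspace. */
static int toy_user_invalidate(struct toy_domain *d, unsigned long iova,
			       size_t size)
{
	(void)iova;
	(void)size;
	d->nr_invalidations++;
	return 0;
}

/* User domains: the kernel cannot edit the table, so map/unmap stay NULL. */
static const struct toy_domain_ops toy_user_domain_ops = {
	.map = NULL,
	.unmap = NULL,
	.invalidate = toy_user_invalidate,
};

/* Hypothetical counterpart of the create_user_domain() driver callback. */
void toy_create_user_domain(struct toy_domain *d, unsigned long user_pgtable)
{
	d->ops = &toy_user_domain_ops;
	d->user_pgtable = user_pgtable;
	d->nr_invalidations = 0;
}

/* Core-side wrapper: refuse map requests on domains without a map op. */
int toy_domain_map(struct toy_domain *d, unsigned long iova,
		   unsigned long pa, size_t size)
{
	if (!d->ops->map)
		return -1;
	return d->ops->map(d, iova, pa, size);
}
```

With this shape, a caller that tries to map into a user-owned domain
gets an error from the wrapper instead of a NULL dereference, while
invalidation requests still reach the driver.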

Usage-wise this covers the guest IOVA requirements i.e. when the guest
kernel enables vIOMMU for kernel DMA-API mappings or for device
assignment to guest userspace.

For Intel this means an optimization of the existing shadow-based vIOMMU
implementation.

For ARM this actually enables guest IOVA usage for the first time (correct
me, Eric?). IIRC the SMMU doesn't support caching mode, while write-protecting
the guest I/O page table was considered a no-go. So nesting is considered
the only option to support that.

And once the 'user pasid table' is installed, guest SVA usage can also
partially work for ARM, as long as no I/O page fault is incurred.

> 
> Step two would be to add the ability for an iommufd using driver to
> request that a RID&PASID is connected to an iommu_domain. This
> connection can be requested for any kind of iommu_domain, kernel owned
> or user owned.
> 
> I don't quite have an answer how exactly the SMMUv3 vs Intel
> difference in PASID routing should be resolved.

For kernel owned, the iommufd interface should be generic as the
vendor difference is managed by the kernel itself.

For user owned, we'll need new uAPIs for the user to specify the PASID.
As I replied in another thread, only Intel currently requires it due to
mdev. But other vendors could also do so when they decide to
support mdev one day.

> 
> To get answers I'm hoping to start building some sketch RFCs for these
> different things on iommufd, hopefully in January. I'm looking at user
> page tables, PASID, dirty tracking and userspace IO fault handling as
> the main features iommufd must tackle.

Makes sense.

> 
> The purpose of the sketches would be to validate that the HW features
> we want to expose can work well with the choices the base is making.
> 
> Jason

Thanks
Kevin

^ permalink raw reply	[flat|nested] 116+ messages in thread

* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
       [not found]                 ` <BN9PR11MB527624080CB9302481B74C7A8C709@BN9PR11MB5276.namprd11.prod.outlook.com>
  2021-12-09  3:59                     ` Tian, Kevin
@ 2021-12-09  3:59                     ` Tian, Kevin
  0 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2021-12-09  3:59 UTC (permalink / raw)
  To: Jason Gunthorpe, Jean-Philippe Brucker
  Cc: Eric Auger, Lu Baolu, Joerg Roedel, peter.maydell, kvm,
	vivek.gautam, kvmarm, eric.auger.pro, Raj, Ashok, maz, vsethi,
	zhangfei.gao, will, alex.williamson, wangxingang5, linux-kernel,
	lushenming, iommu, robin.murphy

> From: Tian, Kevin
> Sent: Thursday, December 9, 2021 10:58 AM
> 
> For ARM it's SMMU's PASID table format. There is no step-2 since PASID
> is already within the address space covered by the user PASID table.
> 

One correction here. 'no step-2' is definitely wrong here as it means
more than user page table in your plan (e.g. dpdk).

To simplify, what I meant is:

iommufd reports how many 'user page tables' are supported for a given device.

ARM always reports that only one can be supported, and it must be created in
PASID table format, tagged by RID.

Intel reports one in step1 (tagged by RID), and N in step2 (tagged by
RID+PASID). A special flag in attach call allows the user to specify the
additional PASID routing info for a 'user page table'.
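
The reporting scheme above could be sketched as below. This is a
minimal illustration, not an existing iommufd uAPI: struct upt_caps,
the vendor enum, the tag enum and the helper names are all hypothetical,
and N is simply passed in as max_pasids.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: how a 'user page table' attachment is tagged. */
enum upt_tag { UPT_TAG_RID, UPT_TAG_RID_PASID };

/* Hypothetical vendor selector, only for this sketch. */
enum upt_vendor { UPT_VENDOR_ARM_SMMUV3, UPT_VENDOR_INTEL_VTD };

/* Hypothetical per-device capability report. */
struct upt_caps {
	uint32_t nr_rid_tables;		/* step 1: tables tagged by RID */
	uint32_t nr_rid_pasid_tables;	/* step 2: tables tagged by RID+PASID */
	bool pasid_table_format;	/* table must be in PASID table format */
};

/* Fill in the report as described in the mail for each vendor. */
struct upt_caps upt_report_caps(enum upt_vendor vendor, uint32_t max_pasids)
{
	struct upt_caps caps = { 0, 0, false };

	switch (vendor) {
	case UPT_VENDOR_ARM_SMMUV3:
		/* One table only, in PASID table format, tagged by RID. */
		caps.nr_rid_tables = 1;
		caps.pasid_table_format = true;
		break;
	case UPT_VENDOR_INTEL_VTD:
		/* One in step 1 (RID), N in step 2 (RID+PASID). */
		caps.nr_rid_tables = 1;
		caps.nr_rid_pasid_tables = max_pasids;
		break;
	}
	return caps;
}

/* Returns 0 if an attach with the given tag is allowed, -1 otherwise. */
int upt_check_attach(const struct upt_caps *caps, enum upt_tag tag)
{
	if (tag == UPT_TAG_RID)
		return caps->nr_rid_tables ? 0 : -1;
	return caps->nr_rid_pasid_tables ? 0 : -1;
}
```

Under these assumptions an attach call carrying PASID routing info
would be rejected for an SMMUv3 device and accepted for a VT-d device
that reports a non-zero step-2 count.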

Thanks
Kevin

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08 18:31                 ` Jason Gunthorpe via iommu
  (?)
@ 2021-12-09  7:50                   ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-09  7:50 UTC (permalink / raw)
  To: Jason Gunthorpe, Jean-Philippe Brucker
  Cc: Lu Baolu, Joerg Roedel, peter.maydell, kvm, vivek.gautam, kvmarm,
	eric.auger.pro, ashok.raj, maz, vsethi, zhangfei.gao, kevin.tian,
	will, alex.williamson, wangxingang5, linux-kernel, lushenming,
	iommu, robin.murphy

Hi Jason,

On 12/8/21 7:31 PM, Jason Gunthorpe wrote:
> On Wed, Dec 08, 2021 at 05:20:39PM +0000, Jean-Philippe Brucker wrote:
>> On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
>>> From a progress perspective I would like to start with simple 'page
>>> tables in userspace', ie no PASID in this step.
>>>
>>> 'page tables in userspace' means an iommufd ioctl to create an
>>> iommu_domain where the IOMMU HW is directly traversing a
>>> device-specific page table structure in user space memory. All the HW
>>> today implements this by using another iommu_domain to allow the IOMMU
>>> HW DMA access to user memory - ie nesting or multi-stage or whatever.
>>>
>>> This would come along with some ioctls to invalidate the IOTLB.
>>>
>>> I'm imagining this step as an iommu_group->op->create_user_domain()
>>> driver callback which will create a new kind of domain with
>>> domain-unique ops. Ie map/unmap related should all be NULL as those
>>> are impossible operations.
>>>
>>> From there the usual struct device (ie RID) attach/detach stuff needs
>>> to take care of routing DMAs to this iommu_domain.
>>>
>>> Step two would be to add the ability for an iommufd using driver to
>>> request that a RID&PASID is connected to an iommu_domain. This
>>> connection can be requested for any kind of iommu_domain, kernel owned
>>> or user owned.
>>>
>>> I don't quite have an answer how exactly the SMMUv3 vs Intel
>>> difference in PASID routing should be resolved.
>> In SMMUv3 the user pgd is always stored in the PASID table (actually
>> called "context descriptor table" but I want to avoid confusion with
>> the VT-d "context table"). And to access the PASID table, the SMMUv3 first
>> translate its GPA into a PA using the stage-2 page table. For userspace to
>> pass individual pgds to the kernel, as opposed to passing whole PASID
>> tables, the host kernel needs to reserve GPA space and map it in stage-2,
>> so it can store the PASID table in there. Userspace manages GPA space.
> It is what I thought.. So in the SMMUv3 spec the STE is completely in
> kernel memory, but it points to an S1ContextPtr that must be an IPA if
> the "stage 1 translation tables" are IPA. Only via S1ContextPtr can we
> decode the substream?
Yes that's correct. S1ContextPtr is the IPA of the L1 Context Descriptor
Table which is then indexed by substreamID.
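
As a rough illustration of that lookup, here is a toy model in C. It is
a sketch under loud assumptions: all toy_* names are hypothetical, stage
2 is reduced to a single contiguous IPA window, the Context Descriptor
Table is treated as a linear table, and nr_cds replaces the real STE
field that encodes the table size. The 64-byte CD size is the one
architectural constant used.

```c
#include <stdint.h>

#define TOY_CD_SIZE 64u	/* an SMMUv3 Context Descriptor is 64 bytes */

/* Toy stage 2: one contiguous IPA window mapped to PA (host-owned). */
struct toy_stage2 {
	uint64_t ipa_base;
	uint64_t pa_base;
	uint64_t size;
};

/* Toy STE: only the stage 1 fields that the guest provides. */
struct toy_ste {
	uint64_t s1_context_ptr;	/* IPA of the CD table */
	uint32_t nr_cds;		/* simplification of the size field */
};

/* Returns 0 and sets *pa on success, -1 on a stage 2 translation fault. */
int toy_s2_translate(const struct toy_stage2 *s2, uint64_t ipa, uint64_t *pa)
{
	if (ipa < s2->ipa_base || ipa - s2->ipa_base >= s2->size)
		return -1;
	*pa = s2->pa_base + (ipa - s2->ipa_base);
	return 0;
}

/*
 * Locate the CD for a substreamID: index the table in IPA space, then
 * translate the resulting IPA through stage 2 to get the PA to fetch.
 */
int toy_cd_address(const struct toy_ste *ste, const struct toy_stage2 *s2,
		   uint32_t substream_id, uint64_t *cd_pa)
{
	uint64_t cd_ipa;

	if (substream_id >= ste->nr_cds)
		return -1;
	cd_ipa = ste->s1_context_ptr + (uint64_t)substream_id * TOY_CD_SIZE;
	return toy_s2_translate(s2, cd_ipa, cd_pa);
}
```

The point of the model is only that the table pointer and the indexing
live in guest (IPA) space, while the host controls whether that IPA
resolves at all through stage 2.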

>
> So in SMMUv3 land we don't really ever talk about PASID, we have a
> 'user page table' that is bound to an entire RID and *all* PASIDs.
In ARM terminology the substreamID matches the PASID and this is what
indexes the L1 Context Descriptor Table.

>
> While Intel would have a 'user page table' that is only bound to a RID
> & PASID
>
> Certainly it is not a difference we can hide from userspace.
>  
>> This would be easy for a single pgd. In this case the PASID table has a
>> single entry and userspace could just pass one GPA page during
>> registration. However it isn't easily generalized to full PASID support,
>> because managing a multi-level PASID table will require runtime GPA
>> allocation, and that API is awkward. That's why we opted for "attach PASID
>> table" operation rather than "attach page table" (back then the choice was
>> easy since VT-d used the same concept).
> I think the entire context descriptor table should be in userspace,
> and filled in by userspace, as part of the userspace page table.

In ARM nested mode the L1 Context Descriptor Table is fully managed by
the guest and the userspace only needs to trap its S1ContextPtr and pass
it to the host.
>
> The kernel API should accept the S1ContextPtr IPA and all the parts of
> the STE that relate to defining the layout of what the S1Context
> points to, and that's it.
Yes that's exactly what is done currently. At config time the host must
trap guest STE changes (format and S1ContextPtr) and "incorporate" those
changes into the stage2 related STE information. The STE is owned by the
host kernel as it contains the stage2 information (S2TTB).

Synthetic diagrams giving the global view can be found in section 3.3.2,
"StreamIDs to Context Descriptors", of
https://developer.arm.com/documentation/ihi0070/latest
(ARM_IHI_0070_D_b_System_Memory_Management_Unit_Architecture_Specification.pdf).

Note this series only copes with a single CD in the Context Descriptor
Table.

Thanks

Eric
>
> We should have another mode where the kernel owns everything, and the
> S1ContextPtr is a PA with Stage 2 bypassed.
>
> That part is fine, the more open question is what does the driver
> interface look like when userspace tells something like vfio-pci to
> connect to this thing. At some level the attaching device needs to
> authorize iommufd to take the entire PASID table and RID.
>
> Specifically we cannot use this thing with a mdev, while the Intel
> version of a userspace page table can be.
>
> Maybe that is just some 'allow whole device' flag in an API
>
> Jason
>


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
@ 2021-12-09  7:50                   ` Eric Auger
  0 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-09  7:50 UTC (permalink / raw)
  To: Jason Gunthorpe, Jean-Philippe Brucker
  Cc: peter.maydell, kevin.tian, lushenming, robin.murphy, ashok.raj,
	kvm, wangxingang5, maz, linux-kernel, iommu, vsethi,
	vivek.gautam, alex.williamson, zhangfei.gao, eric.auger.pro,
	will, kvmarm

Hi Jason,

On 12/8/21 7:31 PM, Jason Gunthorpe wrote:
> On Wed, Dec 08, 2021 at 05:20:39PM +0000, Jean-Philippe Brucker wrote:
>> On Wed, Dec 08, 2021 at 08:56:16AM -0400, Jason Gunthorpe wrote:
>>> From a progress perspective I would like to start with simple 'page
>>> tables in userspace', ie no PASID in this step.
>>>
>>> 'page tables in userspace' means an iommufd ioctl to create an
>>> iommu_domain where the IOMMU HW is directly travesering a
>>> device-specific page table structure in user space memory. All the HW
>>> today implements this by using another iommu_domain to allow the IOMMU
>>> HW DMA access to user memory - ie nesting or multi-stage or whatever.
>>>
>>> This would come along with some ioctls to invalidate the IOTLB.
>>>
>>> I'm imagining this step as a iommu_group->op->create_user_domain()
>>> driver callback which will create a new kind of domain with
>>> domain-unique ops. Ie map/unmap related should all be NULL as those
>>> are impossible operations.
>>>
>>> From there the usual struct device (ie RID) attach/detatch stuff needs
>>> to take care of routing DMAs to this iommu_domain.
>>>
>>> Step two would be to add the ability for an iommufd using driver to
>>> request that a RID&PASID is connected to an iommu_domain. This
>>> connection can be requested for any kind of iommu_domain, kernel owned
>>> or user owned.
>>>
>>> I don't quite have an answer how exactly the SMMUv3 vs Intel
>>> difference in PASID routing should be resolved.
>> In SMMUv3 the user pgd is always stored in the PASID table (actually
>> called "context descriptor table" but I want to avoid confusion with
>> the VT-d "context table"). And to access the PASID table, the SMMUv3 first
>> translate its GPA into a PA using the stage-2 page table. For userspace to
>> pass individual pgds to the kernel, as opposed to passing whole PASID
>> tables, the host kernel needs to reserve GPA space and map it in stage-2,
>> so it can store the PASID table in there. Userspace manages GPA space.
> It is what I thought.. So in the SMMUv3 spec the STE is completely in
> kernel memory, but it points to an S1ContextPtr that must be an IPA if
> the "stage 1 translation tables" are IPA. Only via S1ContextPtr can we
> decode the substream?
Yes that's correct. S1ContextPtr is the IPA of the L1 Context Descriptor
Table which is then indexed by substreamID.

>
> So in SMMUv3 land we don't really ever talk about PASID, we have a
> 'user page table' that is bound to an entire RID and *all* PASIDs.
in ARM terminology substreamID matches the PASID and this is what
indexes the L1 Context Descriptor Table.

>
> While Intel would have a 'user page table' that is only bound to a RID
> & PASID
>
> Certianly it is not a difference we can hide from userspace.
>  
>> This would be easy for a single pgd. In this case the PASID table has a
>> single entry and userspace could just pass one GPA page during
>> registration. However it isn't easily generalized to full PASID support,
>> because managing a multi-level PASID table will require runtime GPA
>> allocation, and that API is awkward. That's why we opted for "attach PASID
>> table" operation rather than "attach page table" (back then the choice was
>> easy since VT-d used the same concept).
> I think the entire context descriptor table should be in userspace,
> and filled in by userspace, as part of the userspace page table.

In ARM nested mode the L1 Context Descriptor Table is fully managed by
the guest and the userspace only needs to trap its S1ContextPtr and pass
it to the host.
>
> The kernel API should accept the S1ContextPtr IPA and all the parts of
> the STE that relate to defining the layout of what the S1Context
> points to, and that's it.
Yes that's exactly what is done currently. At config time the host must
trap guest STE changes (format and S1ContextPtr) and "incorporate" those
changes into the stage2 related STE information. The STE is owned by the
host kernel as it contains the stage2 information (S2TTB).
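
As a rough illustration of that "incorporate" step, the sketch below folds a
trapped guest stage 1 configuration into a host-owned STE while leaving the
stage 2 fields alone. The structures and toy_ste_set_guest_s1() are
hypothetical and far smaller than the real STE; they are not the data
structures used by this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the stage 1 fields a guest may set in its STE. */
struct toy_guest_s1_cfg {
	uint64_t s1_context_ptr; /* IPA of the guest CD table */
	uint8_t  s1_fmt;         /* CD table format */
	uint8_t  s1_cd_max;      /* log2 of the number of CDs */
};

/* Hypothetical host-owned STE: stage 2 fields never come from the guest. */
struct toy_host_ste {
	bool     s1_enabled;
	uint64_t s1_context_ptr;
	uint8_t  s1_fmt;
	uint8_t  s1_cd_max;
	uint64_t s2ttb;          /* stage 2 table base, set by the host only */
	uint16_t vmid;
};

/* Fold a trapped guest stage 1 config into the physical STE. */
static void toy_ste_set_guest_s1(struct toy_host_ste *ste,
				 const struct toy_guest_s1_cfg *cfg)
{
	/* Nesting only makes sense once the host has set up stage 2. */
	assert(ste->s2ttb != 0);

	ste->s1_context_ptr = cfg->s1_context_ptr;
	ste->s1_fmt = cfg->s1_fmt;
	ste->s1_cd_max = cfg->s1_cd_max;
	ste->s1_enabled = true;
	/* s2ttb and vmid are deliberately left untouched. */
}
```

The point of the split is that the guest can only influence where stage 1
structures live in IPA space; what those IPAs resolve to stays under host
control through S2TTB.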

In
https://developer.arm.com/documentation/ihi0070/latest
(ARM_IHI_0070_D_b_System_Memory_Management_Unit_Architecture_Specification.pdf)
Synthetic diagrams can be found in 3.3.2 StreamIDs to Context
Descriptors. They give the global view.

Note this series only copes with a single CD in the Context Descriptor
Table.

Thanks

Eric
>
> We should have another mode where the kernel owns everything, and the
> S1ContextPtr is a PA with Stage 2 bypassed.
>
> That part is fine, the more open question is what does the driver
> interface look like when userspace tells something like vfio-pci to
> connect to this thing. At some level the attaching device needs to
> authorize iommufd to take the entire PASID table and RID.
>
> Specifically we cannot use this thing with a mdev, while the Intel
> version of a userspace page table can be.
>
> Maybe that is just some 'allow whole device' flag in an API
>
> Jason
>

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-08 12:56             ` Jason Gunthorpe via iommu
  (?)
@ 2021-12-09  8:31               ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-09  8:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Joerg Roedel, peter.maydell, kvm, vivek.gautam, kvmarm,
	eric.auger.pro, jean-philippe, ashok.raj, maz, vsethi,
	zhangfei.gao, kevin.tian, will, alex.williamson, wangxingang5,
	linux-kernel, lushenming, iommu, robin.murphy

Hi Jason,

On 12/8/21 1:56 PM, Jason Gunthorpe wrote:
> On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
>> Hi Baolu,
>>
>> On 12/8/21 3:44 AM, Lu Baolu wrote:
>>> Hi Eric,
>>>
>>> On 12/7/21 6:22 PM, Eric Auger wrote:
>>>> On 12/6/21 11:48 AM, Joerg Roedel wrote:
>>>>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>>>>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
>>>>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
>>>>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
>>>>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
>>>>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
>>>>> This Signed-off-by chain looks dubious, you are the author but the last
>>>>> one in the chain?
>>>> The 1st RFC in Aug 2018
>>>> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
>>>> said this was a generalization of Jacob's patch
>>>>
>>>>
>>>>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
>>>>
>>>>
>>>>   
>>>> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
>>>>
>>>> So indeed Jacob should be the author. I guess the multiple rebases got
>>>> this eventually replaced at some point, which is not an excuse. Please
>>>> forgive me for that.
>>>> Now the original patch already had this list of SoB so I don't know if I
>>>> shall simplify it.
>>> As we have decided to move the nested mode (dual stages) implementation
>>> onto the developing iommufd framework, what's the value of adding this
>>> into iommu core?
>> The iommu_uapi_attach_pasid_table uapi should disappear indeed as it is
>> bound to be replaced by its /dev/iommu counterpart.
>> However until I can rebase on /dev/iommu code I am obliged to keep it to
>> maintain this integration, hence the RFC.
> Indeed, we are getting pretty close to having the base iommufd that we
> can start adding stuff like this into. Maybe in January, you can look
> at some parts of what is evolving here:
>
> https://github.com/jgunthorpe/linux/commits/iommufd
> https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2
> https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2
Interesting. Thank you for the preview links. I will have a look ASAP.

Eric
>
> From a progress perspective I would like to start with simple 'page
> tables in userspace', ie no PASID in this step.
>
> 'page tables in userspace' means an iommufd ioctl to create an
> iommu_domain where the IOMMU HW is directly traversing a
> device-specific page table structure in user space memory. All the HW
> today implements this by using another iommu_domain to allow the IOMMU
> HW DMA access to user memory - ie nesting or multi-stage or whatever.
>
> This would come along with some ioctls to invalidate the IOTLB.
>
> I'm imagining this step as an iommu_group->op->create_user_domain()
> driver callback which will create a new kind of domain with
> domain-unique ops. Ie map/unmap related should all be NULL as those
> are impossible operations.
>
> From there the usual struct device (ie RID) attach/detach stuff needs
> to take care of routing DMAs to this iommu_domain.
>
> Step two would be to add the ability for an iommufd using driver to
> request that a RID&PASID is connected to an iommu_domain. This
> connection can be requested for any kind of iommu_domain, kernel owned
> or user owned.
>
> I don't quite have an answer how exactly the SMMUv3 vs Intel
> difference in PASID routing should be resolved.
>
> To get answers I'm hoping to start building some sketch RFCs for these
> different things on iommufd, hopefully in January. I'm looking at user
> page tables, PASID, dirty tracking and userspace IO fault handling as
> the main features iommufd must tackle.
>
> The purpose of the sketches would be to validate that the HW features
> we want to expose can work well with the choices the base is making.
>
> Jason
>
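
For readers following the create_user_domain() idea quoted above, a minimal
sketch of "a new kind of domain with map/unmap left NULL" could look like
the following. Everything here is hypothetical: toy_domain_ops is not the
kernel's struct iommu_ops, toy_create_user_domain() only stands in for the
imagined callback, and the -ENODEV return value is an arbitrary choice for
the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_domain;

/* Hypothetical per-domain ops; not the kernel's struct iommu_ops. */
struct toy_domain_ops {
	int (*map)(struct toy_domain *d, uint64_t iova, uint64_t pa, size_t sz);
	int (*invalidate)(struct toy_domain *d, uint64_t iova, size_t sz);
};

struct toy_domain {
	const struct toy_domain_ops *ops;
	uint64_t user_pgtbl;           /* address of the user-owned page table */
	unsigned int nr_invalidations; /* counts IOTLB invalidation requests */
};

static int toy_user_invalidate(struct toy_domain *d, uint64_t iova, size_t sz)
{
	(void)iova;
	(void)sz;
	d->nr_invalidations++;
	return 0;
}

/* User domains: no map op, because the kernel does not own the table. */
static const struct toy_domain_ops toy_user_domain_ops = {
	.map = NULL,
	.invalidate = toy_user_invalidate,
};

/* Stand-in for the imagined create_user_domain() callback. */
static void toy_create_user_domain(struct toy_domain *d, uint64_t pgtbl)
{
	d->ops = &toy_user_domain_ops;
	d->user_pgtbl = pgtbl;
	d->nr_invalidations = 0;
}

/* Core-side wrapper: refuse to map on domains that lack the op. */
static int toy_map(struct toy_domain *d, uint64_t iova, uint64_t pa, size_t sz)
{
	assert(d->ops != NULL);
	if (!d->ops->map)
		return -ENODEV;
	return d->ops->map(d, iova, pa, sz);
}
```

The only operation such a domain offers besides attach/detach is
invalidation, which matches the "some ioctls to invalidate the IOTLB"
remark in the quoted text.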


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-09  3:21               ` Tian, Kevin
  (?)
@ 2021-12-09  9:44                 ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-09  9:44 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: Lu Baolu, Joerg Roedel, peter.maydell, kvm, vivek.gautam, kvmarm,
	eric.auger.pro, jean-philippe, Raj, Ashok, maz, vsethi,
	zhangfei.gao, will, alex.williamson, wangxingang5, linux-kernel,
	lushenming, iommu, robin.murphy

Hi Kevin,

On 12/9/21 4:21 AM, Tian, Kevin wrote:
>> From: Jason Gunthorpe <jgg@nvidia.com>
>> Sent: Wednesday, December 8, 2021 8:56 PM
>>
>> On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
>>> Hi Baolu,
>>>
>>> On 12/8/21 3:44 AM, Lu Baolu wrote:
>>>> Hi Eric,
>>>>
>>>> On 12/7/21 6:22 PM, Eric Auger wrote:
>>>>> On 12/6/21 11:48 AM, Joerg Roedel wrote:
>>>>>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>>>>>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
>>>>>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
>>>>>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
>>>>>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
>>>>>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
>>>>>> This Signed-off-by chain looks dubious, you are the author but the last
>>>>>> one in the chain?
>>>>> The 1st RFC in Aug 2018
>>>>> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-August/032478.html)
>>>>> said this was a generalization of Jacob's patch
>>>>>
>>>>>
>>>>>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
>>>>>
>>>>>
>>>>>
>>>>> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027647.html
>>>>> So indeed Jacob should be the author. I guess the multiple rebases got
>>>>> this eventually replaced at some point, which is not an excuse. Please
>>>>> forgive me for that.
>>>>> Now the original patch already had this list of SoB so I don't know if I
>>>>> shall simplify it.
>>>> As we have decided to move the nested mode (dual stages) implementation
>>>> onto the developing iommufd framework, what's the value of adding this
>>>> into iommu core?
>>> The iommu_uapi_attach_pasid_table uapi should disappear indeed as it is
>>> bound to be replaced by its /dev/iommu counterpart.
>>> However until I can rebase on /dev/iommu code I am obliged to keep it to
>>> maintain this integration, hence the RFC.
>> Indeed, we are getting pretty close to having the base iommufd that we
>> can start adding stuff like this into. Maybe in January, you can look
>> at some parts of what is evolving here:
>>
>> https://github.com/jgunthorpe/linux/commits/iommufd
>> https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2
>> https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2
>>
>> From a progress perspective I would like to start with simple 'page
>> tables in userspace', ie no PASID in this step.
>>
>> 'page tables in userspace' means an iommufd ioctl to create an
>> iommu_domain where the IOMMU HW is directly traversing a
>> device-specific page table structure in user space memory. All the HW
>> today implements this by using another iommu_domain to allow the IOMMU
>> HW DMA access to user memory - ie nesting or multi-stage or whatever.
> One clarification here in case people may get confused based on the
> current iommu_domain definition. Jason brainstormed with us on how
> to represent 'user page table' in the IOMMU sub-system. One is to
> extend iommu_domain as a general representation for any page table
> instance. The other option is to create new representations for user
> page tables and then link them under existing iommu_domain.
>
> This context is based on the 1st option, and as Jason said at the bottom
> we still need to sketch out whether it works as expected. 😊
>
>> This would come along with some ioctls to invalidate the IOTLB.
>>
>> I'm imagining this step as an iommu_group->op->create_user_domain()
>> driver callback which will create a new kind of domain with
>> domain-unique ops. Ie map/unmap related should all be NULL as those
>> are impossible operations.
>>
>> From there the usual struct device (ie RID) attach/detach stuff needs
>> to take care of routing DMAs to this iommu_domain.
> Usage-wise this covers the guest IOVA requirements i.e. when the guest
> kernel enables vIOMMU for kernel DMA-API mappings or for device
> assignment to guest userspace.
>
> For Intel this means optimization to the existing shadow-based vIOMMU
> implementation.
>
> For ARM this actually enables guest IOVA usage for the first time (correct
> me Eric?).
Yes that's correct. This is the scope of this series (single PASID)
>  IIRC SMMU doesn't support caching mode while write-protecting
> guest I/O page table was considered a no-go. So nesting is considered as
> the only option to support that.
That's correct too. No 'caching mode' is provisioned in the SMMU spec. I
thought it would just be a matter of adding 1 bit in an ID reg though.

Thanks

Eric
>
> and once 'user pasid table' is installed, this actually means guest SVA usage
> can also partially work for ARM if I/O page fault is not incurred.
>
>> Step two would be to add the ability for an iommufd using driver to
>> request that a RID&PASID is connected to an iommu_domain. This
>> connection can be requested for any kind of iommu_domain, kernel owned
>> or user owned.
>>
>> I don't quite have an answer how exactly the SMMUv3 vs Intel
>> difference in PASID routing should be resolved.
> For kernel owned the iommufd interface should be generic as the
> vendor difference is managed by the kernel itself.
>
> For user owned we'll need new uAPIs for user to specify PASID. 
> As I replied in another thread only Intel currently requires it due to
> mdev. But other vendors could also do so when they decide to 
> support mdev one day.
>
>> To get answers I'm hoping to start building some sketch RFCs for these
>> different things on iommufd, hopefully in January. I'm looking at user
>> page tables, PASID, dirty tracking and userspace IO fault handling as
>> the main features iommufd must tackle.
> Makes sense.
>
>> The purpose of the sketches would be to validate that the HW features
>> we want to expose can work well with the choices the base is making.
>>
>> Jason
> Thanks
> Kevin
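
The PASID routing difference discussed in this thread can be pictured with a
small lookup model: an attachment either binds one RID&PASID pair to a
domain (the per-PASID style attributed to Intel above) or hands the whole
PASID table of a RID to one user-owned domain (the SMMUv3 style). This is
only a sketch of the idea; toy_router, toy_attach and the all_pasids flag
are invented for illustration and do not correspond to any existing or
proposed iommufd interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ATTACH 8

/* Hypothetical attachment record. */
struct toy_attach {
	uint16_t rid;
	uint32_t pasid;
	bool     all_pasids; /* domain owns the whole PASID table of the RID */
	int      domain_id;
};

struct toy_router {
	struct toy_attach slot[TOY_MAX_ATTACH];
	size_t n;
};

/* Record an attachment; returns 0 on success, -1 when the table is full. */
static int toy_attach_add(struct toy_router *r, struct toy_attach a)
{
	if (r->n == TOY_MAX_ATTACH)
		return -1;
	r->slot[r->n++] = a;
	return 0;
}

/* Find the domain a DMA tagged with (rid, pasid) is routed to, or -1. */
static int toy_route(const struct toy_router *r, uint16_t rid, uint32_t pasid)
{
	assert(r->n <= TOY_MAX_ATTACH);
	for (size_t i = 0; i < r->n; i++) {
		const struct toy_attach *a = &r->slot[i];

		if (a->rid != rid)
			continue;
		if (a->all_pasids || a->pasid == pasid)
			return a->domain_id;
	}
	return -1;
}
```

In this model a whole-RID attachment answers for every PASID, which is why
it cannot be shared with an mdev that only owns a single PASID of the
parent device.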


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
@ 2021-12-09  9:44                 ` Eric Auger
  0 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-09  9:44 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	jean-philippe, maz, linux-kernel, iommu, vsethi, vivek.gautam,
	alex.williamson, wangxingang5, zhangfei.gao, eric.auger.pro,
	will, kvmarm

Hi Kevin,

On 12/9/21 4:21 AM, Tian, Kevin wrote:
>> From: Jason Gunthorpe <jgg@nvidia.com>
>> Sent: Wednesday, December 8, 2021 8:56 PM
>>
>> On Wed, Dec 08, 2021 at 08:33:33AM +0100, Eric Auger wrote:
>>> Hi Baolu,
>>>
>>> On 12/8/21 3:44 AM, Lu Baolu wrote:
>>>> Hi Eric,
>>>>
>>>> On 12/7/21 6:22 PM, Eric Auger wrote:
>>>>> On 12/6/21 11:48 AM, Joerg Roedel wrote:
>>>>>> On Wed, Oct 27, 2021 at 12:44:20PM +0200, Eric Auger wrote:
>>>>>>> Signed-off-by: Jean-Philippe Brucker<jean-philippe.brucker@arm.com>
>>>>>>> Signed-off-by: Liu, Yi L<yi.l.liu@linux.intel.com>
>>>>>>> Signed-off-by: Ashok Raj<ashok.raj@intel.com>
>>>>>>> Signed-off-by: Jacob Pan<jacob.jun.pan@linux.intel.com>
>>>>>>> Signed-off-by: Eric Auger<eric.auger@redhat.com>
>>>>>> This Signed-of-by chain looks dubious, you are the author but the last
>>>>>> one in the chain?
>>>>> The 1st RFC in Aug 2018
>>>>> (https://lists.cs.columbia.edu/pipermail/kvmarm/2018-
>> August/032478.html)
>>>>> said this was a generalization of Jacob's patch
>>>>>
>>>>>
>>>>>    [PATCH v5 01/23] iommu: introduce bind_pasid_table API function
>>>>>
>>>>>
>>>>>
>>>>> https://lists.linuxfoundation.org/pipermail/iommu/2018-
>> May/027647.html
>>>>> So indeed Jacob should be the author. I guess the multiple rebases got
>>>>> this eventually replaced at some point, which is not an excuse. Please
>>>>> forgive me for that.
>>>>> Now the original patch already had this list of SoB so I don't know if I
>>>>> shall simplify it.
>>>> As we have decided to move the nested mode (dual stages)
>> implementation
>>>> onto the developing iommufd framework, what's the value of adding this
>>>> into iommu core?
>>> The iommu_uapi_attach_pasid_table uapi should disappear indeed as it is
>>> is bound to be replaced by /dev/iommu fellow API.
>>> However until I can rebase on /dev/iommu code I am obliged to keep it to
>>> maintain this integration, hence the RFC.
>> Indeed, we are getting pretty close to having the base iommufd that we
>> can start adding stuff like this into. Maybe in January, you can look
>> at some parts of what is evolving here:
>>
>> https://github.com/jgunthorpe/linux/commits/iommufd
>> https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-
>> v2
>> https://github.com/luxis1999/iommufd/commits/iommufd-v5.16-rc2
>>
>> From a progress perspective I would like to start with simple 'page
>> tables in userspace', ie no PASID in this step.
>>
>> 'page tables in userspace' means an iommufd ioctl to create an
>> iommu_domain where the IOMMU HW is directly travesering a
>> device-specific page table structure in user space memory. All the HW
>> today implements this by using another iommu_domain to allow the IOMMU
>> HW DMA access to user memory - ie nesting or multi-stage or whatever.
> One clarification here in case people may get confused based on the
> current iommu_domain definition. Jason brainstormed with us on how
> to represent 'user page table' in the IOMMU sub-system. One is to
> extend iommu_domain as a general representation for any page table
> instance. The other option is to create new representations for user
> page tables and then link them under existing iommu_domain.
>
> This context is based on the 1st option. and As Jason said in the bottom
> we still need to sketch out whether it works as expected. 😊
>
>> This would come along with some ioctls to invalidate the IOTLB.
>>
>> I'm imagining this step as a iommu_group->op->create_user_domain()
>> driver callback which will create a new kind of domain with
>> domain-unique ops. Ie map/unmap related should all be NULL as those
>> are impossible operations.
>>
>> From there the usual struct device (ie RID) attach/detatch stuff needs
>> to take care of routing DMAs to this iommu_domain.
> Usage-wise this covers the guest IOVA requirements i.e. when the guest
> kernel enables vIOMMU for kernel DMA-API mappings or for device
> assignment to guest userspace.
>
> For intel this means optimization to the existing shadow-based vIOMMU
> implementation.
>
> For ARM this actually enables guest IOVA usage for the first time (correct
> me Eric?).
Yes that's correct. This is the scope of this series (single PASID)
>  IIRC SMMU doesn't support caching mode while write-protecting
> guest I/O page table was considered a no-go. So nesting is considered as
> the only option to support that.
That's correct too. No 'caching mode' is provisioned in the SMMU spec. I
thought it would just be a matter of adding 1 bit in an ID reg though.

Thanks

Eric
>
> and once 'user pasid table' is installed, this actually means guest SVA usage
> can also partially work for ARM if I/O page fault is not incurred.
>
>> Step two would be to add the ability for an iommufd using driver to
>> request that a RID&PASID is connected to an iommu_domain. This
>> connection can be requested for any kind of iommu_domain, kernel owned
>> or user owned.
>>
>> I don't quite have an answer how exactly the SMMUv3 vs Intel
>> difference in PASID routing should be resolved.
> For kernel owned the iommufd interface should be generic as the
> vendor difference is managed by the kernel itself.
>
> For user owned we'll need new uAPIs for user to specify PASID. 
> As I replied in another thread only Intel currently requires it due to
> mdev. But other vendors could also do so when they decide to 
> support mdev one day.
>
>> To get answers I'm hoping to start building some sketch RFCs for these
>> different things on iommufd, hopefully in January. I'm looking at user
>> page tables, PASID, dirty tracking and userspace IO fault handling as
>> the main features iommufd must tackle.
> Makes sense.
>
>> The purpose of the sketches would be to validate that the HW features
>> we want to expose can work well with the choices the base is making.
>>
>> Jason
> Thanks
> Kevin
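
The "user domain" idea quoted above (a domain whose page table is written by userspace, so map/unmap are impossible and only IOTLB invalidation remains) can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: every name below (model_domain, model_domain_ops, model_map, ...) is hypothetical and is not the kernel's iommu API, and the -ENODEV return value is an arbitrary choice made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for kernel types; NOT the real iommu API. */
struct model_domain;

struct model_domain_ops {
	int (*map)(struct model_domain *d, unsigned long iova,
		   unsigned long paddr, size_t size);
	int (*unmap)(struct model_domain *d, unsigned long iova, size_t size);
	int (*invalidate_iotlb)(struct model_domain *d);
};

struct model_domain {
	const struct model_domain_ops *ops;
	unsigned long invalidations;	/* counts invalidation requests */
};

static int user_invalidate(struct model_domain *d)
{
	d->invalidations++;
	return 0;
}

/* User-owned domain: userspace writes the page table, so map/unmap stay NULL. */
static const struct model_domain_ops user_domain_ops = {
	.map = NULL,
	.unmap = NULL,
	.invalidate_iotlb = user_invalidate,
};

/* Plays the role of the create_user_domain() callback mentioned in the thread. */
void model_user_domain_init(struct model_domain *d)
{
	d->ops = &user_domain_ops;
	d->invalidations = 0;
}

/* Core-side helper: refuse map requests on domains that have no map op. */
int model_map(struct model_domain *d, unsigned long iova,
	      unsigned long paddr, size_t size)
{
	if (!d->ops || !d->ops->map)
		return -ENODEV;	/* error code chosen arbitrarily for the sketch */
	return d->ops->map(d, iova, paddr, size);
}

/* Invalidation is the one operation a user-owned domain still needs. */
int model_invalidate(struct model_domain *d)
{
	if (!d->ops || !d->ops->invalidate_iotlb)
		return -ENODEV;
	return d->ops->invalidate_iotlb(d);
}
```

With this shape, a caller that tries to map into a user-owned domain gets an error, while invalidation requests are forwarded to the driver.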

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-09  7:50                   ` Eric Auger
  (?)
@ 2021-12-09 15:40                     ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2021-12-09 15:40 UTC (permalink / raw)
  To: Eric Auger
  Cc: Jean-Philippe Brucker, Lu Baolu, Joerg Roedel, peter.maydell,
	kvm, vivek.gautam, kvmarm, eric.auger.pro, ashok.raj, maz,
	vsethi, zhangfei.gao, kevin.tian, will, alex.williamson,
	wangxingang5, linux-kernel, lushenming, iommu, robin.murphy

On Thu, Dec 09, 2021 at 08:50:04AM +0100, Eric Auger wrote:

> > The kernel API should accept the S1ContextPtr IPA and all the parts of
> > the STE that relate to defining the layout of what the S1Context
> > points to, and that's it.

> Yes that's exactly what is done currently. At config time the host must
> trap guest STE changes (format and S1ContextPtr) and "incorporate" those
> changes into the stage2 related STE information. The STE is owned by the
> host kernel as it contains the stage2 information (S2TTB).

[..]

> Note this series only copes with a single CD in the Context Descriptor
> Table.

I'm confused, where does this limit arise?

The kernel accepts as input all the bits in the STE that describe the
layout of the CDT owned by userspace, shouldn't userspace be able to
construct all forms of CDT with any number of CDs in them?

Or do you mean this is some qemu limitation?

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-09  3:59                     ` Tian, Kevin
  (?)
@ 2021-12-09 16:08                       ` Jason Gunthorpe via iommu
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2021-12-09 16:08 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Jean-Philippe Brucker, Eric Auger, Lu Baolu, Joerg Roedel,
	peter.maydell, kvm, vivek.gautam, kvmarm, eric.auger.pro, Raj,
	Ashok, maz, vsethi, zhangfei.gao, will, alex.williamson,
	wangxingang5, linux-kernel, lushenming, iommu, robin.murphy

On Thu, Dec 09, 2021 at 03:59:57AM +0000, Tian, Kevin wrote:
> > From: Tian, Kevin
> > Sent: Thursday, December 9, 2021 10:58 AM
> > 
> > For ARM it's SMMU's PASID table format. There is no step-2 since PASID
> > is already within the address space covered by the user PASID table.
> > 
> 
> One correction here. 'no step-2' is definitely wrong here as it means
> more than user page table in your plan (e.g. dpdk).
> 
> To simplify it what I meant is:
> 
> iommufd reports how many 'user page tables' are supported given a device.
> 
> ARM always reports that only one can be supported, and it must be created in
> PASID table format, tagged by RID.
> 
> Intel reports one in step1 (tagged by RID), and N in step2 (tagged by
> RID+PASID). A special flag in attach call allows the user to specify the
> additional PASID routing info for a 'user page table'.

I don't think 'number of user page tables' makes sense.

It really is 'attach to the whole device' vs 'attach to the RID' as a
semantic that should exist.

If we imagine a userspace using kernel page tables it certainly makes
sense to assign page table A to the RID and page table B to a PASID
even in simple cases like vfio-pci.

The only case where userspace would want to capture the entire RID and
all PASIDs is something like this ARM situation - but userspace just
created a device specific object and already knows exactly what kind
of behavior it has.

So, something like vfio pci would implement three uAPI operations:
 - Attach page table to RID
 - Attach page table to PASID
 - Attach page table to RID and all PASIDs
   And here 'page table' is everything below the STE in SMMUv3

While mdev can only support:
 - Access emulated page table
 - Attach page table to PASID

It is what I've said a couple of times, the API the driver calls
toward iommufd to attach a page table must be unambiguous as to the
intention, which also means userspace must be unambiguous too.

Jason
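
The operation lists above can be written down as an explicit "intent" value that the driver passes along, so that neither side has to guess. The following is a minimal sketch, assuming hypothetical names throughout (attach_intent, device_kind, attach_intent_supported); the thread only describes these operations in prose and defines no such identifiers.

```c
#include <stdbool.h>

/* Hypothetical names: the thread lists these operations only in prose. */
enum attach_intent {
	ATTACH_RID,			/* page table for the RID only */
	ATTACH_PASID,			/* page table for one PASID */
	ATTACH_RID_AND_ALL_PASIDS,	/* whole device: everything below the STE on SMMUv3 */
	ACCESS_EMULATED,		/* software access to an emulated page table */
};

enum device_kind {
	DEV_VFIO_PCI,
	DEV_MDEV,
};

/* Encodes the two lists from the mail: which intents each kind of device can offer. */
bool attach_intent_supported(enum device_kind kind, enum attach_intent intent)
{
	switch (kind) {
	case DEV_VFIO_PCI:
		return intent == ATTACH_RID ||
		       intent == ATTACH_PASID ||
		       intent == ATTACH_RID_AND_ALL_PASIDS;
	case DEV_MDEV:
		return intent == ACCESS_EMULATED ||
		       intent == ATTACH_PASID;
	}
	return false;
}
```

Because the intent is a distinct value rather than something inferred from the page table type, a request such as "attach to RID" on an mdev is rejected up front instead of being silently reinterpreted.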

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-09 15:40                     ` Jason Gunthorpe via iommu
  (?)
@ 2021-12-09 16:37                       ` Eric Auger
  -1 siblings, 0 replies; 116+ messages in thread
From: Eric Auger @ 2021-12-09 16:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jean-Philippe Brucker, Lu Baolu, Joerg Roedel, peter.maydell,
	kvm, vivek.gautam, kvmarm, eric.auger.pro, ashok.raj, maz,
	vsethi, zhangfei.gao, kevin.tian, will, alex.williamson,
	wangxingang5, linux-kernel, lushenming, iommu, robin.murphy

Hi Jason,

On 12/9/21 4:40 PM, Jason Gunthorpe wrote:
> On Thu, Dec 09, 2021 at 08:50:04AM +0100, Eric Auger wrote:
>
>>> The kernel API should accept the S1ContextPtr IPA and all the parts of
>>> the STE that relate to defining the layout of what the S1Context
>>> points to, and that's it.
>> Yes that's exactly what is done currently. At config time the host must
>> trap guest STE changes (format and S1ContextPtr) and "incorporate" those
>> changes into the stage2 related STE information. The STE is owned by the
>> host kernel as it contains the stage2 information (S2TTB).
> [..]
>
>> Note this series only copes with a single CD in the Context Descriptor
>> Table.
> I'm confused, where does this limit arise?
>
> The kernel accepts as input all the bits in the STE that describe the
> layout of the CDT owned by userspace, shouldn't userspace be able to
> construct all forms of CDT with any number of CDs in them?
>
> Or do you mean this is some qemu limitation?
The upstream vSMMUv3 emulation does not support multiple CDs at the
moment, and since I have no proper means to validate the vSVA case I am
rejecting any attempt from user space to inject guest configs featuring
multiple PASIDs. Also, PASID cache invalidation must be added to this series.

Nevertheless those limitations were tackled and overcome by others in
CC so I don't think there is any blocking issue.

Thanks

Eric
>
> Jason
>
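
The single-CD restriction described in this reply amounts to a validation step on the guest-provided configuration. A minimal sketch follows, assuming a hypothetical structure model_pasid_table_config whose pasid_bits field is the log2 of the number of context descriptors (0 meaning exactly one CD); the structure, the function name and the -EINVAL return value are illustrative assumptions, not the code from the series.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical config passed from userspace; field names are assumptions. */
struct model_pasid_table_config {
	uint64_t base_ptr;	/* guest S1ContextPtr, an IPA */
	uint8_t  pasid_bits;	/* log2(number of CDs); 0 means a single CD */
};

/*
 * Accept only configurations describing a single context descriptor,
 * mirroring the "reject guest configs featuring multiple PASIDs" policy.
 */
int model_check_pasid_table(const struct model_pasid_table_config *cfg)
{
	if (cfg->pasid_bits != 0)
		return -EINVAL;
	return 0;
}
```

Lifting the restriction would then mean relaxing this check and adding the PASID cache invalidation mentioned above.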


^ permalink raw reply	[flat|nested] 116+ messages in thread

* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-09 16:08                       ` Jason Gunthorpe via iommu
@ 2021-12-10  8:56                         ` Tian, Kevin
  -1 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2021-12-10  8:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	Jean-Philippe Brucker, maz, linux-kernel, vsethi,
	alex.williamson, wangxingang5, vivek.gautam, zhangfei.gao,
	eric.auger.pro, will, kvmarm

> From: Jason Gunthorpe via iommu
> Sent: Friday, December 10, 2021 12:08 AM
> 
> On Thu, Dec 09, 2021 at 03:59:57AM +0000, Tian, Kevin wrote:
> > > From: Tian, Kevin
> > > Sent: Thursday, December 9, 2021 10:58 AM
> > >
> > > For ARM it's SMMU's PASID table format. There is no step-2 since PASID
> > > is already within the address space covered by the user PASID table.
> > >
> >
> > One correction here. 'no step-2' is definitely wrong here as it means
> > more than user page table in your plan (e.g. dpdk).
> >
> > To simplify it what I meant is:
> >
> > iommufd reports how many 'user page tables' are supported given a device.
> >
> > ARM always reports that only one can be supported, and it must be created in
> > PASID table format, tagged by RID.
> >
> > Intel reports one in step1 (tagged by RID), and N in step2 (tagged by
> > RID+PASID). A special flag in attach call allows the user to specify the
> > additional PASID routing info for a 'user page table'.
> 
> I don't think 'number of user page tables' makes sense
> 
> It really is 'attach to the whole device' vs 'attach to the RID' as a
> semantic that should exist
> 
> If we imagine a userspace using kernel page tables it certainly makes
> sense to assign page table A to the RID and page table B to a PASID
> even in simple cases like vfio-pci.
> 
> The only case where userspace would want to capture the entire RID and
> all PASIDs is something like this ARM situation - but userspace just
> created a device specific object and already knows exactly what kind
> of behavior it has.
> 
> So, something like vfio pci would implement three uAPI operations:
>  - Attach page table to RID
>  - Attach page table to PASID
>  - Attach page table to RID and all PASIDs
>    And here 'page table' is everything below the STE in SMMUv3
> 
> While mdev can only support:
>  - Access emulated page table
>  - Attach page table to PASID

mdev is a pci device from user p.o.v, having its vRID and vPASID. From
this angle the uAPI is no different from vfio-pci (except the ARM one):

  - (sw mdev) Attach emulated page table to vRID (no iommu domain)
  - (hw mdev) Attach page table to vRID (mapped to mdev PASID)
  - (hw mdev) Attach page table to vPASID (mapped to a fungible PASID)

> 
> It is what I've said a couple of times, the API the driver calls
> toward iommufd to attach a page table must be unambiguous as to the
> intention, which also means userspace must be unambiguous too.
> 

No question on the unambiguous part. But we also need to consider
the common semantics that can be abstracted.

From user p.o.v a vRID can be attached to at most two page tables (if
nesting is enabled). This just requires the basic attaching form for 
either one page table or two page tables:

	at_data = {
		.iommufd	= xxx,
		.pgtable_id	= yyy,
	};
	ioctl(device_fd, VFIO_DEVICE_ATTACH_PGTABLE, &at_data);

This can already cover ARM's requirement. The user page table
attached to vRID is in vendor specific format, e.g. either ARM pasid 
table format or Intel stage-1 format. For ARM pasid_table + underlying 
stage-1 page tables can be considered as a single big paging structure.

From this angle I'm not sure of the benefit of making a separate uAPI
just because it's a pasid table for ARM.

Then when PASID needs to be explicitly specified (e.g. in Intel case):

	at_data = {
		.iommufd	= xxx,
		.pgtable_id	= yyy,
		.flags		= VFIO_ATTACH_FLAGS_PASID,
		.pasid		= zzz,
	};
	ioctl(device_fd, VFIO_DEVICE_ATTACH_PGTABLE, &at_data);

Again, I don't think what a simple flag can solve needs to be made
into a separate uAPI.

Is modeling like above considered ambiguous?

Thanks
Kevin
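
To make the "one ioctl plus a flag" proposal above concrete, here is a small self-contained model of how the receiving side could decode such a request. It is a sketch only: model_attach_pgtable, model_route, MODEL_ATTACH_FLAGS_PASID and model_resolve_route are hypothetical names standing in for the at_data layout and VFIO_ATTACH_FLAGS_PASID shown above, and rejecting a non-zero pasid without the flag is an added assumption, not something stated in the mail.

```c
#include <errno.h>
#include <stdint.h>

/* Stands in for VFIO_ATTACH_FLAGS_PASID from the proposal; value is arbitrary. */
#define MODEL_ATTACH_FLAGS_PASID (1u << 0)

/* Hypothetical layout of the at_data argument. */
struct model_attach_pgtable {
	int32_t  iommufd;
	uint32_t pgtable_id;
	uint32_t flags;
	uint32_t pasid;
};

/* Decoded routing: attach by RID, or by RID+PASID. */
struct model_route {
	int      by_pasid;
	uint32_t pasid;
};

int model_resolve_route(const struct model_attach_pgtable *at,
			struct model_route *out)
{
	/* Unknown flags are rejected so the request stays unambiguous. */
	if (at->flags & ~MODEL_ATTACH_FLAGS_PASID)
		return -EINVAL;

	/* Assumption of this sketch: a pasid without the flag is an error. */
	if (!(at->flags & MODEL_ATTACH_FLAGS_PASID) && at->pasid != 0)
		return -EINVAL;

	out->by_pasid = (at->flags & MODEL_ATTACH_FLAGS_PASID) != 0;
	out->pasid = out->by_pasid ? at->pasid : 0;
	return 0;
}
```

In this model the plain form routes by RID (covering the ARM pasid-table case), and the flagged form routes by RID+PASID (the Intel case), which is the distinction the two at_data examples draw.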

^ permalink raw reply	[flat|nested] 116+ messages in thread


* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-10  8:56                         ` Tian, Kevin
@ 2021-12-10 13:23                           ` Jason Gunthorpe
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2021-12-10 13:23 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	Jean-Philippe Brucker, maz, linux-kernel, vsethi,
	alex.williamson, wangxingang5, vivek.gautam, zhangfei.gao,
	eric.auger.pro, will, kvmarm

On Fri, Dec 10, 2021 at 08:56:56AM +0000, Tian, Kevin wrote:
> > So, something like vfio pci would implement three uAPI operations:
> >  - Attach page table to RID
> >  - Attach page table to PASID
> >  - Attach page table to RID and all PASIDs
> >    And here 'page table' is everything below the STE in SMMUv3
> > 
> > While mdev can only support:
> >  - Access emulated page table
> >  - Attach page table to PASID
> 
> mdev is a pci device from user p.o.v, having its vRID and vPASID. From
> this angle the uAPI is no different from vfio-pci (except the ARM one):

No, it isn't. The internal operation is completely different, and it
must call different iommufd APIs than vfio-pci does.

This is user visible - mdev can never be attached to an ARM user page
table, for instance.

For iommufd there is no vRID, vPASID or any confusing stuff like
that. You'll have an easier time if you stop thinking in these terms.

We probably end up with three iommufd calls:
 int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id, unsigned int flags);
 int iommufd_device_attach_pasid(struct iommufd_device *idev, u32 *pt_id, unsigned int flags, ioasid_t *pasid);
 int iommufd_device_attach_sw_iommu(struct iommufd_device *idev, u32 pt_id);

And the uAPI from VFIO must map onto them.

vfio-pci:
  - 'VFIO_SET_CONTAINER' does
    iommufd_device_attach(idev, &pt_id, IOMMUFD_FULL_DEVICE);
    # It is IOMMU specific whether this captures PASIDs or causes them
    # to fail, but IOMMUFD_FULL_DEVICE will prevent attaching any PASID
    # later on all IOMMUs.

vfio-mdev:
  - 'VFIO_SET_CONTAINER' does one of:
    iommufd_device_attach_pasid(idev, &pt_id, IOMMUFD_ASSIGN_PASID, &pasid);
    iommufd_device_attach_sw_iommu(idev, pt_id);

That is three of the cases.

Then we have new ioctls for the other cases:

vfio-pci:
  - 'bind only the RID, so we can use PASID'
    iommufd_device_attach(idev, &pt_id, 0);
  - 'bind to a specific PASID'
    iommufd_device_attach_pasid(idev, &pt_id, 0, &pasid);

vfio-mdev:
  - 'like VFIO_SET_CONTAINER but bind to a specific PASID'
    iommufd_device_attach_pasid(idev, &pt_id, 0, &pasid);

The iommu driver will block attachments that are incompatible, i.e. ARM
user page tables only work with:
 iommufd_device_attach(idev, &pt_id, IOMMUFD_FULL_DEVICE)
all other calls fail.
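
The compatibility rule above can be sketched as a small state model in C.
This is an illustration under assumptions, not the iommufd
implementation: struct model_dev, enum pt_kind, model_attach() and
model_attach_pasid() are made-up names, the flag value is arbitrary, and
the errno choices are placeholders. Only the name IOMMUFD_FULL_DEVICE is
taken from the proposal.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define IOMMUFD_FULL_DEVICE	(1u << 0)	/* name from the proposal, value made up */

/* Hypothetical page table kinds. */
enum pt_kind { PT_KERNEL, PT_ARM_USER };

/* Hypothetical per-device attach state. */
struct model_dev {
	bool rid_attached;
	bool full_device;	/* RID was attached with IOMMUFD_FULL_DEVICE */
};

/* Models iommufd_device_attach(): bind a page table to the RID. */
static int model_attach(struct model_dev *dev, enum pt_kind pt, unsigned int flags)
{
	if (dev->rid_attached)
		return -EBUSY;
	/* An ARM user page table owns the RID and every PASID. */
	if (pt == PT_ARM_USER && !(flags & IOMMUFD_FULL_DEVICE))
		return -EINVAL;

	dev->rid_attached = true;
	dev->full_device = (flags & IOMMUFD_FULL_DEVICE) != 0;
	return 0;
}

/* Models iommufd_device_attach_pasid(): bind a page table to one PASID. */
static int model_attach_pasid(struct model_dev *dev, enum pt_kind pt, uint32_t pasid)
{
	(void)pasid;	/* the PASID value does not matter for this check */

	if (pt == PT_ARM_USER)
		return -EINVAL;	/* never valid on a single PASID */
	if (dev->full_device)
		return -EBUSY;	/* FULL_DEVICE forbids later PASID attaches */
	return 0;
}
```

With this model, attaching an ARM user page table without
IOMMUFD_FULL_DEVICE fails at once, and any PASID attach after a
FULL_DEVICE attach fails regardless of the IOMMU type.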

How exactly we put all of this into new ioctls, I'm not sure, but it
does seem pretty clear this is what the iommufd kAPI will need to look
like to cover the cases we know about already.

As you can see, userspace needs to understand what mode it is operating
in: either it does IOMMUFD_FULL_DEVICE and manages PASID somehow in
userspace, or it doesn't and can use the iommufd_device_attach_pasid()
paths.

> Is modeling like above considered ambiguous?

You've skipped straight to the ioctls without designing the kernel API
to meet all the requirements :)

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread


* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-10 13:23                           ` Jason Gunthorpe
@ 2021-12-11  3:57                             ` Tian, Kevin
  -1 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2021-12-11  3:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	Jean-Philippe Brucker, maz, linux-kernel, vsethi,
	alex.williamson, wangxingang5, vivek.gautam, zhangfei.gao,
	eric.auger.pro, will, kvmarm

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, December 10, 2021 9:23 PM
> 
> On Fri, Dec 10, 2021 at 08:56:56AM +0000, Tian, Kevin wrote:
> > > So, something like vfio pci would implement three uAPI operations:
> > >  - Attach page table to RID
> > >  - Attach page table to PASID
> > >  - Attach page table to RID and all PASIDs
> > >    And here 'page table' is everything below the STE in SMMUv3
> > >
> > > While mdev can only support:
> > >  - Access emulated page table
> > >  - Attach page table to PASID
> >
> > mdev is a pci device from user p.o.v, having its vRID and vPASID. From
> > this angle the uAPI is no different from vfio-pci (except the ARM one):
> 
> No, it isn't. The internal operation is completely different, and it
> must call different iommufd APIs than vfio-pci does.

Well, you mentioned "uAPI operations", so my earlier comment was
purely from the uAPI p.o.v rather than the internal iommufd APIs (which
doesn't mean I didn't think of them). I think this is the main
divergence in this discussion: when I saw you say "while mdev can only
support", I assumed it was still about the uAPI (more specifically the
VFIO uAPI, as it carries the attach call to iommufd).

> 
> This is user visible - mdev can never be attached to an ARM user page
> table, for instance.

Sure. The iommu driver will fail the attach request when it sees that
an incompatible method is used.

> 
> For iommufd there is no vRID, vPASID or any confusing stuff like
> that. You'll have an easier time if you stop thinking in these terms.

I don't have any difficulty here, as from the vfio uAPI p.o.v it's
about vRID and vPASID. But there is no confusion about iommufd, which
should only deal with physical things. This was settled a long time
ago in the high-level design discussion. 😊

> 
> We probably end up with three iommufd calls:
>  int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id, unsigned int flags);
>  int iommufd_device_attach_pasid(struct iommufd_device *idev, u32 *pt_id, unsigned int flags, ioasid_t *pasid);
>  int iommufd_device_attach_sw_iommu(struct iommufd_device *idev, u32 pt_id);

This is aligned with the previous design.

> 
> And the uAPI from VFIO must map onto them.
> 
> vfio-pci:
>   - 'VFIO_SET_CONTAINER' does
>     iommufd_device_attach(idev, &pt_id, IOMMUFD_FULL_DEVICE);
>     # IOMMU specific if this captures PASIDs or cause them to fail,
>     # but IOMMUFD_FULL_DEVICE will prevent attaching any PASID
>     # later on all iommu's.
> 
> vfio-mdev:
>   - 'VFIO_SET_CONTAINER' does one of:
>     iommufd_device_attach_pasid(idev, &pt_id, IOMMUFD_ASSIGN_PASID, &pasid);
>     iommufd_device_attach_sw_iommu(idev, pt_id);
> 
> That is three of the cases.
> 
> Then we have new ioctls for the other cases:
> 
> vfio-pci:
>   - 'bind only the RID, so we can use PASID'
>     iommufd_device_attach(idev, &pt_id, 0);
>   - 'bind to a specific PASID'
>     iommufd_device_attach_pasid(idev, &pt_id, 0, &pasid);
> 
> vfio-mdev:
>   - 'like VFIO_SET_CONTAINER but bind to a specific PASID'
>     iommufd_device_attach_pasid(idev, &pt_id, 0, &pasid);
> 
> The iommu driver will block attachments that are incompatible, ie ARM
> user page tables only work with:
>  iommufd_device_attach(idev, &pt_id, IOMMUFD_FULL_DEVICE)
> all other calls fail.

The above is all good except for the FULL_DEVICE thing.

This might be the only open issue, as I still don't see why we need an
explicit flag to claim a 'full device'. From the kernel p.o.v the
ARM case is no different from Intel, in that both allow a user
page table attached to vRID, just with a different format and
addr width (Intel is 64bit, ARM is 84bit, where PASID can be
considered a sub-handle in the 84bit address space and not
the kernel's business).
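
The 84bit figure presumably comes from treating a 20-bit PASID as the
upper part of a 64-bit address. A minimal C sketch of that arithmetic
follows; struct tagged_addr and both helpers are hypothetical names used
only for illustration, and the 64 + 20 split is an assumption about what
the figure means.

```c
#include <stdint.h>

#define PASID_BITS	20u	/* PCIe PASIDs are at most 20 bits wide */
#define IOVA_BITS	64u	/* per-PASID address width assumed here */

/* Hypothetical view of one access inside the combined address space. */
struct tagged_addr {
	uint32_t pasid;	/* the "sub-handle" selecting a stage-1 table */
	uint64_t iova;	/* address within that PASID's space */
};

/* Total width of the combined (PASID, IOVA) address. */
static unsigned int tagged_addr_bits(void)
{
	return PASID_BITS + IOVA_BITS;
}

/* Returns 0 and fills *out, or -1 if pasid does not fit in PASID_BITS. */
static int tagged_addr_make(struct tagged_addr *out, uint32_t pasid, uint64_t iova)
{
	if (pasid >> PASID_BITS)
		return -1;
	out->pasid = pasid;
	out->iova = iova;
	return 0;
}
```

Seen this way, the PASID table walk is just the first levels of one large
translation that the guest owns, which is the point being made about ARM.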

And since ARM doesn't support explicit PASID attach, those calls
will fail for sure.

> 
> How exactly we put all of this into new ioctls, I'm not sure, but it
> does seem pretty clear this is what the iommufd kAPI will need to look
> like to cover the cases we know about already.
> 
> As you can see, userspace needs to understand what mode it is operating
> in. If it does IOMMUFD_FULL_DEVICE and manages PASID somehow in
> userspace, or it doesn't and can use the iommufd_device_attach_pasid()
> paths.
> 
> > Is modeling like above considered ambiguous?
> 
> You've skipped straight to the ioctls without designing the kernel API
> to meet all the requirements  :)
> 

No problem with this. We were just focusing on different matters in this
discussion. As I replied, I think the only open issue is whether the ARM
case needs to be specialized via a new ioctl or flag. Everything else is
aligned.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 116+ messages in thread


* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-10 13:23                           ` Jason Gunthorpe
@ 2021-12-11  5:18                             ` Tian, Kevin
  -1 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2021-12-11  5:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	Jean-Philippe Brucker, maz, linux-kernel, vsethi,
	alex.williamson, wangxingang5, vivek.gautam, zhangfei.gao,
	eric.auger.pro, will, kvmarm

> From: Tian, Kevin
> Sent: Saturday, December 11, 2021 11:58 AM
>
> This might be the only open as I still didn't see why we need an
> explicit flag to claim a 'full device' thing. From kernel p.o.v the
> ARM case is no different from Intel that both allows an user
> page table attached to vRID, just with different format and

Obviously this should be 'RID', to avoid further confusion, since it
talks about the kernel p.o.v.

> addr width (Intel is 64bit, ARM is 84bit where PASID can be
> considered a sub-handle in the 84bit address space and not
> the kernel's business).
> 
> and ARM doesn't support explicit PASID attach then those calls
> will fail for sure.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 116+ messages in thread


* Re: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-11  3:57                             ` Tian, Kevin
@ 2021-12-16 20:48                               ` Jason Gunthorpe
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2021-12-16 20:48 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	Jean-Philippe Brucker, maz, linux-kernel, vsethi,
	alex.williamson, wangxingang5, vivek.gautam, zhangfei.gao,
	eric.auger.pro, will, kvmarm

On Sat, Dec 11, 2021 at 03:57:45AM +0000, Tian, Kevin wrote:

> This might be the only open as I still didn't see why we need an
> explicit flag to claim a 'full device' thing. From kernel p.o.v the
> ARM case is no different from Intel that both allows an user
> page table attached to vRID, just with different format and
> addr width (Intel is 64bit, ARM is 84bit where PASID can be
> considered a sub-handle in the 84bit address space and not
> the kernel's business).

I think the difference is intention.

In one case the kernel is saying 'attach a RID and I intend to use
PASID' in which case the kernel user can call the PASID APIs.

The second case is saying 'I will not use PASID'.

They are different things and I think it is a surprising API if the
kernel user attaches a domain, intends to use PASID and then finds out
it can't, e.g. because an ARM user page table was hooked up.

If you imagine the flag as 'I intend to use PASID' I think it makes a
fair amount of sense from an API design standpoint too.

We could probably do without it, at least for VFIO and qemu cases, but
it seems a little bit peculiar to me.
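
A minimal C sketch of the 'I intend to use PASID' reading of the flag:
the incompatibility is reported when the RID is attached rather than at
the first PASID attach. ATTACH_INTEND_PASID, struct pt_caps and
attach_rid_checked() are hypothetical names, and the errno value is a
placeholder.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: the inverse sense of IOMMUFD_FULL_DEVICE. */
#define ATTACH_INTEND_PASID	(1u << 0)

/* Hypothetical capability of the page table being attached. */
struct pt_caps {
	bool allows_pasid_attach;	/* false for an ARM user page table */
};

/* Fail at RID attach time if the stated intent cannot be honoured. */
static int attach_rid_checked(const struct pt_caps *pt, unsigned int flags)
{
	if ((flags & ATTACH_INTEND_PASID) && !pt->allows_pasid_attach)
		return -EOPNOTSUPP;
	return 0;
}
```

A kernel user that states its intent gets the error up front; one that
does not can still attach the same page table for RID-only use.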

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread


* RE: [RFC v16 1/9] iommu: Introduce attach/detach_pasid_table API
  2021-12-16 20:48                               ` Jason Gunthorpe
@ 2022-01-04  2:42                                 ` Tian, Kevin
  -1 siblings, 0 replies; 116+ messages in thread
From: Tian, Kevin @ 2022-01-04  2:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: peter.maydell, lushenming, robin.murphy, Raj, Ashok, kvm,
	Jean-Philippe Brucker, maz, linux-kernel, vsethi,
	alex.williamson, wangxingang5, vivek.gautam, zhangfei.gao,
	eric.auger.pro, will, kvmarm

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, December 17, 2021 4:49 AM
> 
> On Sat, Dec 11, 2021 at 03:57:45AM +0000, Tian, Kevin wrote:
> 
> > This might be the only open as I still didn't see why we need an
> > explicit flag to claim a 'full device' thing. From kernel p.o.v the
> > ARM case is no different from Intel that both allows an user
> > page table attached to vRID, just with different format and
> > addr width (Intel is 64bit, ARM is 84bit where PASID can be
> > considered a sub-handle in the 84bit address space and not
> > the kernel's business).
> 
> I think the difference is intention.
> 
> In one case the kernel is saying 'attach a RID and I intend to use
> PASID' in which case the kernel user can call the PASID APIs.
> 
> The second case is saying 'I will not use PASID'.
> 
> They are different things and I think it is a surprising API if the
> kernel user attaches a domain, intends to use PASID and then finds out
> it can't, eg because an ARM user page table was hooked up.
> 
> If you imagine the flag as 'I intend to use PASID' I think it makes a
> fair amount of sense from an API design too.
> 
> We could probably do without it, at least for VFIO and qemu cases, but
> it seems a little bit peculiar to me.
> 

OK, taking the kernel user into account makes the flag more sensible.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 116+ messages in thread


2021-12-09 15:40                   ` Jason Gunthorpe
2021-12-09 15:40                     ` Jason Gunthorpe
2021-12-09 15:40                     ` Jason Gunthorpe via iommu
2021-12-09 16:37                     ` Eric Auger
2021-12-09 16:37                       ` Eric Auger
2021-12-09 16:37                       ` Eric Auger
2021-12-09  3:21             ` Tian, Kevin
2021-12-09  3:21               ` Tian, Kevin
2021-12-09  3:21               ` Tian, Kevin
2021-12-09  9:44               ` Eric Auger
2021-12-09  9:44                 ` Eric Auger
2021-12-09  9:44                 ` Eric Auger
2021-12-09  8:31             ` Eric Auger
2021-12-09  8:31               ` Eric Auger
2021-12-09  8:31               ` Eric Auger
2021-10-27 10:44 ` [RFC v16 2/9] iommu: Introduce iommu_get_nesting Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 22:15   ` kernel test robot
2021-10-28  3:22   ` kernel test robot
2021-10-27 10:44 ` [RFC v16 3/9] iommu/smmuv3: Allow s1 and s2 configs to coexist Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44 ` [RFC v16 4/9] iommu/smmuv3: Get prepared for nested stage support Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44 ` [RFC v16 5/9] iommu/smmuv3: Implement attach/detach_pasid_table Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44 ` [RFC v16 6/9] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44 ` [RFC v16 7/9] iommu/smmuv3: Implement cache_invalidate Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44 ` [RFC v16 8/9] iommu/smmuv3: report additional recoverable faults Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 21:05   ` kernel test robot
2021-10-27 22:41   ` kernel test robot
2021-10-27 22:41     ` kernel test robot
2021-10-27 10:44 ` [RFC v16 9/9] iommu/smmuv3: Disallow nested mode in presence of HW MSI regions Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-10-27 10:44   ` Eric Auger
2021-12-03 12:27 ` [RFC v16 0/9] SMMUv3 Nested Stage Setup (IOMMU part) Zhangfei Gao
2021-12-03 12:27   ` Zhangfei Gao
2021-12-03 12:27   ` Zhangfei Gao
2021-12-07 10:27   ` Eric Auger
2021-12-07 10:27     ` Eric Auger
2021-12-07 10:27     ` Eric Auger
2021-12-07 10:35     ` Zhangfei Gao
2021-12-07 10:35       ` Zhangfei Gao
2021-12-07 10:35       ` Zhangfei Gao
2021-12-07 11:06       ` Eric Auger
2021-12-07 11:06         ` Eric Auger
2021-12-07 11:06         ` Eric Auger
2021-12-08 13:33         ` Shameerali Kolothum Thodi
2021-12-08 13:33           ` Shameerali Kolothum Thodi
2021-12-08 13:33           ` Shameerali Kolothum Thodi via iommu
2021-12-03 13:13 ` Sumit Gupta
2021-12-03 13:13   ` Sumit Gupta
2021-12-03 13:13   ` Sumit Gupta via iommu
2021-12-07 10:28   ` Eric Auger
2021-12-07 10:28     ` Eric Auger
2021-12-07 10:28     ` Eric Auger
