* [PATCH v9 00/10] iommu: I/O page faults for SMMUv3
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

Add stall support to the SMMUv3, along with a common I/O Page Fault
handler.

Changes since v8 [1]:
* Added patches 1 and 2, which aren't strictly related to IOPF but need
  to be applied in order: patch 8 depends on patch 2, which depends on
  patch 1. Patch 2 moves pasid-num-bits to a device property, following
  Robin's comment on v8.
* Patches 3-5 extract the IOPF feature from the SVA one, to support SVA
  implementations that handle I/O page faults through the device driver
  rather than the IOMMU driver [2]. A usage sketch follows this list.
* Use device properties for dma-can-stall instead of a special fwspec
  member.
* Dropped PRI support for now, since it doesn't seem to be available in
  hardware and adds some complexity.
* Unfortunately had to drop some Acked-by and Tested-by tags due to code
  changes.
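
For device drivers, the net effect of patches 3-5 is roughly the
sequence below (a sketch only, with error handling trimmed; the real
integration, like uacce's in patch 5, varies per driver):

	/* Enable IOPF first, then SVA; disable in the reverse order */
	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);
	if (ret)
		return ret;

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
	if (ret) {
		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF);
		return ret;
	}

	/* A process address space can then be bound to the device */
	handle = iommu_sva_bind_device(dev, current->mm, NULL);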

As usual, you can get the latest SVA patches from
http://jpbrucker.net/git/linux sva/current

[1] https://lore.kernel.org/linux-iommu/20201112125519.3987595-1-jean-philippe@linaro.org/
[2] https://lore.kernel.org/linux-iommu/BY5PR12MB3764F5D07E8EC48327E39C86B3C60@BY5PR12MB3764.namprd12.prod.outlook.com/

Jean-Philippe Brucker (10):
  iommu: Remove obsolete comment
  iommu/arm-smmu-v3: Use device properties for pasid-num-bits
  iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF
  uacce: Enable IOMMU_DEV_FEAT_IOPF
  iommu: Add a page fault handler
  iommu/arm-smmu-v3: Maintain a SID->device structure
  dt-bindings: document stall property for IOMMU masters
  ACPI/IORT: Enable stall support for platform devices
  iommu/arm-smmu-v3: Add stall support for platform devices

 drivers/iommu/Makefile                        |   1 +
 .../devicetree/bindings/iommu/iommu.txt       |  18 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  74 ++-
 drivers/iommu/iommu-sva-lib.h                 |  53 ++
 include/linux/iommu.h                         |  25 +-
 drivers/acpi/arm64/iort.c                     |  15 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  70 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 354 ++++++++++++--
 drivers/iommu/intel/iommu.c                   |  11 +-
 drivers/iommu/io-pgfault.c                    | 462 ++++++++++++++++++
 drivers/iommu/of_iommu.c                      |   5 -
 drivers/misc/uacce/uacce.c                    |  32 +-
 12 files changed, 1046 insertions(+), 74 deletions(-)
 create mode 100644 drivers/iommu/io-pgfault.c

-- 
2.29.2



* [PATCH v9 01/10] iommu: Remove obsolete comment
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

Commit 986d5ecc5699 ("iommu: Move fwspec->iommu_priv to struct
dev_iommu") removed iommu_priv from fwspec. Update the struct doc.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 include/linux/iommu.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b3f0e2018c62..26bcde5e7746 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -570,7 +570,6 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
  * struct iommu_fwspec - per-device IOMMU instance data
  * @ops: ops for this device's IOMMU
  * @iommu_fwnode: firmware handle for this device's IOMMU
- * @iommu_priv: IOMMU driver private data for this device
  * @num_pasid_bits: number of PASID bits supported by this device
  * @num_ids: number of associated device IDs
  * @ids: IDs which this device may present to the IOMMU
-- 
2.29.2



* [PATCH v9 02/10] iommu/arm-smmu-v3: Use device properties for pasid-num-bits
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

The pasid-num-bits property shouldn't need a dedicated fwspec field;
it's a job for device properties. Add the property for IORT, and access
the number of PASID bits using device_property_read_u32().

Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 include/linux/iommu.h                       |  2 --
 drivers/acpi/arm64/iort.c                   | 13 +++++++------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  3 ++-
 drivers/iommu/of_iommu.c                    |  5 -----
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 26bcde5e7746..583c734b2e87 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -570,7 +570,6 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
  * struct iommu_fwspec - per-device IOMMU instance data
  * @ops: ops for this device's IOMMU
  * @iommu_fwnode: firmware handle for this device's IOMMU
- * @num_pasid_bits: number of PASID bits supported by this device
  * @num_ids: number of associated device IDs
  * @ids: IDs which this device may present to the IOMMU
  */
@@ -578,7 +577,6 @@ struct iommu_fwspec {
 	const struct iommu_ops	*ops;
 	struct fwnode_handle	*iommu_fwnode;
 	u32			flags;
-	u32			num_pasid_bits;
 	unsigned int		num_ids;
 	u32			ids[];
 };
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index d4eac6d7e9fb..c9a8bbb74b09 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -968,15 +968,16 @@ static int iort_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
 static void iort_named_component_init(struct device *dev,
 				      struct acpi_iort_node *node)
 {
+	struct property_entry props[2] = {};
 	struct acpi_iort_named_component *nc;
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-
-	if (!fwspec)
-		return;
 
 	nc = (struct acpi_iort_named_component *)node->node_data;
-	fwspec->num_pasid_bits = FIELD_GET(ACPI_IORT_NC_PASID_BITS,
-					   nc->node_flags);
+	props[0] = PROPERTY_ENTRY_U32("pasid-num-bits",
+				      FIELD_GET(ACPI_IORT_NC_PASID_BITS,
+						nc->node_flags));
+
+	if (device_add_properties(dev, props))
+		dev_warn(dev, "Could not add device properties\n");
 }
 
 static int iort_nc_iommu_map(struct device *dev, struct acpi_iort_node *node)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8ca7415d785d..6a53b4edf054 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2366,7 +2366,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 		}
 	}
 
-	master->ssid_bits = min(smmu->ssid_bits, fwspec->num_pasid_bits);
+	device_property_read_u32(dev, "pasid-num-bits", &master->ssid_bits);
+	master->ssid_bits = min(smmu->ssid_bits, master->ssid_bits);
 
 	/*
 	 * Note that PASID must be enabled before, and disabled after ATS:
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e505b9130a1c..a9d2df001149 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -210,11 +210,6 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 					     of_pci_iommu_init, &info);
 	} else {
 		err = of_iommu_configure_device(master_np, dev, id);
-
-		fwspec = dev_iommu_fwspec_get(dev);
-		if (!err && fwspec)
-			of_property_read_u32(master_np, "pasid-num-bits",
-					     &fwspec->num_pasid_bits);
 	}
 
 	/*
-- 
2.29.2



* [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker, Arnd Bergmann, David Woodhouse,
	Greg Kroah-Hartman, Zhou Wang

Some devices manage I/O Page Faults (IOPF) themselves instead of
relying on PCIe PRI or Arm SMMU stall. Allow their drivers to enable
SVA without mandating IOMMU-managed IOPF. All other device drivers now
need to enable IOMMU_DEV_FEAT_IOPF before enabling IOMMU_DEV_FEAT_SVA.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
---
 include/linux/iommu.h | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 583c734b2e87..701b2eeb0dc5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -156,10 +156,24 @@ struct iommu_resv_region {
 	enum iommu_resv_type	type;
 };
 
-/* Per device IOMMU features */
+/**
+ * enum iommu_dev_features - Per device IOMMU features
+ * @IOMMU_DEV_FEAT_AUX: Auxiliary domain feature
+ * @IOMMU_DEV_FEAT_SVA: Shared Virtual Addresses
+ * @IOMMU_DEV_FEAT_IOPF: I/O Page Faults such as PRI or Stall. Generally using
+ *			 %IOMMU_DEV_FEAT_SVA requires %IOMMU_DEV_FEAT_IOPF, but
+ *			 some devices manage I/O Page Faults themselves instead
+ *			 of relying on the IOMMU. When supported, this feature
+ *			 must be enabled before and disabled after
+ *			 %IOMMU_DEV_FEAT_SVA.
+ *
+ * Device drivers query whether a feature is supported using
+ * iommu_dev_has_feature(), and enable it using iommu_dev_enable_feature().
+ */
 enum iommu_dev_features {
-	IOMMU_DEV_FEAT_AUX,	/* Aux-domain feature */
-	IOMMU_DEV_FEAT_SVA,	/* Shared Virtual Addresses */
+	IOMMU_DEV_FEAT_AUX,
+	IOMMU_DEV_FEAT_SVA,
+	IOMMU_DEV_FEAT_IOPF,
 };
 
 #define IOMMU_PASID_INVALID	(-1U)
-- 
2.29.2



* [PATCH v9 04/10] iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker, David Woodhouse

Allow drivers to query and enable IOMMU_DEV_FEAT_IOPF, which amounts to
checking whether PRI is enabled.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/intel/iommu.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c5b021..630639c753f9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5263,6 +5263,8 @@ static int siov_find_pci_dvsec(struct pci_dev *pdev)
 static bool
 intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
 {
+	struct device_domain_info *info = get_domain_info(dev);
+
 	if (feat == IOMMU_DEV_FEAT_AUX) {
 		int ret;
 
@@ -5277,13 +5279,13 @@ intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
 		return !!siov_find_pci_dvsec(to_pci_dev(dev));
 	}
 
-	if (feat == IOMMU_DEV_FEAT_SVA) {
-		struct device_domain_info *info = get_domain_info(dev);
+	if (feat == IOMMU_DEV_FEAT_IOPF)
+		return info && info->pri_supported;
 
+	if (feat == IOMMU_DEV_FEAT_SVA)
 		return info && (info->iommu->flags & VTD_FLAG_SVM_CAPABLE) &&
 			info->pasid_supported && info->pri_supported &&
 			info->ats_supported;
-	}
 
 	return false;
 }
@@ -5294,6 +5296,9 @@ intel_iommu_dev_enable_feat(struct device *dev, enum iommu_dev_features feat)
 	if (feat == IOMMU_DEV_FEAT_AUX)
 		return intel_iommu_enable_auxd(dev);
 
+	if (feat == IOMMU_DEV_FEAT_IOPF)
+		return intel_iommu_dev_has_feat(dev, feat) ? 0 : -ENODEV;
+
 	if (feat == IOMMU_DEV_FEAT_SVA) {
 		struct device_domain_info *info = get_domain_info(dev);
 
-- 
2.29.2



* [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker, Arnd Bergmann, Greg Kroah-Hartman,
	Zhou Wang

The IOPF (I/O Page Fault) feature is now enabled independently from the
SVA feature, because some IOPF implementations are device-specific and
do not require IOMMU support for PCIe PRI or Arm SMMU stall.

Enable IOPF unconditionally when enabling SVA for now. In the future, if
a device driver implementing a uacce interface doesn't need IOPF
support, it will need to tell the uacce module, for example with a new
flag.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
---
 drivers/misc/uacce/uacce.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index d07af4edfcac..41ef1eb62a14 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -385,6 +385,24 @@ static void uacce_release(struct device *dev)
 	kfree(uacce);
 }
 
+static unsigned int uacce_enable_sva(struct device *parent, unsigned int flags)
+{
+	if (!(flags & UACCE_DEV_SVA))
+		return flags;
+
+	flags &= ~UACCE_DEV_SVA;
+
+	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
+		return flags;
+
+	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
+		iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
+		return flags;
+	}
+
+	return flags | UACCE_DEV_SVA;
+}
+
 /**
  * uacce_alloc() - alloc an accelerator
  * @parent: pointer of uacce parent device
@@ -404,11 +422,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
 	if (!uacce)
 		return ERR_PTR(-ENOMEM);
 
-	if (flags & UACCE_DEV_SVA) {
-		ret = iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA);
-		if (ret)
-			flags &= ~UACCE_DEV_SVA;
-	}
+	flags = uacce_enable_sva(parent, flags);
 
 	uacce->parent = parent;
 	uacce->flags = flags;
@@ -432,8 +446,10 @@ struct uacce_device *uacce_alloc(struct device *parent,
 	return uacce;
 
 err_with_uacce:
-	if (flags & UACCE_DEV_SVA)
+	if (flags & UACCE_DEV_SVA) {
 		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
+		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
+	}
 	kfree(uacce);
 	return ERR_PTR(ret);
 }
@@ -487,8 +503,10 @@ void uacce_remove(struct uacce_device *uacce)
 	mutex_unlock(&uacce->queues_lock);
 
 	/* disable sva now since no opened queues */
-	if (uacce->flags & UACCE_DEV_SVA)
+	if (uacce->flags & UACCE_DEV_SVA) {
 		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
+		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
+	}
 
 	if (uacce->cdev)
 		cdev_device_del(uacce->cdev, &uacce->dev);
-- 
2.29.2



* [PATCH v9 06/10] iommu: Add a page fault handler
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

Some systems allow devices to have their I/O page faults handled by the
core mm, for example systems implementing the PCIe PRI extension or the
Arm SMMU stall model. Infrastructure for reporting these recoverable
page faults was added to the IOMMU core by commit 0c830e6b3282 ("iommu:
Introduce device fault report API"). Add a page fault handler for host
SVA.

IOMMU drivers can now instantiate several fault workqueues and link
them to IOPF-capable devices. Drivers can choose between a single
global workqueue, one per IOMMU device, one per low-level fault queue,
one per domain, etc.

When it receives a fault event, typically in an IRQ handler, the IOMMU
driver reports the fault using iommu_report_device_fault(), which calls
the registered handler. The page fault handler then calls the mm fault
handler, and reports either success or failure with
iommu_page_response(). When the handler succeeds, the IOMMU retries the
access.

The iopf_param pointer could be embedded into iommu_fault_param. But
putting iopf_param directly into the dev_iommu structure allows us not
to care about ordering between calls to iopf_queue_add_device() and
iommu_register_device_fault_handler().
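
A condensed sketch of how an IOMMU driver is expected to use this
(error handling omitted; iommu_dev, dev and evt are placeholder
variables, and the exact call sites are each driver's own choice):

	/* Probe: allocate a fault workqueue, e.g. one per IOMMU device */
	struct iopf_queue *queue = iopf_queue_alloc(dev_name(iommu_dev));

	/*
	 * Before enabling IOPF for a device: link it to the queue and
	 * route its fault events into the handler added by this patch.
	 */
	iopf_queue_add_device(queue, dev);
	iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);

	/*
	 * Event IRQ handler: report each fault to the IOMMU core, which
	 * invokes the registered handler (here iommu_queue_iopf).
	 */
	iommu_report_device_fault(dev, &evt);

	/* Teardown: wait for pending faults, then unlink the device */
	iopf_queue_flush_dev(dev);
	iommu_unregister_device_fault_handler(dev);
	iopf_queue_remove_device(queue, dev);
	iopf_queue_free(queue);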

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/Makefile        |   1 +
 drivers/iommu/iommu-sva-lib.h |  53 ++++
 include/linux/iommu.h         |   2 +
 drivers/iommu/io-pgfault.c    | 462 ++++++++++++++++++++++++++++++++++
 4 files changed, 518 insertions(+)
 create mode 100644 drivers/iommu/io-pgfault.c

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 61bd30cd8369..60fafc23dee6 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -28,3 +28,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
 obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
+obj-$(CONFIG_IOMMU_SVA_LIB) += io-pgfault.o
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
index b40990aef3fd..031155010ca8 100644
--- a/drivers/iommu/iommu-sva-lib.h
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -12,4 +12,57 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
 void iommu_sva_free_pasid(struct mm_struct *mm);
 struct mm_struct *iommu_sva_find(ioasid_t pasid);
 
+/* I/O Page fault */
+struct device;
+struct iommu_fault;
+struct iopf_queue;
+
+#ifdef CONFIG_IOMMU_SVA_LIB
+int iommu_queue_iopf(struct iommu_fault *fault, void *cookie);
+
+int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
+int iopf_queue_remove_device(struct iopf_queue *queue,
+			     struct device *dev);
+int iopf_queue_flush_dev(struct device *dev);
+struct iopf_queue *iopf_queue_alloc(const char *name);
+void iopf_queue_free(struct iopf_queue *queue);
+int iopf_queue_discard_partial(struct iopf_queue *queue);
+
+#else /* CONFIG_IOMMU_SVA_LIB */
+static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
+{
+	return -ENODEV;
+}
+
+static inline int iopf_queue_add_device(struct iopf_queue *queue,
+					struct device *dev)
+{
+	return -ENODEV;
+}
+
+static inline int iopf_queue_remove_device(struct iopf_queue *queue,
+					   struct device *dev)
+{
+	return -ENODEV;
+}
+
+static inline int iopf_queue_flush_dev(struct device *dev)
+{
+	return -ENODEV;
+}
+
+static inline struct iopf_queue *iopf_queue_alloc(const char *name)
+{
+	return NULL;
+}
+
+static inline void iopf_queue_free(struct iopf_queue *queue)
+{
+}
+
+static inline int iopf_queue_discard_partial(struct iopf_queue *queue)
+{
+	return -ENODEV;
+}
+#endif /* CONFIG_IOMMU_SVA_LIB */
 #endif /* _IOMMU_SVA_LIB_H */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 701b2eeb0dc5..1c721f472d25 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -366,6 +366,7 @@ struct iommu_fault_param {
  * struct dev_iommu - Collection of per-device IOMMU data
  *
  * @fault_param: IOMMU detected device fault reporting data
+ * @iopf_param:	 I/O Page Fault queue and data
  * @fwspec:	 IOMMU fwspec data
  * @iommu_dev:	 IOMMU device this device is linked to
  * @priv:	 IOMMU Driver private data
@@ -376,6 +377,7 @@ struct iommu_fault_param {
 struct dev_iommu {
 	struct mutex lock;
 	struct iommu_fault_param	*fault_param;
+	struct iopf_device_param	*iopf_param;
 	struct iommu_fwspec		*fwspec;
 	struct iommu_device		*iommu_dev;
 	void				*priv;
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
new file mode 100644
index 000000000000..fc1d5d29ac37
--- /dev/null
+++ b/drivers/iommu/io-pgfault.c
@@ -0,0 +1,462 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Handle device page faults
+ *
+ * Copyright (C) 2020 ARM Ltd.
+ */
+
+#include <linux/iommu.h>
+#include <linux/list.h>
+#include <linux/sched/mm.h>
+#include <linux/slab.h>
+#include <linux/workqueue.h>
+
+#include "iommu-sva-lib.h"
+
+/**
+ * struct iopf_queue - IO Page Fault queue
+ * @wq: the fault workqueue
+ * @devices: devices attached to this queue
+ * @lock: protects the device list
+ */
+struct iopf_queue {
+	struct workqueue_struct		*wq;
+	struct list_head		devices;
+	struct mutex			lock;
+};
+
+/**
+ * struct iopf_device_param - IO Page Fault data attached to a device
+ * @dev: the device that owns this param
+ * @queue: IOPF queue
+ * @queue_list: index into queue->devices
+ * @partial: faults that are part of a Page Request Group for which the last
+ *           request hasn't been submitted yet.
+ */
+struct iopf_device_param {
+	struct device			*dev;
+	struct iopf_queue		*queue;
+	struct list_head		queue_list;
+	struct list_head		partial;
+};
+
+struct iopf_fault {
+	struct iommu_fault		fault;
+	struct list_head		list;
+};
+
+struct iopf_group {
+	struct iopf_fault		last_fault;
+	struct list_head		faults;
+	struct work_struct		work;
+	struct device			*dev;
+};
+
+static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf,
+			       enum iommu_page_response_code status)
+{
+	struct iommu_page_response resp = {
+		.version		= IOMMU_PAGE_RESP_VERSION_1,
+		.pasid			= iopf->fault.prm.pasid,
+		.grpid			= iopf->fault.prm.grpid,
+		.code			= status,
+	};
+
+	if ((iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) &&
+	    (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID))
+		resp.flags = IOMMU_PAGE_RESP_PASID_VALID;
+
+	return iommu_page_response(dev, &resp);
+}
+
+static enum iommu_page_response_code
+iopf_handle_single(struct iopf_fault *iopf)
+{
+	vm_fault_t ret;
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	unsigned int access_flags = 0;
+	unsigned int fault_flags = FAULT_FLAG_REMOTE;
+	struct iommu_fault_page_request *prm = &iopf->fault.prm;
+	enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID;
+
+	if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID))
+		return status;
+
+	mm = iommu_sva_find(prm->pasid);
+	if (IS_ERR_OR_NULL(mm))
+		return status;
+
+	mmap_read_lock(mm);
+
+	vma = find_extend_vma(mm, prm->addr);
+	if (!vma)
+		/* Unmapped area */
+		goto out_put_mm;
+
+	if (prm->perm & IOMMU_FAULT_PERM_READ)
+		access_flags |= VM_READ;
+
+	if (prm->perm & IOMMU_FAULT_PERM_WRITE) {
+		access_flags |= VM_WRITE;
+		fault_flags |= FAULT_FLAG_WRITE;
+	}
+
+	if (prm->perm & IOMMU_FAULT_PERM_EXEC) {
+		access_flags |= VM_EXEC;
+		fault_flags |= FAULT_FLAG_INSTRUCTION;
+	}
+
+	if (!(prm->perm & IOMMU_FAULT_PERM_PRIV))
+		fault_flags |= FAULT_FLAG_USER;
+
+	if (access_flags & ~vma->vm_flags)
+		/* Access fault */
+		goto out_put_mm;
+
+	ret = handle_mm_fault(vma, prm->addr, fault_flags, NULL);
+	status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID :
+		IOMMU_PAGE_RESP_SUCCESS;
+
+out_put_mm:
+	mmap_read_unlock(mm);
+	mmput(mm);
+
+	return status;
+}
+
+static void iopf_handle_group(struct work_struct *work)
+{
+	struct iopf_group *group;
+	struct iopf_fault *iopf, *next;
+	enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS;
+
+	group = container_of(work, struct iopf_group, work);
+
+	list_for_each_entry_safe(iopf, next, &group->faults, list) {
+		/*
+		 * For the moment, errors are sticky: don't handle subsequent
+		 * faults in the group if there is an error.
+		 */
+		if (status == IOMMU_PAGE_RESP_SUCCESS)
+			status = iopf_handle_single(iopf);
+
+		if (!(iopf->fault.prm.flags &
+		      IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE))
+			kfree(iopf);
+	}
+
+	iopf_complete_group(group->dev, &group->last_fault, status);
+	kfree(group);
+}
+
+/**
+ * iommu_queue_iopf - IO Page Fault handler
+ * @fault: fault event
+ * @cookie: struct device, passed to iommu_register_device_fault_handler.
+ *
+ * Add a fault to the device workqueue, to be handled by mm.
+ *
+ * This module doesn't handle PCI PASID Stop Markers; IOMMU drivers must
+ * discard them before reporting faults. A PASID Stop Marker (LRW = 0b100)
+ * doesn't expect a response. Some PCI devices may generate one when disabling
+ * a PASID (issuing a PASID stop request).
+ *
+ * The PASID stop request is issued by the device driver before unbind(). Once
+ * it completes, no page request is generated for this PASID anymore and
+ * outstanding ones have been pushed to the IOMMU (as per PCIe 4.0r1.0 - 6.20.1
+ * and 10.4.1.2 - Managing PASID TLP Prefix Usage). Some PCI devices will wait
+ * for all outstanding page requests to come back with a response before
+ * completing the PASID stop request. Others do not wait for page responses, and
+ * instead issue this Stop Marker that tells us when the PASID can be
+ * reallocated.
+ *
+ * It is safe to discard the Stop Marker because it is an optimization.
+ * a. Page requests, which are posted requests, have been flushed to the IOMMU
+ *    when the stop request completes.
+ * b. The IOMMU driver flushes all fault queues on unbind() before freeing the
+ *    PASID.
+ *
+ * So even though the Stop Marker might be issued by the device *after* the stop
+ * request completes, outstanding faults will have been dealt with by the time
+ * the PASID is freed.
+ *
+ * Return: 0 on success and <0 on error.
+ */
+int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
+{
+	int ret;
+	struct iopf_group *group;
+	struct iopf_fault *iopf, *next;
+	struct iopf_device_param *iopf_param;
+
+	struct device *dev = cookie;
+	struct dev_iommu *param = dev->iommu;
+
+	lockdep_assert_held(&param->lock);
+
+	if (fault->type != IOMMU_FAULT_PAGE_REQ)
+		/* Not a recoverable page fault */
+		return -EOPNOTSUPP;
+
+	/*
+	 * As long as we're holding param->lock, the queue can't be unlinked
+	 * from the device and therefore cannot disappear.
+	 */
+	iopf_param = param->iopf_param;
+	if (!iopf_param)
+		return -ENODEV;
+
+	if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
+		iopf = kzalloc(sizeof(*iopf), GFP_KERNEL);
+		if (!iopf)
+			return -ENOMEM;
+
+		iopf->fault = *fault;
+
+		/* Non-last request of a group. Postpone until the last one */
+		list_add(&iopf->list, &iopf_param->partial);
+
+		return 0;
+	}
+
+	group = kzalloc(sizeof(*group), GFP_KERNEL);
+	if (!group) {
+		/*
+		 * The caller will send a response to the hardware. But we do
+		 * need to clean up before leaving, otherwise partial faults
+		 * will be stuck.
+		 */
+		ret = -ENOMEM;
+		goto cleanup_partial;
+	}
+
+	group->dev = dev;
+	group->last_fault.fault = *fault;
+	INIT_LIST_HEAD(&group->faults);
+	list_add(&group->last_fault.list, &group->faults);
+	INIT_WORK(&group->work, iopf_handle_group);
+
+	/* See if we have partial faults for this group */
+	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
+		if (iopf->fault.prm.grpid == fault->prm.grpid)
+			/* Insert *before* the last fault */
+			list_move(&iopf->list, &group->faults);
+	}
+
+	queue_work(iopf_param->queue->wq, &group->work);
+	return 0;
+
+cleanup_partial:
+	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) {
+		if (iopf->fault.prm.grpid == fault->prm.grpid) {
+			list_del(&iopf->list);
+			kfree(iopf);
+		}
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_queue_iopf);
+
+/**
+ * iopf_queue_flush_dev - Ensure that all queued faults have been processed
+ * @dev: the endpoint whose faults need to be flushed.
+ *
+ * The IOMMU driver calls this before releasing a PASID, to ensure that all
+ * pending faults for this PASID have been handled, and won't hit the address
+ * space of the next process that uses this PASID. The driver must make sure
+ * that no new fault is added to the queue. In particular it must flush its
+ * low-level queue before calling this function.
+ *
+ * Return: 0 on success and <0 on error.
+ */
+int iopf_queue_flush_dev(struct device *dev)
+{
+	int ret = 0;
+	struct iopf_device_param *iopf_param;
+	struct dev_iommu *param = dev->iommu;
+
+	if (!param)
+		return -ENODEV;
+
+	mutex_lock(&param->lock);
+	iopf_param = param->iopf_param;
+	if (iopf_param)
+		flush_workqueue(iopf_param->queue->wq);
+	else
+		ret = -ENODEV;
+	mutex_unlock(&param->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iopf_queue_flush_dev);
+
+/**
+ * iopf_queue_discard_partial - Remove all pending partial faults
+ * @queue: the queue whose partial faults need to be discarded
+ *
+ * When the hardware queue overflows, the last page fault of a group may have
+ * been lost and the IOMMU driver calls this to discard all partial faults.
+ * The driver shouldn't be adding new faults to this queue concurrently.
+ *
+ * Return: 0 on success and <0 on error.
+ */
+int iopf_queue_discard_partial(struct iopf_queue *queue)
+{
+	struct iopf_fault *iopf, *next;
+	struct iopf_device_param *iopf_param;
+
+	if (!queue)
+		return -EINVAL;
+
+	mutex_lock(&queue->lock);
+	list_for_each_entry(iopf_param, &queue->devices, queue_list) {
+		list_for_each_entry_safe(iopf, next, &iopf_param->partial,
+					 list) {
+			list_del(&iopf->list);
+			kfree(iopf);
+		}
+	}
+	mutex_unlock(&queue->lock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(iopf_queue_discard_partial);
+
+/**
+ * iopf_queue_add_device - Add producer to the fault queue
+ * @queue: IOPF queue
+ * @dev: device to add
+ *
+ * Return: 0 on success and <0 on error.
+ */
+int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
+{
+	int ret = -EBUSY;
+	struct iopf_device_param *iopf_param;
+	struct dev_iommu *param = dev->iommu;
+
+	if (!param)
+		return -ENODEV;
+
+	iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL);
+	if (!iopf_param)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&iopf_param->partial);
+	iopf_param->queue = queue;
+	iopf_param->dev = dev;
+
+	mutex_lock(&queue->lock);
+	mutex_lock(&param->lock);
+	if (!param->iopf_param) {
+		list_add(&iopf_param->queue_list, &queue->devices);
+		param->iopf_param = iopf_param;
+		ret = 0;
+	}
+	mutex_unlock(&param->lock);
+	mutex_unlock(&queue->lock);
+
+	if (ret)
+		kfree(iopf_param);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iopf_queue_add_device);
+
+/**
+ * iopf_queue_remove_device - Remove producer from fault queue
+ * @queue: IOPF queue
+ * @dev: device to remove
+ *
+ * Caller makes sure that no more faults are reported for this device.
+ *
+ * Return: 0 on success and <0 on error.
+ */
+int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
+{
+	int ret = 0;
+	struct iopf_fault *iopf, *next;
+	struct iopf_device_param *iopf_param;
+	struct dev_iommu *param = dev->iommu;
+
+	if (!param || !queue)
+		return -EINVAL;
+
+	mutex_lock(&queue->lock);
+	mutex_lock(&param->lock);
+	iopf_param = param->iopf_param;
+	if (iopf_param && iopf_param->queue == queue) {
+		list_del(&iopf_param->queue_list);
+		param->iopf_param = NULL;
+	} else {
+		ret = -EINVAL;
+	}
+	mutex_unlock(&param->lock);
+	mutex_unlock(&queue->lock);
+	if (ret)
+		return ret;
+
+	/* Just in case some faults are still stuck */
+	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list)
+		kfree(iopf);
+
+	kfree(iopf_param);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(iopf_queue_remove_device);
+
+/**
+ * iopf_queue_alloc - Allocate and initialize a fault queue
+ * @name: a unique string identifying the queue (for workqueue)
+ *
+ * Return: the queue on success and NULL on error.
+ */
+struct iopf_queue *iopf_queue_alloc(const char *name)
+{
+	struct iopf_queue *queue;
+
+	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
+	if (!queue)
+		return NULL;
+
+	/*
+	 * The WQ is unordered because the low-level handler enqueues faults by
+	 * group. PRI requests within a group have to be ordered, but once
+	 * that's dealt with, the high-level function can handle groups out of
+	 * order.
+	 */
+	queue->wq = alloc_workqueue("iopf_queue/%s", WQ_UNBOUND, 0, name);
+	if (!queue->wq) {
+		kfree(queue);
+		return NULL;
+	}
+
+	INIT_LIST_HEAD(&queue->devices);
+	mutex_init(&queue->lock);
+
+	return queue;
+}
+EXPORT_SYMBOL_GPL(iopf_queue_alloc);
+
+/**
+ * iopf_queue_free - Free IOPF queue
+ * @queue: queue to free
+ *
+ * Counterpart to iopf_queue_alloc(). The driver must not be queuing faults or
+ * adding/removing devices on this queue anymore.
+ */
+void iopf_queue_free(struct iopf_queue *queue)
+{
+	struct iopf_device_param *iopf_param, *next;
+
+	if (!queue)
+		return;
+
+	list_for_each_entry_safe(iopf_param, next, &queue->devices, queue_list)
+		iopf_queue_remove_device(queue, iopf_param->dev);
+
+	destroy_workqueue(queue->wq);
+	kfree(queue);
+}
+EXPORT_SYMBOL_GPL(iopf_queue_free);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 105+ messages in thread
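
To complement the kerneldoc of iopf_queue_flush_dev() above, a minimal
sketch of the teardown ordering, reusing the hypothetical my_* names
from the earlier example:

static void my_disable_iopf(struct device *dev)
{
	/*
	 * The low-level fault queue must already be quiesced so that no
	 * new faults are reported for this device.
	 */
	iopf_queue_flush_dev(dev);

	iommu_unregister_device_fault_handler(dev);
	iopf_queue_remove_device(my_queue, dev);
}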

* [PATCH v9 07/10] iommu/arm-smmu-v3: Maintain a SID->device structure
  2021-01-08 14:52 ` Jean-Philippe Brucker
  (?)
@ 2021-01-08 14:52   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC (permalink / raw)
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

When handling faults from the event or PRI queue, we need to find the
struct device associated with a SID. Add an rb-tree to keep track of
SIDs.
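
As a sketch of the intended use (the actual consumer arrives with the
stall support patch later in this series), an event handler would
resolve the faulting SID to its master before reporting the fault. The
decoding below is simplified and hypothetical:

static void my_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
{
	struct arm_smmu_master *master;
	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);

	master = arm_smmu_find_master(smmu, sid);
	if (!master)
		return; /* No endpoint registered for this SID */

	/* Report the fault against master->dev */
}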

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  13 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 161 ++++++++++++++++----
 2 files changed, 144 insertions(+), 30 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 96c2e9565e00..8ef6a1c48635 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -636,6 +636,15 @@ struct arm_smmu_device {
 
 	/* IOMMU core code handle */
 	struct iommu_device		iommu;
+
+	struct rb_root			streams;
+	struct mutex			streams_mutex;
+};
+
+struct arm_smmu_stream {
+	u32				id;
+	struct arm_smmu_master		*master;
+	struct rb_node			node;
 };
 
 /* SMMU private data for each master */
@@ -644,8 +653,8 @@ struct arm_smmu_master {
 	struct device			*dev;
 	struct arm_smmu_domain		*domain;
 	struct list_head		domain_head;
-	u32				*sids;
-	unsigned int			num_sids;
+	struct arm_smmu_stream		*streams;
+	unsigned int			num_streams;
 	bool				ats_enabled;
 	bool				sva_enabled;
 	struct list_head		bonds;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6a53b4edf054..2dbae2e6965d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -912,8 +912,8 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain,
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_for_each_entry(master, &smmu_domain->devices, domain_head) {
-		for (i = 0; i < master->num_sids; i++) {
-			cmd.cfgi.sid = master->sids[i];
+		for (i = 0; i < master->num_streams; i++) {
+			cmd.cfgi.sid = master->streams[i].id;
 			arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd);
 		}
 	}
@@ -1355,6 +1355,32 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 	return 0;
 }
 
+__maybe_unused
+static struct arm_smmu_master *
+arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
+{
+	struct rb_node *node;
+	struct arm_smmu_stream *stream;
+	struct arm_smmu_master *master = NULL;
+
+	mutex_lock(&smmu->streams_mutex);
+	node = smmu->streams.rb_node;
+	while (node) {
+		stream = rb_entry(node, struct arm_smmu_stream, node);
+		if (stream->id < sid) {
+			node = node->rb_right;
+		} else if (stream->id > sid) {
+			node = node->rb_left;
+		} else {
+			master = stream->master;
+			break;
+		}
+	}
+	mutex_unlock(&smmu->streams_mutex);
+
+	return master;
+}
+
 /* IRQ and event handlers */
 static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 {
@@ -1588,8 +1614,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
 
 	arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd);
 
-	for (i = 0; i < master->num_sids; i++) {
-		cmd.atc.sid = master->sids[i];
+	for (i = 0; i < master->num_streams; i++) {
+		cmd.atc.sid = master->streams[i].id;
 		arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
 	}
 
@@ -1632,8 +1658,8 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
 		if (!master->ats_enabled)
 			continue;
 
-		for (i = 0; i < master->num_sids; i++) {
-			cmd.atc.sid = master->sids[i];
+		for (i = 0; i < master->num_streams; i++) {
+			cmd.atc.sid = master->streams[i].id;
 			arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd);
 		}
 	}
@@ -2040,13 +2066,13 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 	int i, j;
 	struct arm_smmu_device *smmu = master->smmu;
 
-	for (i = 0; i < master->num_sids; ++i) {
-		u32 sid = master->sids[i];
+	for (i = 0; i < master->num_streams; ++i) {
+		u32 sid = master->streams[i].id;
 		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
 		/* Bridged PCI devices may end up with duplicated IDs */
 		for (j = 0; j < i; j++)
-			if (master->sids[j] == sid)
+			if (master->streams[j].id == sid)
 				break;
 		if (j < i)
 			continue;
@@ -2319,11 +2345,101 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 	return sid < limit;
 }
 
+static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
+				  struct arm_smmu_master *master)
+{
+	int i;
+	int ret = 0;
+	struct arm_smmu_stream *new_stream, *cur_stream;
+	struct rb_node **new_node, *parent_node = NULL;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
+
+	master->streams = kcalloc(fwspec->num_ids,
+				  sizeof(struct arm_smmu_stream), GFP_KERNEL);
+	if (!master->streams)
+		return -ENOMEM;
+	master->num_streams = fwspec->num_ids;
+
+	mutex_lock(&smmu->streams_mutex);
+	for (i = 0; i < fwspec->num_ids && !ret; i++) {
+		u32 sid = fwspec->ids[i];
+
+		new_stream = &master->streams[i];
+		new_stream->id = sid;
+		new_stream->master = master;
+
+		/*
+		 * Check the SIDs are in range of the SMMU and our stream table
+		 */
+		if (!arm_smmu_sid_in_range(smmu, sid)) {
+			ret = -ERANGE;
+			break;
+		}
+
+		/* Ensure l2 strtab is initialised */
+		if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
+			ret = arm_smmu_init_l2_strtab(smmu, sid);
+			if (ret)
+				break;
+		}
+
+		/* Insert into SID tree */
+		new_node = &(smmu->streams.rb_node);
+		while (*new_node) {
+			cur_stream = rb_entry(*new_node, struct arm_smmu_stream,
+					      node);
+			parent_node = *new_node;
+			if (cur_stream->id > new_stream->id) {
+				new_node = &((*new_node)->rb_left);
+			} else if (cur_stream->id < new_stream->id) {
+				new_node = &((*new_node)->rb_right);
+			} else {
+				dev_warn(master->dev,
+					 "stream %u already in tree\n",
+					 cur_stream->id);
+				ret = -EINVAL;
+				break;
+			}
+		}
+
+		if (ret)
+			break;
+		rb_link_node(&new_stream->node, parent_node, new_node);
+		rb_insert_color(&new_stream->node, &smmu->streams);
+	}
+
+	if (ret) {
+		/* Unwind only the streams inserted so far */
+		while (i--)
+			rb_erase(&master->streams[i].node, &smmu->streams);
+		kfree(master->streams);
+	}
+	mutex_unlock(&smmu->streams_mutex);
+
+	return ret;
+}
+
+static void arm_smmu_remove_master(struct arm_smmu_master *master)
+{
+	int i;
+	struct arm_smmu_device *smmu = master->smmu;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
+
+	if (!smmu || !master->streams)
+		return;
+
+	mutex_lock(&smmu->streams_mutex);
+	for (i = 0; i < fwspec->num_ids; i++)
+		rb_erase(&master->streams[i].node, &smmu->streams);
+	mutex_unlock(&smmu->streams_mutex);
+
+	kfree(master->streams);
+}
+
 static struct iommu_ops arm_smmu_ops;
 
 static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 {
-	int i, ret;
+	int ret;
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_master *master;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
@@ -2344,27 +2460,12 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 
 	master->dev = dev;
 	master->smmu = smmu;
-	master->sids = fwspec->ids;
-	master->num_sids = fwspec->num_ids;
 	INIT_LIST_HEAD(&master->bonds);
 	dev_iommu_priv_set(dev, master);
 
-	/* Check the SIDs are in range of the SMMU and our stream table */
-	for (i = 0; i < master->num_sids; i++) {
-		u32 sid = master->sids[i];
-
-		if (!arm_smmu_sid_in_range(smmu, sid)) {
-			ret = -ERANGE;
-			goto err_free_master;
-		}
-
-		/* Ensure l2 strtab is initialised */
-		if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
-			ret = arm_smmu_init_l2_strtab(smmu, sid);
-			if (ret)
-				goto err_free_master;
-		}
-	}
+	ret = arm_smmu_insert_master(smmu, master);
+	if (ret)
+		goto err_free_master;
 
 	device_property_read_u32(dev, "pasid-num-bits", &master->ssid_bits);
 	master->ssid_bits = min(smmu->ssid_bits, master->ssid_bits);
@@ -2403,6 +2504,7 @@ static void arm_smmu_release_device(struct device *dev)
 	WARN_ON(arm_smmu_master_sva_enabled(master));
 	arm_smmu_detach_dev(master);
 	arm_smmu_disable_pasid(master);
+	arm_smmu_remove_master(master);
 	kfree(master);
 	iommu_fwspec_free(dev);
 }
@@ -2825,6 +2927,9 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
 {
 	int ret;
 
+	mutex_init(&smmu->streams_mutex);
+	smmu->streams = RB_ROOT;
+
 	ret = arm_smmu_init_queues(smmu);
 	if (ret)
 		return ret;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 105+ messages in thread
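
The SID tree added above uses the raw Linux rb-tree API: the caller does the
comparison walk itself, links the new node where the walk fell off the tree,
and lets the core rebalance. Distilled from the insertion loop above, with
error handling dropped and illustrative names (struct stream, stream_insert):

	#include <linux/rbtree.h>
	#include <linux/types.h>

	struct stream {
		u32		id;	/* search key */
		struct rb_node	node;
	};

	static void stream_insert(struct rb_root *root, struct stream *new_stream)
	{
		struct rb_node **link = &root->rb_node, *parent = NULL;

		while (*link) {
			struct stream *cur = rb_entry(*link, struct stream, node);

			parent = *link;
			if (cur->id > new_stream->id)
				link = &(*link)->rb_left;
			else
				link = &(*link)->rb_right;	/* duplicates go right here */
		}
		rb_link_node(&new_stream->node, parent, link);	/* link in as a leaf */
		rb_insert_color(&new_stream->node, root);	/* rebalance the tree */
	}

Lookup is the mirror image (arm_smmu_find_master() walks the same tree),
giving O(log n) SID-to-master resolution for each event taken off the event
queue by patch 10.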

* [PATCH v9 08/10] dt-bindings: document stall property for IOMMU masters
@ 2021-01-08 14:52   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC (permalink / raw)
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker, Rob Herring

On ARM systems, some platform devices behind an IOMMU may support stall,
which is the ability to recover from page faults. Let the firmware tell us
when a device supports stall.
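
For instance, a stall-capable master might be described like this (the node,
compatible string and values are made up for illustration; only
pasid-num-bits and dma-can-stall come from this series):

	accel@10000000 {
		compatible = "vendor,accelerator";
		reg = <0x10000000 0x1000>;
		iommus = <&smmu 0x42>;
		pasid-num-bits = <5>;
		dma-can-stall;
	};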

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 .../devicetree/bindings/iommu/iommu.txt        | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
index 3c36334e4f94..26ba9e530f13 100644
--- a/Documentation/devicetree/bindings/iommu/iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -92,6 +92,24 @@ Optional properties:
   tagging DMA transactions with an address space identifier. By default,
   this is 0, which means that the device only has one address space.
 
+- dma-can-stall: When present, the master can wait for a transaction to
+  complete for an indefinite amount of time. Upon a translation fault,
+  some IOMMUs, instead of aborting the translation immediately, may first
+  notify the driver and keep the transaction in flight. This allows the OS
+  to inspect the fault and, for example, make physical pages resident
+  before updating the mappings and completing the transaction. Such an
+  IOMMU accepts a limited number of simultaneous stalled transactions
+  before having to either put back-pressure on the master or abort new
+  faulting transactions.
+
+  Firmware has to opt in to stalling, because most buses and masters don't
+  support it. In particular, it isn't compatible with PCI, where
+  transactions have to complete within a time limit. More generally, it
+  won't work in systems and masters that haven't been designed for
+  stalling. For example, in order to handle a stalled transaction the OS
+  may attempt to retrieve pages from secondary storage in a stalled
+  domain, leading to a deadlock.
+
 
 Notes:
 ======
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH v9 09/10] ACPI/IORT: Enable stall support for platform devices
@ 2021-01-08 14:52   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC (permalink / raw)
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

Copy the "Stall supported" bit, that tells whether a named component
supports stall, into the dma-can-stall device property.
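
Using a device property keeps the DT and ACPI paths uniform: a consumer can
read the bit back through the generic device-property API whichever firmware
described the device, along the lines of this sketch:

	if (device_property_read_bool(dev, "dma-can-stall"))
		master->stall_enabled = true;

which is what the SMMUv3 driver does in patch 10.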

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
v9: dropped fwspec member in favor of device properties
---
 drivers/acpi/arm64/iort.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index c9a8bbb74b09..42820d7eb869 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -968,13 +968,15 @@ static int iort_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
 static void iort_named_component_init(struct device *dev,
 				      struct acpi_iort_node *node)
 {
-	struct property_entry props[2] = {};
+	struct property_entry props[3] = {};
 	struct acpi_iort_named_component *nc;
 
 	nc = (struct acpi_iort_named_component *)node->node_data;
 	props[0] = PROPERTY_ENTRY_U32("pasid-num-bits",
 				      FIELD_GET(ACPI_IORT_NC_PASID_BITS,
 						nc->node_flags));
+	if (nc->node_flags & ACPI_IORT_NC_STALL_SUPPORTED)
+		props[1] = PROPERTY_ENTRY_BOOL("dma-can-stall");
 
 	if (device_add_properties(dev, props))
 		dev_warn(dev, "Could not add device properties\n");
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 105+ messages in thread

* [PATCH v9 10/10] iommu/arm-smmu-v3: Add stall support for platform devices
@ 2021-01-08 14:52   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-08 14:52 UTC (permalink / raw)
  To: joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Jean-Philippe Brucker

The SMMU provides a Stall model for handling page faults in platform
devices. It is similar to PCIe PRI, but doesn't require devices to have
their own translation cache. Instead, faulting transactions are parked
and the OS is given a chance to fix the page tables and retry the
transaction.

Enable stall for devices that support it (opt-in by firmware). When an
event corresponds to a translation error, call the IOMMU fault handler.
If the fault is recoverable, it will call us back to terminate or
continue the stall.

To use stall, device drivers need to enable IOMMU_DEV_FEAT_IOPF, which
initializes the fault queue for the device.
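
From a device driver the opt-in is a two-step feature enable, roughly like
this (an error-handling sketch, not code from this series):

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);
	if (ret)
		return ret;

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
	if (ret) {
		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF);
		return ret;
	}

IOPF has to be enabled before SVA: arm_smmu_master_enable_sva() below
refuses the reverse order.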

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
v9: Add IOMMU_DEV_FEAT_IOPF
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  61 ++++++
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  70 ++++++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 192 ++++++++++++++++--
 3 files changed, 306 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 8ef6a1c48635..cb129870ef55 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -354,6 +354,13 @@
 #define CMDQ_PRI_1_GRPID		GENMASK_ULL(8, 0)
 #define CMDQ_PRI_1_RESP			GENMASK_ULL(13, 12)
 
+#define CMDQ_RESUME_0_SID		GENMASK_ULL(63, 32)
+#define CMDQ_RESUME_0_RESP_TERM		0UL
+#define CMDQ_RESUME_0_RESP_RETRY	1UL
+#define CMDQ_RESUME_0_RESP_ABORT	2UL
+#define CMDQ_RESUME_0_RESP		GENMASK_ULL(13, 12)
+#define CMDQ_RESUME_1_STAG		GENMASK_ULL(15, 0)
+
 #define CMDQ_SYNC_0_CS			GENMASK_ULL(13, 12)
 #define CMDQ_SYNC_0_CS_NONE		0
 #define CMDQ_SYNC_0_CS_IRQ		1
@@ -370,6 +377,25 @@
 
 #define EVTQ_0_ID			GENMASK_ULL(7, 0)
 
+#define EVT_ID_TRANSLATION_FAULT	0x10
+#define EVT_ID_ADDR_SIZE_FAULT		0x11
+#define EVT_ID_ACCESS_FAULT		0x12
+#define EVT_ID_PERMISSION_FAULT		0x13
+
+#define EVTQ_0_SSV			(1UL << 11)
+#define EVTQ_0_SSID			GENMASK_ULL(31, 12)
+#define EVTQ_0_SID			GENMASK_ULL(63, 32)
+#define EVTQ_1_STAG			GENMASK_ULL(15, 0)
+#define EVTQ_1_STALL			(1UL << 31)
+#define EVTQ_1_PRIV			(1UL << 33)
+#define EVTQ_1_EXEC			(1UL << 34)
+#define EVTQ_1_READ			(1UL << 35)
+#define EVTQ_1_S2			(1UL << 39)
+#define EVTQ_1_CLASS			GENMASK_ULL(41, 40)
+#define EVTQ_1_TT_READ			(1UL << 44)
+#define EVTQ_2_ADDR			GENMASK_ULL(63, 0)
+#define EVTQ_3_IPA			GENMASK_ULL(51, 12)
+
 /* PRI queue */
 #define PRIQ_ENT_SZ_SHIFT		4
 #define PRIQ_ENT_DWORDS			((1 << PRIQ_ENT_SZ_SHIFT) >> 3)
@@ -462,6 +488,13 @@ struct arm_smmu_cmdq_ent {
 			enum pri_resp		resp;
 		} pri;
 
+		#define CMDQ_OP_RESUME		0x44
+		struct {
+			u32			sid;
+			u16			stag;
+			u8			resp;
+		} resume;
+
 		#define CMDQ_OP_CMD_SYNC	0x46
 		struct {
 			u64			msiaddr;
@@ -520,6 +553,7 @@ struct arm_smmu_cmdq_batch {
 
 struct arm_smmu_evtq {
 	struct arm_smmu_queue		q;
+	struct iopf_queue		*iopf;
 	u32				max_stalls;
 };
 
@@ -656,7 +690,9 @@ struct arm_smmu_master {
 	struct arm_smmu_stream		*streams;
 	unsigned int			num_streams;
 	bool				ats_enabled;
+	bool				stall_enabled;
 	bool				sva_enabled;
+	bool				iopf_enabled;
 	struct list_head		bonds;
 	unsigned int			ssid_bits;
 };
@@ -675,6 +711,7 @@ struct arm_smmu_domain {
 
 	struct io_pgtable_ops		*pgtbl_ops;
 	bool				non_strict;
+	bool				stall_enabled;
 	atomic_t			nr_ats_masters;
 
 	enum arm_smmu_domain_stage	stage;
@@ -713,6 +750,10 @@ bool arm_smmu_master_sva_supported(struct arm_smmu_master *master);
 bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
+bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
+bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master);
+int arm_smmu_master_enable_iopf(struct arm_smmu_master *master);
+int arm_smmu_master_disable_iopf(struct arm_smmu_master *master);
 struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
 				    void *drvdata);
 void arm_smmu_sva_unbind(struct iommu_sva *handle);
@@ -744,6 +785,26 @@ static inline int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
 	return -ENODEV;
 }
 
+static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master)
+{
+	return false;
+}
+
+static inline bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master)
+{
+	return false;
+}
+
+static inline int arm_smmu_master_enable_iopf(struct arm_smmu_master *master)
+{
+	return -ENODEV;
+}
+
+static inline int arm_smmu_master_disable_iopf(struct arm_smmu_master *master)
+{
+	return -ENODEV;
+}
+
 static inline struct iommu_sva *
 arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
 {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index e13b092e6004..17acfee4f484 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -431,9 +431,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 	return true;
 }
 
-static bool arm_smmu_iopf_supported(struct arm_smmu_master *master)
+bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master)
 {
-	return false;
+	return master->stall_enabled;
 }
 
 bool arm_smmu_master_sva_supported(struct arm_smmu_master *master)
@@ -441,8 +441,18 @@ bool arm_smmu_master_sva_supported(struct arm_smmu_master *master)
 	if (!(master->smmu->features & ARM_SMMU_FEAT_SVA))
 		return false;
 
-	/* SSID and IOPF support are mandatory for the moment */
-	return master->ssid_bits && arm_smmu_iopf_supported(master);
+	/* SSID support is mandatory for the moment */
+	return master->ssid_bits;
+}
+
+bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master)
+{
+	bool enabled;
+
+	mutex_lock(&sva_lock);
+	enabled = master->iopf_enabled;
+	mutex_unlock(&sva_lock);
+	return enabled;
 }
 
 bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master)
@@ -455,15 +465,67 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master)
 	return enabled;
 }
 
+int arm_smmu_master_enable_iopf(struct arm_smmu_master *master)
+{
+	int ret;
+	struct device *dev = master->dev;
+
+	mutex_lock(&sva_lock);
+	if (master->stall_enabled) {
+		ret = iopf_queue_add_device(master->smmu->evtq.iopf, dev);
+		if (ret)
+			goto err_unlock;
+	}
+
+	ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
+	if (ret)
+		goto err_remove_device;
+	master->iopf_enabled = true;
+	mutex_unlock(&sva_lock);
+	return 0;
+
+err_remove_device:
+	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
+err_unlock:
+	mutex_unlock(&sva_lock);
+	return ret;
+}
+
 int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
 {
 	mutex_lock(&sva_lock);
+	/*
+	 * Drivers for devices supporting PRI or stall should enable IOPF first.
+	 * Others have device-specific fault handlers and don't need IOPF, so
+	 * this sanity check is a bit basic.
+	 */
+	if (arm_smmu_master_iopf_supported(master) && !master->iopf_enabled) {
+		mutex_unlock(&sva_lock);
+		return -EINVAL;
+	}
 	master->sva_enabled = true;
 	mutex_unlock(&sva_lock);
 
 	return 0;
 }
 
+int arm_smmu_master_disable_iopf(struct arm_smmu_master *master)
+{
+	struct device *dev = master->dev;
+
+	mutex_lock(&sva_lock);
+	if (master->sva_enabled) {
+		mutex_unlock(&sva_lock);
+		return -EBUSY;
+	}
+
+	iommu_unregister_device_fault_handler(dev);
+	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
+	master->iopf_enabled = false;
+	mutex_unlock(&sva_lock);
+	return 0;
+}
+
 int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
 {
 	mutex_lock(&sva_lock);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2dbae2e6965d..1fea11d65cd3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -32,6 +32,7 @@
 #include <linux/amba/bus.h>
 
 #include "arm-smmu-v3.h"
+#include "../../iommu-sva-lib.h"
 
 static bool disable_bypass = true;
 module_param(disable_bypass, bool, 0444);
@@ -319,6 +320,11 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 		}
 		cmd[1] |= FIELD_PREP(CMDQ_PRI_1_RESP, ent->pri.resp);
 		break;
+	case CMDQ_OP_RESUME:
+		cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_SID, ent->resume.sid);
+		cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_RESP, ent->resume.resp);
+		cmd[1] |= FIELD_PREP(CMDQ_RESUME_1_STAG, ent->resume.stag);
+		break;
 	case CMDQ_OP_CMD_SYNC:
 		if (ent->sync.msiaddr) {
 			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
@@ -882,6 +888,44 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
 	return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
 }
 
+static int arm_smmu_page_response(struct device *dev,
+				  struct iommu_fault_event *unused,
+				  struct iommu_page_response *resp)
+{
+	struct arm_smmu_cmdq_ent cmd = {0};
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	int sid = master->streams[0].id;
+
+	if (master->stall_enabled) {
+		cmd.opcode		= CMDQ_OP_RESUME;
+		cmd.resume.sid		= sid;
+		cmd.resume.stag		= resp->grpid;
+		switch (resp->code) {
+		case IOMMU_PAGE_RESP_INVALID:
+		case IOMMU_PAGE_RESP_FAILURE:
+			cmd.resume.resp = CMDQ_RESUME_0_RESP_ABORT;
+			break;
+		case IOMMU_PAGE_RESP_SUCCESS:
+			cmd.resume.resp = CMDQ_RESUME_0_RESP_RETRY;
+			break;
+		default:
+			return -EINVAL;
+		}
+	} else {
+		return -ENODEV;
+	}
+
+	arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
+	/*
+	 * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
+	 * RESUME consumption guarantees that the stalled transaction will be
+	 * terminated... at some point in the future. PRI_RESP is fire and
+	 * forget.
+	 */
+
+	return 0;
+}
+
 /* Context descriptor manipulation functions */
 void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
 {
@@ -991,7 +1035,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 	u64 val;
 	bool cd_live;
 	__le64 *cdptr;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
 	if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
 		return -E2BIG;
@@ -1036,8 +1079,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 			FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
 			CTXDESC_CD_0_V;
 
-		/* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */
-		if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
+		if (smmu_domain->stall_enabled)
 			val |= CTXDESC_CD_0_S;
 	}
 
@@ -1278,7 +1320,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			 FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1));
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
-		   !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
+		    !master->stall_enabled)
 			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
 		val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
@@ -1355,7 +1397,6 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 	return 0;
 }
 
-__maybe_unused
 static struct arm_smmu_master *
 arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
 {
@@ -1382,9 +1423,96 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
 }
 
 /* IRQ and event handlers */
+static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
+{
+	int ret;
+	u32 perm = 0;
+	struct arm_smmu_master *master;
+	bool ssid_valid = evt[0] & EVTQ_0_SSV;
+	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
+	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
+	struct iommu_fault_event fault_evt = { };
+	struct iommu_fault *flt = &fault_evt.fault;
+
+	/* Stage-2 is always pinned at the moment */
+	if (evt[1] & EVTQ_1_S2)
+		return -EFAULT;
+
+	master = arm_smmu_find_master(smmu, sid);
+	if (!master)
+		return -EINVAL;
+
+	if (evt[1] & EVTQ_1_READ)
+		perm |= IOMMU_FAULT_PERM_READ;
+	else
+		perm |= IOMMU_FAULT_PERM_WRITE;
+
+	if (evt[1] & EVTQ_1_EXEC)
+		perm |= IOMMU_FAULT_PERM_EXEC;
+
+	if (evt[1] & EVTQ_1_PRIV)
+		perm |= IOMMU_FAULT_PERM_PRIV;
+
+	if (evt[1] & EVTQ_1_STALL) {
+		flt->type = IOMMU_FAULT_PAGE_REQ;
+		flt->prm = (struct iommu_fault_page_request) {
+			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
+			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
+			.perm = perm,
+			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
+		};
+
+		if (ssid_valid) {
+			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
+		}
+	} else {
+		flt->type = IOMMU_FAULT_DMA_UNRECOV;
+		flt->event = (struct iommu_fault_unrecoverable) {
+			.flags = IOMMU_FAULT_UNRECOV_ADDR_VALID |
+				 IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID,
+			.perm = perm,
+			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
+			.fetch_addr = FIELD_GET(EVTQ_3_IPA, evt[3]),
+		};
+
+		if (ssid_valid) {
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
+			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
+		}
+
+		switch (type) {
+		case EVT_ID_TRANSLATION_FAULT:
+		case EVT_ID_ADDR_SIZE_FAULT:
+		case EVT_ID_ACCESS_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
+			break;
+		case EVT_ID_PERMISSION_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
+			break;
+		default:
+			/* TODO: report other unrecoverable faults. */
+			return -EFAULT;
+		}
+	}
+
+	ret = iommu_report_device_fault(master->dev, &fault_evt);
+	if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
+		/* Nobody cared, abort the access */
+		struct iommu_page_response resp = {
+			.pasid		= flt->prm.pasid,
+			.grpid		= flt->prm.grpid,
+			.code		= IOMMU_PAGE_RESP_FAILURE,
+		};
+		arm_smmu_page_response(master->dev, NULL, &resp);
+	}
+
+	return ret;
+}
+
 static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 {
-	int i;
+	int i, ret;
 	struct arm_smmu_device *smmu = dev;
 	struct arm_smmu_queue *q = &smmu->evtq.q;
 	struct arm_smmu_ll_queue *llq = &q->llq;
@@ -1394,11 +1522,14 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 		while (!queue_remove_raw(q, evt)) {
 			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
 
-			dev_info(smmu->dev, "event 0x%02x received:\n", id);
-			for (i = 0; i < ARRAY_SIZE(evt); ++i)
-				dev_info(smmu->dev, "\t0x%016llx\n",
-					 (unsigned long long)evt[i]);
-
+			ret = arm_smmu_handle_evt(smmu, evt);
+			if (ret) {
+				dev_info(smmu->dev, "event 0x%02x received:\n",
+					 id);
+				for (i = 0; i < ARRAY_SIZE(evt); ++i)
+					dev_info(smmu->dev, "\t0x%016llx\n",
+						 (unsigned long long)evt[i]);
+			}
 		}
 
 		/*
@@ -1903,6 +2034,8 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 
 	cfg->s1cdmax = master->ssid_bits;
 
+	smmu_domain->stall_enabled = master->stall_enabled;
+
 	ret = arm_smmu_alloc_cd_tables(smmu_domain);
 	if (ret)
 		goto out_free_asid;
@@ -2250,6 +2383,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
 		ret = -EINVAL;
 		goto out_unlock;
+	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
+		   smmu_domain->stall_enabled != master->stall_enabled) {
+		dev_err(dev, "cannot attach to stall-%s domain\n",
+			smmu_domain->stall_enabled ? "enabled" : "disabled");
+		ret = -EINVAL;
+		goto out_unlock;
 	}
 
 	master->domain = smmu_domain;
@@ -2484,6 +2623,11 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 		master->ssid_bits = min_t(u8, master->ssid_bits,
 					  CTXDESC_LINEAR_CDMAX);
 
+	if ((smmu->features & ARM_SMMU_FEAT_STALLS &&
+	     device_property_read_bool(dev, "dma-can-stall")) ||
+	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
+		master->stall_enabled = true;
+
 	return &smmu->iommu;
 
 err_free_master:
@@ -2502,6 +2646,7 @@ static void arm_smmu_release_device(struct device *dev)
 
 	master = dev_iommu_priv_get(dev);
 	WARN_ON(arm_smmu_master_sva_enabled(master));
+	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
 	arm_smmu_detach_dev(master);
 	arm_smmu_disable_pasid(master);
 	arm_smmu_remove_master(master);
@@ -2629,6 +2774,8 @@ static bool arm_smmu_dev_has_feature(struct device *dev,
 		return false;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_iopf_supported(master);
 	case IOMMU_DEV_FEAT_SVA:
 		return arm_smmu_master_sva_supported(master);
 	default:
@@ -2645,6 +2792,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev,
 		return false;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_iopf_enabled(master);
 	case IOMMU_DEV_FEAT_SVA:
 		return arm_smmu_master_sva_enabled(master);
 	default:
@@ -2655,6 +2804,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev,
 static int arm_smmu_dev_enable_feature(struct device *dev,
 				       enum iommu_dev_features feat)
 {
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
 	if (!arm_smmu_dev_has_feature(dev, feat))
 		return -ENODEV;
 
@@ -2662,8 +2813,10 @@ static int arm_smmu_dev_enable_feature(struct device *dev,
 		return -EBUSY;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_enable_iopf(master);
 	case IOMMU_DEV_FEAT_SVA:
-		return arm_smmu_master_enable_sva(dev_iommu_priv_get(dev));
+		return arm_smmu_master_enable_sva(master);
 	default:
 		return -EINVAL;
 	}
@@ -2672,12 +2825,16 @@ static int arm_smmu_dev_enable_feature(struct device *dev,
 static int arm_smmu_dev_disable_feature(struct device *dev,
 					enum iommu_dev_features feat)
 {
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
 	if (!arm_smmu_dev_feature_enabled(dev, feat))
 		return -EINVAL;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_disable_iopf(master);
 	case IOMMU_DEV_FEAT_SVA:
-		return arm_smmu_master_disable_sva(dev_iommu_priv_get(dev));
+		return arm_smmu_master_disable_sva(master);
 	default:
 		return -EINVAL;
 	}
@@ -2708,6 +2865,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.sva_bind		= arm_smmu_sva_bind,
 	.sva_unbind		= arm_smmu_sva_unbind,
 	.sva_get_pasid		= arm_smmu_sva_get_pasid,
+	.page_response		= arm_smmu_page_response,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
@@ -2785,6 +2943,7 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
 static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
 {
 	int ret;
+	bool sva = arm_smmu_sva_supported(smmu);
 
 	/* cmdq */
 	ret = arm_smmu_init_one_queue(smmu, &smmu->cmdq.q, ARM_SMMU_CMDQ_PROD,
@@ -2804,6 +2963,12 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
 	if (ret)
 		return ret;
 
+	if (sva && smmu->features & ARM_SMMU_FEAT_STALLS) {
+		smmu->evtq.iopf = iopf_queue_alloc(dev_name(smmu->dev));
+		if (!smmu->evtq.iopf)
+			return -ENOMEM;
+	}
+
 	/* priq */
 	if (!(smmu->features & ARM_SMMU_FEAT_PRI))
 		return 0;
@@ -3718,6 +3883,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	iommu_device_unregister(&smmu->iommu);
 	iommu_device_sysfs_remove(&smmu->iommu);
 	arm_smmu_device_disable(smmu);
+	iopf_queue_free(smmu->evtq.iopf);
 
 	return 0;
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 105+ messages in thread
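
For a device whose driver handles faults itself instead of through the IOPF
queue (the "device-specific fault handlers" case mentioned in
arm_smmu_master_enable_sva() above), the round trip would look roughly like
this sketch, where my_iopf_handler() is a hypothetical callback registered
with iommu_register_device_fault_handler():

	static int my_iopf_handler(struct iommu_fault *fault, void *data)
	{
		struct device *dev = data;
		struct iommu_page_response resp;

		if (fault->type != IOMMU_FAULT_PAGE_REQ)
			return -EOPNOTSUPP;

		/* ... make the mapping at fault->prm.addr valid ... */

		resp = (struct iommu_page_response) {
			.version	= IOMMU_PAGE_RESP_VERSION_1,
			.pasid		= fault->prm.pasid,
			.grpid		= fault->prm.grpid,
			.code		= IOMMU_PAGE_RESP_SUCCESS,
		};

		/* Reaches arm_smmu_page_response(), which issues CMDQ_OP_RESUME */
		return iommu_page_response(dev, &resp);
	}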

 {
 	mutex_lock(&sva_lock);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2dbae2e6965d..1fea11d65cd3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -32,6 +32,7 @@
 #include <linux/amba/bus.h>
 
 #include "arm-smmu-v3.h"
+#include "../../iommu-sva-lib.h"
 
 static bool disable_bypass = true;
 module_param(disable_bypass, bool, 0444);
@@ -319,6 +320,11 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 		}
 		cmd[1] |= FIELD_PREP(CMDQ_PRI_1_RESP, ent->pri.resp);
 		break;
+	case CMDQ_OP_RESUME:
+		cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_SID, ent->resume.sid);
+		cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_RESP, ent->resume.resp);
+		cmd[1] |= FIELD_PREP(CMDQ_RESUME_1_STAG, ent->resume.stag);
+		break;
 	case CMDQ_OP_CMD_SYNC:
 		if (ent->sync.msiaddr) {
 			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
@@ -882,6 +888,44 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
 	return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
 }
 
+static int arm_smmu_page_response(struct device *dev,
+				  struct iommu_fault_event *unused,
+				  struct iommu_page_response *resp)
+{
+	struct arm_smmu_cmdq_ent cmd = {0};
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	int sid = master->streams[0].id;
+
+	if (master->stall_enabled) {
+		cmd.opcode		= CMDQ_OP_RESUME;
+		cmd.resume.sid		= sid;
+		cmd.resume.stag		= resp->grpid;
+		switch (resp->code) {
+		case IOMMU_PAGE_RESP_INVALID:
+		case IOMMU_PAGE_RESP_FAILURE:
+			cmd.resume.resp = CMDQ_RESUME_0_RESP_ABORT;
+			break;
+		case IOMMU_PAGE_RESP_SUCCESS:
+			cmd.resume.resp = CMDQ_RESUME_0_RESP_RETRY;
+			break;
+		default:
+			return -EINVAL;
+		}
+	} else {
+		return -ENODEV;
+	}
+
+	arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
+	/*
+	 * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
+	 * RESUME consumption guarantees that the stalled transaction will be
+	 * terminated... at some point in the future. PRI_RESP is fire and
+	 * forget.
+	 */
+
+	return 0;
+}
+
 /* Context descriptor manipulation functions */
 void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
 {
@@ -991,7 +1035,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 	u64 val;
 	bool cd_live;
 	__le64 *cdptr;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
 	if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
 		return -E2BIG;
@@ -1036,8 +1079,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 			FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
 			CTXDESC_CD_0_V;
 
-		/* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */
-		if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
+		if (smmu_domain->stall_enabled)
 			val |= CTXDESC_CD_0_S;
 	}
 
@@ -1278,7 +1320,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			 FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1));
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
-		   !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
+		    !master->stall_enabled)
 			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
 		val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
@@ -1355,7 +1397,6 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 	return 0;
 }
 
-__maybe_unused
 static struct arm_smmu_master *
 arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
 {
@@ -1382,9 +1423,96 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
 }
 
 /* IRQ and event handlers */
+static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
+{
+	int ret;
+	u32 perm = 0;
+	struct arm_smmu_master *master;
+	bool ssid_valid = evt[0] & EVTQ_0_SSV;
+	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
+	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
+	struct iommu_fault_event fault_evt = { };
+	struct iommu_fault *flt = &fault_evt.fault;
+
+	/* Stage-2 is always pinned at the moment */
+	if (evt[1] & EVTQ_1_S2)
+		return -EFAULT;
+
+	master = arm_smmu_find_master(smmu, sid);
+	if (!master)
+		return -EINVAL;
+
+	if (evt[1] & EVTQ_1_READ)
+		perm |= IOMMU_FAULT_PERM_READ;
+	else
+		perm |= IOMMU_FAULT_PERM_WRITE;
+
+	if (evt[1] & EVTQ_1_EXEC)
+		perm |= IOMMU_FAULT_PERM_EXEC;
+
+	if (evt[1] & EVTQ_1_PRIV)
+		perm |= IOMMU_FAULT_PERM_PRIV;
+
+	if (evt[1] & EVTQ_1_STALL) {
+		flt->type = IOMMU_FAULT_PAGE_REQ;
+		flt->prm = (struct iommu_fault_page_request) {
+			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
+			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
+			.perm = perm,
+			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
+		};
+
+		if (ssid_valid) {
+			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
+		}
+	} else {
+		flt->type = IOMMU_FAULT_DMA_UNRECOV;
+		flt->event = (struct iommu_fault_unrecoverable) {
+			.flags = IOMMU_FAULT_UNRECOV_ADDR_VALID |
+				 IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID,
+			.perm = perm,
+			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
+			.fetch_addr = FIELD_GET(EVTQ_3_IPA, evt[3]),
+		};
+
+		if (ssid_valid) {
+			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
+			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
+		}
+
+		switch (type) {
+		case EVT_ID_TRANSLATION_FAULT:
+		case EVT_ID_ADDR_SIZE_FAULT:
+		case EVT_ID_ACCESS_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
+			break;
+		case EVT_ID_PERMISSION_FAULT:
+			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
+			break;
+		default:
+			/* TODO: report other unrecoverable faults. */
+			return -EFAULT;
+		}
+	}
+
+	ret = iommu_report_device_fault(master->dev, &fault_evt);
+	if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
+		/* Nobody cared, abort the access */
+		struct iommu_page_response resp = {
+			.pasid		= flt->prm.pasid,
+			.grpid		= flt->prm.grpid,
+			.code		= IOMMU_PAGE_RESP_FAILURE,
+		};
+		arm_smmu_page_response(master->dev, NULL, &resp);
+	}
+
+	return ret;
+}
+
 static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 {
-	int i;
+	int i, ret;
 	struct arm_smmu_device *smmu = dev;
 	struct arm_smmu_queue *q = &smmu->evtq.q;
 	struct arm_smmu_ll_queue *llq = &q->llq;
@@ -1394,11 +1522,14 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 		while (!queue_remove_raw(q, evt)) {
 			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
 
-			dev_info(smmu->dev, "event 0x%02x received:\n", id);
-			for (i = 0; i < ARRAY_SIZE(evt); ++i)
-				dev_info(smmu->dev, "\t0x%016llx\n",
-					 (unsigned long long)evt[i]);
-
+			ret = arm_smmu_handle_evt(smmu, evt);
+			if (ret) {
+				dev_info(smmu->dev, "event 0x%02x received:\n",
+					 id);
+				for (i = 0; i < ARRAY_SIZE(evt); ++i)
+					dev_info(smmu->dev, "\t0x%016llx\n",
+						 (unsigned long long)evt[i]);
+			}
 		}
 
 		/*
@@ -1903,6 +2034,8 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 
 	cfg->s1cdmax = master->ssid_bits;
 
+	smmu_domain->stall_enabled = master->stall_enabled;
+
 	ret = arm_smmu_alloc_cd_tables(smmu_domain);
 	if (ret)
 		goto out_free_asid;
@@ -2250,6 +2383,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
 		ret = -EINVAL;
 		goto out_unlock;
+	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
+		   smmu_domain->stall_enabled != master->stall_enabled) {
+		dev_err(dev, "cannot attach to stall-%s domain\n",
+			smmu_domain->stall_enabled ? "enabled" : "disabled");
+		ret = -EINVAL;
+		goto out_unlock;
 	}
 
 	master->domain = smmu_domain;
@@ -2484,6 +2623,11 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 		master->ssid_bits = min_t(u8, master->ssid_bits,
 					  CTXDESC_LINEAR_CDMAX);
 
+	if ((smmu->features & ARM_SMMU_FEAT_STALLS &&
+	     device_property_read_bool(dev, "dma-can-stall")) ||
+	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
+		master->stall_enabled = true;
+
 	return &smmu->iommu;
 
 err_free_master:
@@ -2502,6 +2646,7 @@ static void arm_smmu_release_device(struct device *dev)
 
 	master = dev_iommu_priv_get(dev);
 	WARN_ON(arm_smmu_master_sva_enabled(master));
+	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
 	arm_smmu_detach_dev(master);
 	arm_smmu_disable_pasid(master);
 	arm_smmu_remove_master(master);
@@ -2629,6 +2774,8 @@ static bool arm_smmu_dev_has_feature(struct device *dev,
 		return false;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_iopf_supported(master);
 	case IOMMU_DEV_FEAT_SVA:
 		return arm_smmu_master_sva_supported(master);
 	default:
@@ -2645,6 +2792,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev,
 		return false;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_iopf_enabled(master);
 	case IOMMU_DEV_FEAT_SVA:
 		return arm_smmu_master_sva_enabled(master);
 	default:
@@ -2655,6 +2804,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev,
 static int arm_smmu_dev_enable_feature(struct device *dev,
 				       enum iommu_dev_features feat)
 {
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
 	if (!arm_smmu_dev_has_feature(dev, feat))
 		return -ENODEV;
 
@@ -2662,8 +2813,10 @@ static int arm_smmu_dev_enable_feature(struct device *dev,
 		return -EBUSY;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_enable_iopf(master);
 	case IOMMU_DEV_FEAT_SVA:
-		return arm_smmu_master_enable_sva(dev_iommu_priv_get(dev));
+		return arm_smmu_master_enable_sva(master);
 	default:
 		return -EINVAL;
 	}
@@ -2672,12 +2825,16 @@ static int arm_smmu_dev_enable_feature(struct device *dev,
 static int arm_smmu_dev_disable_feature(struct device *dev,
 					enum iommu_dev_features feat)
 {
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
 	if (!arm_smmu_dev_feature_enabled(dev, feat))
 		return -EINVAL;
 
 	switch (feat) {
+	case IOMMU_DEV_FEAT_IOPF:
+		return arm_smmu_master_disable_iopf(master);
 	case IOMMU_DEV_FEAT_SVA:
-		return arm_smmu_master_disable_sva(dev_iommu_priv_get(dev));
+		return arm_smmu_master_disable_sva(master);
 	default:
 		return -EINVAL;
 	}
@@ -2708,6 +2865,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.sva_bind		= arm_smmu_sva_bind,
 	.sva_unbind		= arm_smmu_sva_unbind,
 	.sva_get_pasid		= arm_smmu_sva_get_pasid,
+	.page_response		= arm_smmu_page_response,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
@@ -2785,6 +2943,7 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
 static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
 {
 	int ret;
+	bool sva = arm_smmu_sva_supported(smmu);
 
 	/* cmdq */
 	ret = arm_smmu_init_one_queue(smmu, &smmu->cmdq.q, ARM_SMMU_CMDQ_PROD,
@@ -2804,6 +2963,12 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
 	if (ret)
 		return ret;
 
+	if (sva && smmu->features & ARM_SMMU_FEAT_STALLS) {
+		smmu->evtq.iopf = iopf_queue_alloc(dev_name(smmu->dev));
+		if (!smmu->evtq.iopf)
+			return -ENOMEM;
+	}
+
 	/* priq */
 	if (!(smmu->features & ARM_SMMU_FEAT_PRI))
 		return 0;
@@ -3718,6 +3883,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	iommu_device_unregister(&smmu->iommu);
 	iommu_device_sysfs_remove(&smmu->iommu);
 	arm_smmu_device_disable(smmu);
+	iopf_queue_free(smmu->evtq.iopf);
 
 	return 0;
 }
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 00/10] iommu: I/O page faults for SMMUv3
  2021-01-08 14:52 ` Jean-Philippe Brucker
  (?)
@ 2021-01-11  3:26   ` Zhangfei Gao
  -1 siblings, 0 replies; 105+ messages in thread
From: Zhangfei Gao @ 2021-01-11  3:26 UTC (permalink / raw)
  To: Jean-Philippe Brucker, joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, shameerali.kolothum.thodi, vivek.gautam



On 2021/1/8 10:52 PM, Jean-Philippe Brucker wrote:
> Add stall support to the SMMUv3, along with a common I/O Page Fault
> handler.
>
> Changes since v8 [1]:
> * Added patches 1 and 2 which aren't strictly related to IOPF but need to
>    be applied in order - 8 depends on 2 which depends on 1. Patch 2 moves
>    pasid-num-bits to a device property, following Robin's comment on v8.
> * Patches 3-5 extract the IOPF feature from the SVA one, to support SVA
>    implementations that handle I/O page faults through the device driver
>    rather than the IOMMU driver [2]
> * Use device properties for dma-can-stall, instead of a special fwspec
>    member.
> * Dropped PRI support for now, since it doesn't seem to be available in
>    hardware and adds some complexity.
> * Had to drop some Acks and Tested tags unfortunately, due to code
>    changes.
>
> As usual, you can get the latest SVA patches from
> http://jpbrucker.net/git/linux sva/current
>
> [1] https://lore.kernel.org/linux-iommu/20201112125519.3987595-1-jean-philippe@linaro.org/
> [2] https://lore.kernel.org/linux-iommu/BY5PR12MB3764F5D07E8EC48327E39C86B3C60@BY5PR12MB3764.namprd12.prod.outlook.com/
>
> Jean-Philippe Brucker (10):
>    iommu: Remove obsolete comment
>    iommu/arm-smmu-v3: Use device properties for pasid-num-bits
>    iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
>    iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF
>    uacce: Enable IOMMU_DEV_FEAT_IOPF
>    iommu: Add a page fault handler
>    iommu/arm-smmu-v3: Maintain a SID->device structure
>    dt-bindings: document stall property for IOMMU masters
>    ACPI/IORT: Enable stall support for platform devices
>    iommu/arm-smmu-v3: Add stall support for platform devices

Thanks Jean
I have tested it on a Hisilicon Kunpeng920 board.

  Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>

Thanks

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-11  3:29     ` Zhangfei Gao
  -1 siblings, 0 replies; 105+ messages in thread
From: Zhangfei Gao @ 2021-01-11  3:29 UTC (permalink / raw)
  To: Jean-Philippe Brucker, joro, will
  Cc: lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla, rjw, lenb,
	robin.murphy, Jonathan.Cameron, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, shameerali.kolothum.thodi, vivek.gautam, Arnd Bergmann,
	Greg Kroah-Hartman, Zhou Wang



On 2021/1/8 10:52 PM, Jean-Philippe Brucker wrote:
> The IOPF (I/O Page Fault) feature is now enabled independently from the
> SVA feature, because some IOPF implementations are device-specific and
> do not require IOMMU support for PCIe PRI or Arm SMMU stall.
>
> Enable IOPF unconditionally when enabling SVA for now. In the future, if
> a device driver implementing a uacce interface doesn't need IOPF
> support, it will need to tell the uacce module, for example with a new
> flag.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
> Cc: Zhou Wang <wangzhou1@hisilicon.com>
Thanks Jean
Acked-by: Zhangfei Gao <zhangfei.gao@linaro.org>
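
For reference, the shape of the uacce change being acked here, as a
simplified sketch (not the exact hunk; see the patch for the real
function names and error handling):

static int uacce_enable_sva(struct device *parent, int flags)
{
	if (!(flags & UACCE_DEV_SVA))
		return flags;

	flags &= ~UACCE_DEV_SVA;

	/* IOPF first; if it fails, fall back to no SVA */
	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
		return flags;

	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
		iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
		return flags;
	}

	return flags | UACCE_DEV_SVA;
}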


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-12  4:31     ` Lu Baolu
  -1 siblings, 0 replies; 105+ messages in thread
From: Lu Baolu @ 2021-01-12  4:31 UTC (permalink / raw)
  To: Jean-Philippe Brucker, joro, will
  Cc: baolu.lu, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, Jonathan.Cameron, eric.auger, iommu,
	devicetree, linux-acpi, linux-arm-kernel, linux-accelerators,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Arnd Bergmann, David Woodhouse, Greg Kroah-Hartman, Zhou Wang,
	Tian, Kevin

Hi Jean,

On 1/8/21 10:52 PM, Jean-Philippe Brucker wrote:
> Some devices manage I/O Page Faults (IOPF) themselves instead of relying
> on PCIe PRI or Arm SMMU stall. Allow their drivers to enable SVA without
> mandating IOMMU-managed IOPF. The other device drivers now need to first
> enable IOMMU_DEV_FEAT_IOPF before enabling IOMMU_DEV_FEAT_SVA.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
> Cc: Zhou Wang <wangzhou1@hisilicon.com>
> ---
>   include/linux/iommu.h | 20 +++++++++++++++++---
>   1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 583c734b2e87..701b2eeb0dc5 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -156,10 +156,24 @@ struct iommu_resv_region {
>   	enum iommu_resv_type	type;
>   };
>   
> -/* Per device IOMMU features */
> +/**
> + * enum iommu_dev_features - Per device IOMMU features
> + * @IOMMU_DEV_FEAT_AUX: Auxiliary domain feature
> + * @IOMMU_DEV_FEAT_SVA: Shared Virtual Addresses
> + * @IOMMU_DEV_FEAT_IOPF: I/O Page Faults such as PRI or Stall. Generally using
> + *			 %IOMMU_DEV_FEAT_SVA requires %IOMMU_DEV_FEAT_IOPF, but
> + *			 some devices manage I/O Page Faults themselves instead
> + *			 of relying on the IOMMU. When supported, this feature
> + *			 must be enabled before and disabled after
> + *			 %IOMMU_DEV_FEAT_SVA.

Is this only for SVA? We may see more scenarios that use IOPF. For
example, when passing devices through to user level, the user's pages
could be managed dynamically instead of being allocated and pinned
statically.

If @IOMMU_DEV_FEAT_IOPF is defined as generic IOPF support, the current
vendor IOMMU driver support may not be enough.

Best regards,
baolu

> + *
> + * Device drivers query whether a feature is supported using
> + * iommu_dev_has_feature(), and enable it using iommu_dev_enable_feature().
> + */
>   enum iommu_dev_features {
> -	IOMMU_DEV_FEAT_AUX,	/* Aux-domain feature */
> -	IOMMU_DEV_FEAT_SVA,	/* Shared Virtual Addresses */
> +	IOMMU_DEV_FEAT_AUX,
> +	IOMMU_DEV_FEAT_SVA,
> +	IOMMU_DEV_FEAT_IOPF,
>   };
>   
>   #define IOMMU_PASID_INVALID	(-1U)
> 

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-12  4:31     ` Lu Baolu
  (?)
@ 2021-01-12  9:16       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-12  9:16 UTC (permalink / raw)
  To: Lu Baolu
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, Jonathan.Cameron, eric.auger, iommu,
	devicetree, linux-acpi, linux-arm-kernel, linux-accelerators,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Arnd Bergmann, David Woodhouse, Greg Kroah-Hartman, Zhou Wang,
	Tian, Kevin

Hi Baolu,

On Tue, Jan 12, 2021 at 12:31:23PM +0800, Lu Baolu wrote:
> Hi Jean,
> 
> On 1/8/21 10:52 PM, Jean-Philippe Brucker wrote:
> > Some devices manage I/O Page Faults (IOPF) themselves instead of relying
> > on PCIe PRI or Arm SMMU stall. Allow their drivers to enable SVA without
> > mandating IOMMU-managed IOPF. The other device drivers now need to first
> > enable IOMMU_DEV_FEAT_IOPF before enabling IOMMU_DEV_FEAT_SVA.
> > 
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > ---
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: David Woodhouse <dwmw2@infradead.org>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
> > Cc: Zhou Wang <wangzhou1@hisilicon.com>
> > ---
> >   include/linux/iommu.h | 20 +++++++++++++++++---
> >   1 file changed, 17 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 583c734b2e87..701b2eeb0dc5 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -156,10 +156,24 @@ struct iommu_resv_region {
> >   	enum iommu_resv_type	type;
> >   };
> > -/* Per device IOMMU features */
> > +/**
> > + * enum iommu_dev_features - Per device IOMMU features
> > + * @IOMMU_DEV_FEAT_AUX: Auxiliary domain feature
> > + * @IOMMU_DEV_FEAT_SVA: Shared Virtual Addresses
> > + * @IOMMU_DEV_FEAT_IOPF: I/O Page Faults such as PRI or Stall. Generally using
> > + *			 %IOMMU_DEV_FEAT_SVA requires %IOMMU_DEV_FEAT_IOPF, but
> > + *			 some devices manage I/O Page Faults themselves instead
> > + *			 of relying on the IOMMU. When supported, this feature
> > + *			 must be enabled before and disabled after
> > + *			 %IOMMU_DEV_FEAT_SVA.
> 
> Is this only for SVA? We may see more scenarios of using IOPF. For
> example, when passing through devices to user level, the user's pages
> could be managed dynamically instead of being allocated and pinned
> statically.

Hm, isn't that precisely what SVA does?  I don't understand the
difference. That said, FEAT_IOPF doesn't have to be only for SVA. It could
later be used as a prerequisite for some other feature. For special cases
device drivers can always use the iommu_register_device_fault_handler()
API and handle faults themselves.
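
For illustration, a driver going that route would do roughly the
following (a minimal sketch against the current fault reporting API;
mydev_service_fault() is a made-up helper standing in for the driver's
own fault logic):

	static int mydev_iopf_handler(struct iommu_fault *fault, void *data)
	{
		struct device *dev = data;
		struct iommu_page_response resp;

		if (fault->type != IOMMU_FAULT_PAGE_REQ)
			return -EOPNOTSUPP;

		resp = (struct iommu_page_response) {
			.argsz	 = sizeof(resp),
			.version = IOMMU_PAGE_RESP_VERSION_1,
			.pasid	 = fault->prm.pasid,
			.grpid	 = fault->prm.grpid,
			/* Resolve the fault in the device driver, then
			 * report the outcome (flags handling elided) */
			.code	 = mydev_service_fault(dev, &fault->prm),
		};
		return iommu_page_response(dev, &resp);
	}

	...
	ret = iommu_register_device_fault_handler(dev, mydev_iopf_handler, dev);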

> If @IOMMU_DEV_FEAT_IOPF is defined as generic IOPF support, the current
> vendor IOMMU driver support may not be enough.

IOMMU_DEV_FEAT_IOPF on its own doesn't do anything useful; it's mainly a
way for device drivers to probe the IOMMU capability. Granted, in patch
10 the SMMU driver registers the IOPF queue on enable(), but that could be
done by FEAT_SVA enable() instead, if we ever repurpose FEAT_IOPF.
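
Concretely, a driver relying on IOMMU-managed IOPF would follow this
sequence (a minimal sketch with error paths trimmed; this is roughly
what the uacce patch in this series does):

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);
	if (ret)
		return ret;

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
	if (ret) {
		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF);
		return ret;
	}

	/* ... bind address spaces with iommu_sva_bind_device() ... */

	/* Teardown happens in the reverse order */
	iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
	iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF);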

Thanks,
Jean

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-12  9:16       ` Jean-Philippe Brucker
@ 2021-01-13  2:49         ` Lu Baolu
  -1 siblings, 0 replies; 105+ messages in thread
From: Lu Baolu @ 2021-01-13  2:49 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: baolu.lu, joro, will, lorenzo.pieralisi, robh+dt, guohanjun,
	sudeep.holla, rjw, lenb, robin.murphy, Jonathan.Cameron,
	eric.auger, iommu, devicetree, linux-acpi, linux-arm-kernel,
	linux-accelerators, vdumpa, zhangfei.gao,
	shameerali.kolothum.thodi, vivek.gautam, Arnd Bergmann,
	David Woodhouse, Greg Kroah-Hartman, Zhou Wang, Tian, Kevin

Hi Jean,

On 1/12/21 5:16 PM, Jean-Philippe Brucker wrote:
> Hi Baolu,
> 
> On Tue, Jan 12, 2021 at 12:31:23PM +0800, Lu Baolu wrote:
>> Hi Jean,
>>
>> On 1/8/21 10:52 PM, Jean-Philippe Brucker wrote:
>>> Some devices manage I/O Page Faults (IOPF) themselves instead of relying
>>> on PCIe PRI or Arm SMMU stall. Allow their drivers to enable SVA without
>>> mandating IOMMU-managed IOPF. The other device drivers now need to first
>>> enable IOMMU_DEV_FEAT_IOPF before enabling IOMMU_DEV_FEAT_SVA.
>>>
>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>> ---
>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>> Cc: David Woodhouse <dwmw2@infradead.org>
>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>> Cc: Joerg Roedel <joro@8bytes.org>
>>> Cc: Lu Baolu <baolu.lu@linux.intel.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
>>> Cc: Zhou Wang <wangzhou1@hisilicon.com>
>>> ---
>>>    include/linux/iommu.h | 20 +++++++++++++++++---
>>>    1 file changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>> index 583c734b2e87..701b2eeb0dc5 100644
>>> --- a/include/linux/iommu.h
>>> +++ b/include/linux/iommu.h
>>> @@ -156,10 +156,24 @@ struct iommu_resv_region {
>>>    	enum iommu_resv_type	type;
>>>    };
>>> -/* Per device IOMMU features */
>>> +/**
>>> + * enum iommu_dev_features - Per device IOMMU features
>>> + * @IOMMU_DEV_FEAT_AUX: Auxiliary domain feature
>>> + * @IOMMU_DEV_FEAT_SVA: Shared Virtual Addresses
>>> + * @IOMMU_DEV_FEAT_IOPF: I/O Page Faults such as PRI or Stall. Generally using
>>> + *			 %IOMMU_DEV_FEAT_SVA requires %IOMMU_DEV_FEAT_IOPF, but
>>> + *			 some devices manage I/O Page Faults themselves instead
>>> + *			 of relying on the IOMMU. When supported, this feature
>>> + *			 must be enabled before and disabled after
>>> + *			 %IOMMU_DEV_FEAT_SVA.
>>
>> Is this only for SVA? We may see more scenarios of using IOPF. For
>> example, when passing through devices to user level, the user's pages
>> could be managed dynamically instead of being allocated and pinned
>> statically.
> 
> Hm, isn't that precisely what SVA does?  I don't understand the
> difference. That said, FEAT_IOPF doesn't have to be only for SVA. It could
> later be used as a prerequisite for some other feature. For special cases
> device drivers can always use the iommu_register_device_fault_handler()
> API and handle faults themselves.

From the IOMMU's perspective, there is a small difference between
these two. For SVA, the page table comes from the CPU side, so the IOMMU
only needs to call handle_mm_fault(); for the pass-through case above, the
page table is on the IOMMU side, so the device driver (probably VFIO) needs
to register a fault handler and call iommu_map/unmap() to serve the page
faults.
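
As a rough sketch of that second case (reusing the existing fault
handler API; vfio_pin_user_page() is invented here to stand for
whatever pins the backing page):

	static int vfio_iopf_handler(struct iommu_fault *fault, void *data)
	{
		struct iommu_domain *domain = data;
		unsigned long iova = fault->prm.addr & PAGE_MASK;
		phys_addr_t paddr;

		if (fault->type != IOMMU_FAULT_PAGE_REQ)
			return -EOPNOTSUPP;

		if (vfio_pin_user_page(iova, &paddr))
			return -EFAULT;

		/* Populate the IOMMU page table. Sending the PRI/stall
		 * response with iommu_page_response() is omitted here. */
		return iommu_map(domain, iova, paddr, PAGE_SIZE,
				 IOMMU_READ | IOMMU_WRITE);
	}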

If we think about the nested mode (or dual-stage translation), it's more
complicated, since the kernel (probably VFIO) handles the second-level
page faults, while the first-level page faults need to be delivered to the
user-level guest. Obviously, this hasn't been fully implemented in any
IOMMU driver.

> 
>> If @IOMMU_DEV_FEAT_IOPF is defined as generic IOPF support, the current
>> vendor IOMMU driver support may not be enough.
> 
> IOMMU_DEV_FEAT_IOPF on its own doesn't do anything useful; it's mainly a
> way for device drivers to probe the IOMMU capability. Granted, in patch
> 10 the SMMU driver registers the IOPF queue on enable(), but that could be
> done by FEAT_SVA enable() instead, if we ever repurpose FEAT_IOPF.

I have no objection to splitting IOPF from SVA. Actually, we must have this
eventually. My concern is that at this stage the IOMMU drivers only
support the SVA type of IOPF, so a generic IOMMU_DEV_FEAT_IOPF feature might
confuse device drivers that want to add other types of IOPF usage.

> 
> Thanks,
> Jean
> 

Best regards,
baolu

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-13  2:49         ` Lu Baolu
@ 2021-01-13  8:10           ` Tian, Kevin
  -1 siblings, 0 replies; 105+ messages in thread
From: Tian, Kevin @ 2021-01-13  8:10 UTC (permalink / raw)
  To: Lu Baolu, Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, Jonathan.Cameron, eric.auger, iommu,
	devicetree, linux-acpi, linux-arm-kernel, linux-accelerators,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Arnd Bergmann, David Woodhouse, Greg Kroah-Hartman, Zhou Wang

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Wednesday, January 13, 2021 10:50 AM
> 
> Hi Jean,
> 
> On 1/12/21 5:16 PM, Jean-Philippe Brucker wrote:
> > Hi Baolu,
> >
> > On Tue, Jan 12, 2021 at 12:31:23PM +0800, Lu Baolu wrote:
> >> Hi Jean,
> >>
> >> On 1/8/21 10:52 PM, Jean-Philippe Brucker wrote:
> >>> Some devices manage I/O Page Faults (IOPF) themselves instead of relying
> >>> on PCIe PRI or Arm SMMU stall. Allow their drivers to enable SVA without
> >>> mandating IOMMU-managed IOPF. The other device drivers now need to first
> >>> enable IOMMU_DEV_FEAT_IOPF before enabling IOMMU_DEV_FEAT_SVA.
> >>>
> >>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>> ---
> >>> Cc: Arnd Bergmann <arnd@arndb.de>
> >>> Cc: David Woodhouse <dwmw2@infradead.org>
> >>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >>> Cc: Joerg Roedel <joro@8bytes.org>
> >>> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> >>> Cc: Will Deacon <will@kernel.org>
> >>> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
> >>> Cc: Zhou Wang <wangzhou1@hisilicon.com>
> >>> ---
> >>>    include/linux/iommu.h | 20 +++++++++++++++++---
> >>>    1 file changed, 17 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> >>> index 583c734b2e87..701b2eeb0dc5 100644
> >>> --- a/include/linux/iommu.h
> >>> +++ b/include/linux/iommu.h
> >>> @@ -156,10 +156,24 @@ struct iommu_resv_region {
> >>>    	enum iommu_resv_type	type;
> >>>    };
> >>> -/* Per device IOMMU features */
> >>> +/**
> >>> + * enum iommu_dev_features - Per device IOMMU features
> >>> + * @IOMMU_DEV_FEAT_AUX: Auxiliary domain feature
> >>> + * @IOMMU_DEV_FEAT_SVA: Shared Virtual Addresses
> >>> + * @IOMMU_DEV_FEAT_IOPF: I/O Page Faults such as PRI or Stall. Generally using
> >>> + *			 %IOMMU_DEV_FEAT_SVA requires %IOMMU_DEV_FEAT_IOPF, but
> >>> + *			 some devices manage I/O Page Faults themselves instead
> >>> + *			 of relying on the IOMMU. When supported, this feature
> >>> + *			 must be enabled before and disabled after
> >>> + *			 %IOMMU_DEV_FEAT_SVA.
> >>
> >> Is this only for SVA? We may see more scenarios of using IOPF. For
> >> example, when passing through devices to user level, the user's pages
> >> could be managed dynamically instead of being allocated and pinned
> >> statically.
> >
> > Hm, isn't that precisely what SVA does?  I don't understand the
> > difference. That said, FEAT_IOPF doesn't have to be only for SVA. It could
> > later be used as a prerequisite for some other feature. For special cases
> > device drivers can always use the iommu_register_device_fault_handler()
> > API and handle faults themselves.
> 
> From the IOMMU's perspective, there is a small difference between
> these two. For SVA, the page table comes from the CPU side, so the IOMMU
> only needs to call handle_mm_fault(); for the pass-through case above, the
> page table is on the IOMMU side, so the device driver (probably VFIO) needs
> to register a fault handler and call iommu_map/unmap() to serve the page
> faults.
> 
> If we think about the nested mode (or dual-stage translation), it's more
> complicated, since the kernel (probably VFIO) handles the second-level
> page faults, while the first-level page faults need to be delivered to the
> user-level guest. Obviously, this hasn't been fully implemented in any
> IOMMU driver.
> 

Thinking more, the confusion might come from the fact that we mixed
hardware capability with software capability. IOMMU_FEAT describes
the hardware capability. When FEAT_IOPF is set, it purely means that whatever
page faults are enabled by the software are routed through the IOMMU.
Nothing more. Then the software (IOMMU drivers) may choose to support
only limited faulting scenarios and then evolve to support more complex
usages gradually. For example, the intel-iommu driver only supports 1st-level
faults (thus SVA) for now, while FEAT_IOPF as a separate feature may give the
impression that 2nd-level faults are also allowed. From this angle, once we
start to separate page faults from SVA, we may also need a way to report
the software capability (e.g. a set of faulting categories) and also extend
iommu_register_device_fault_handler() to allow specifying which
category is enabled respectively. The example categories could be:

- IOPF_BIND, for page tables which are bound/linked to the IOMMU.
Applies to the bare-metal SVA and guest SVA cases;
- IOPF_MAP, for page tables which are managed through explicit IOMMU
map interfaces. Applies to removing the VFIO pinning restriction;

Both categories can be enabled together in nested translation, with
additional information provided to differentiate them in the fault
information. Using the paging/staging level doesn't make much sense, as it's
the IOMMU driver's internal knowledge, e.g. the VT-d driver plans to use the
1st level for GPA if there is no nesting and then turn to the 2nd level when
nesting is enabled.
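
Sketching that proposal (purely illustrative, nothing like this exists
in the tree today):

	enum iommu_iopf_category {
		IOPF_BIND = 1 << 0,	/* tables bound to the IOMMU (SVA) */
		IOPF_MAP  = 1 << 1,	/* tables built via iommu_map/unmap */
	};

	/* Extended registration stating which categories the handler serves */
	int iommu_register_device_fault_handler_cat(struct device *dev,
				iommu_dev_fault_handler_t handler,
				unsigned int categories, void *data);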

Thanks
Kevin

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-13  8:10           ` Tian, Kevin
@ 2021-01-14 16:41             ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-14 16:41 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Lu Baolu, joro, will, lorenzo.pieralisi, robh+dt, guohanjun,
	sudeep.holla, rjw, lenb, robin.murphy, Jonathan.Cameron,
	eric.auger, iommu, devicetree, linux-acpi, linux-arm-kernel,
	linux-accelerators, vdumpa, zhangfei.gao,
	shameerali.kolothum.thodi, vivek.gautam, Arnd Bergmann,
	David Woodhouse, Greg Kroah-Hartman, Zhou Wang

On Wed, Jan 13, 2021 at 08:10:28AM +0000, Tian, Kevin wrote:
> > >> Is this only for SVA? We may see more scenarios of using IOPF. For
> > >> example, when passing through devices to user level, the user's pages
> > >> could be managed dynamically instead of being allocated and pinned
> > >> statically.
> > >
> > > Hm, isn't that precisely what SVA does?  I don't understand the
> > > difference. That said, FEAT_IOPF doesn't have to be only for SVA. It could
> > > later be used as a prerequisite for some other feature. For special cases
> > > device drivers can always use the iommu_register_device_fault_handler()
> > > API and handle faults themselves.
> > 
> > From the IOMMU's perspective, there is a small difference between
> > these two. For SVA, the page table comes from the CPU side, so the IOMMU
> > only needs to call handle_mm_fault(); for the pass-through case above, the
> > page table is on the IOMMU side, so the device driver (probably VFIO) needs
> > to register a fault handler and call iommu_map/unmap() to serve the page
> > faults.
> > 
> > If we think about the nested mode (or dual-stage translation), it's more
> > complicated, since the kernel (probably VFIO) handles the second-level
> > page faults, while the first-level page faults need to be delivered to the
> > user-level guest. Obviously, this hasn't been fully implemented in any
> > IOMMU driver.
> > 
> 
> Thinking more, the confusion might come from the fact that we mixed
> hardware capability with software capability. IOMMU_FEAT describes
> the hardware capability. When FEAT_IOPF is set, it purely means that whatever
> page faults are enabled by the software are routed through the IOMMU.
> Nothing more. Then the software (IOMMU drivers) may choose to support
> only limited faulting scenarios and then evolve to support more complex
> usages gradually. For example, the intel-iommu driver only supports 1st-level
> faults (thus SVA) for now, while FEAT_IOPF as a separate feature may give the
> impression that 2nd-level faults are also allowed. From this angle, once we
> start to separate page faults from SVA, we may also need a way to report
> the software capability (e.g. a set of faulting categories) and also extend
> iommu_register_device_fault_handler() to allow specifying which
> category is enabled respectively. The example categories could be:
> 
> - IOPF_BIND, for page tables which are bound/linked to the IOMMU.
> Applies to the bare-metal SVA and guest SVA cases;

These don't seem to fit in the same software capability, since the action
to perform on incoming page faults is very different. In the first case
the fault handling is entirely contained within the IOMMU driver; in the
second case the IOMMU driver only tracks page requests, and offloads
handling to VFIO.

> - IOPF_MAP, for page tables which are managed through explicit IOMMU
> map interfaces. Applies to removing the VFIO pinning restriction;

From the IOMMU perspective this is the same as guest SVA, no? VFIO
registering a fault handler and doing the bulk of the work itself.

> Both categories can be enabled together in nested translation, with
> additional information provided to differentiate them in the fault
> information. Using the paging/staging level doesn't make much sense, as it's
> the IOMMU driver's internal knowledge, e.g. the VT-d driver plans to use the
> 1st level for GPA if there is no nesting and then turn to the 2nd level when
> nesting is enabled.

I guess detailing what's needed for nested IOPF can help the discussion,
although I haven't seen any concrete plan about implementing it, and it
still seems a couple of years away. There are two important steps with
nested IOPF:

(1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
    comes with this information, but a PRI page request doesn't. The IOMMU
    driver has to first translate the IOVA to a GPA, injecting the fault
    into the guest if this translation fails by using the usual
    iommu_report_device_fault().

(2) Translating the faulting GPA to a HVA that can be fed to
    handle_mm_fault(). That requires help from KVM, so another interface -
    either KVM registering GPA->HVA translation tables or IOMMU driver
    querying each translation. Either way it should be reusable by device
    drivers that implement IOPF themselves.

(1) could be enabled with iommu_dev_enable_feature(). (2) requires a more
complex interface. (2) alone might also be desirable - demand-paging for
level 2 only, no SVA for level 1.
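
To make step (1) concrete, the dispatch might look like this
(hypothetical code: iommu_iova_to_gpa(), report_fault_to_guest() and
iopf_handle_gpa() only name the operations described above):

	static int iopf_handle_nested(struct device *dev,
				      struct iommu_fault_page_request *prm)
	{
		phys_addr_t gpa;

		/* Walk the guest-owned stage-1 tables. A failed walk means
		 * the fault belongs to the guest: inject it. */
		if (iommu_iova_to_gpa(dev, prm->addr, prm->pasid, &gpa))
			return report_fault_to_guest(dev, prm);

		/* Stage-2 fault: translate GPA->HVA with KVM's help (step
		 * (2)) and feed it to handle_mm_fault(). */
		return iopf_handle_gpa(dev, gpa);
	}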

Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
receive incoming I/O page faults for this device and, when SVA is enabled,
feed them to the mm subsystem?  Enable that or return an error." I'm stuck
on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
above. IOMMU_DEV_FEAT_IOPF_FLAT?

That leaves space for the nested extensions. (1) above could be
IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with KVM (or
just an external fault handler) and could be used with either IOPF_FLAT or
IOPF_NESTED. We can figure out the details later. What do you think?

Thanks,
Jean

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
@ 2021-01-14 16:41             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-14 16:41 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Greg Kroah-Hartman, vivek.gautam, guohanjun, will, linux-acpi,
	zhangfei.gao, lenb, devicetree, Arnd Bergmann, robh+dt,
	linux-arm-kernel, David Woodhouse, rjw, iommu, sudeep.holla,
	robin.murphy, linux-accelerators

On Wed, Jan 13, 2021 at 08:10:28AM +0000, Tian, Kevin wrote:
> > >> Is this only for SVA? We may see more scenarios of using IOPF. For
> > >> example, when passing through devices to user level, the user's pages
> > >> could be managed dynamically instead of being allocated and pinned
> > >> statically.
> > >
> > > Hm, isn't that precisely what SVA does?  I don't understand the
> > > difference. That said FEAT_IOPF doesn't have to be only for SVA. It could
> > > later be used as a prerequisite some another feature. For special cases
> > > device drivers can always use the iommu_register_device_fault_handler()
> > > API and handle faults themselves.
> > 
> >  From the perspective of IOMMU, there is a little difference between
> > these two. For SVA, the page table is from CPU side, so IOMMU only needs
> > to call handle_mm_fault(); For above pass-through case, the page table
> > is from IOMMU side, so the device driver (probably VFIO) needs to
> > register a fault handler and call iommu_map/unmap() to serve the page
> > faults.
> > 
> > If we think about the nested mode (or dual-stage translation), it's more
> > complicated since the kernel (probably VFIO) handles the second level
> > page faults, while the first level page faults need to be delivered to
> > user-level guest. Obviously, this hasn't been fully implemented in any
> > IOMMU driver.
> > 
> 
> Thinking more the confusion might come from the fact that we mixed
> hardware capability with software capability. IOMMU_FEAT describes
> the hardware capability. When FEAT_IOPF is set, it purely means whatever
> page faults that are enabled by the software are routed through the IOMMU.
> Nothing more. Then the software (IOMMU drivers) may choose to support
> only limited faulting scenarios and then evolve to support more complex 
> usages gradually. For example, the intel-iommu driver only supports 1st-level
> fault (thus SVA) for now, while FEAT_IOPF as a separate feature may give the
> impression that 2nd-level faults are also allowed. From this angle once we 
> start to separate page fault from SVA, we may also need a way to report 
> the software capability (e.g. a set of faulting categories) and also extend
> iommu_register_device_fault_handler to allow specifying which 
> category is enabled respectively. The example categories could be:
> 
> - IOPF_BIND, for page tables which are bound/linked to the IOMMU. 
> Apply to bare metal SVA and guest SVA case;

These don't seem to fit in the same software capability, since the action
to perform on incoming page faults is very different. In the first case
the fault handling is entirely contained within the IOMMU driver; in the
second case the IOMMU driver only tracks page requests, and offloads
handling to VFIO.

> - IOPF_MAP, for page tables which are managed through explicit IOMMU
> map interfaces. Apply to removing VFIO pinning restriction;

From the IOMMU perspective this is the same as guest SVA, no? VFIO
registering a fault handler and doing the bulk of the work itself.

> Both categories can be enabled together in nested translation, with 
> additional information provided to differentiate them in fault information.
> Using paging/staging level doesn't make much sense as it's IOMMU driver's 
> internal knowledge, e.g. VT-d driver plans to use 1st level for GPA if no 
> nesting and then turn to 2nd level when nesting is enabled.

I guess detailing what's needed for nested IOPF can help the discussion,
although I haven't seen any concrete plan about implementing it, and it
still seems a couple of years away. There are two important steps with
nested IOPF:

(1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
    comes with this information, but a PRI page request doesn't. The IOMMU
    driver has to first translate the IOVA to a GPA, injecting the fault
    into the guest if this translation fails by using the usual
    iommu_report_device_fault().

(2) Translating the faulting GPA to a HVA that can be fed to
    handle_mm_fault(). That requires help from KVM, so another interface -
    either KVM registering GPA->HVA translation tables or IOMMU driver
    querying each translation. Either way it should be reusable by device
    drivers that implement IOPF themselves.

(1) could be enabled with iommu_dev_enable_feature(). (2) requires a more
complex interface. (2) alone might also be desirable - demand-paging for
level 2 only, no SVA for level 1.

Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
receive incoming I/O page faults for this device and, when SVA is enabled,
feed them to the mm subsystem?  Enable that or return an error." I'm stuck
on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
above. IOMMU_DEV_FEAT_IOPF_FLAT?

That leaves space for the nested extensions. (1) above could be
IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with KVM (or
just an external fault handler) and could be used with either IOPF_FLAT or
IOPF_NESTED. We can figure out the details later. What do you think?

Thanks,
Jean
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
@ 2021-01-14 16:41             ` Jean-Philippe Brucker
  0 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-14 16:41 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Greg Kroah-Hartman, vivek.gautam, guohanjun, will,
	lorenzo.pieralisi, joro, Zhou Wang, linux-acpi, zhangfei.gao,
	lenb, devicetree, Arnd Bergmann, eric.auger, robh+dt,
	Jonathan.Cameron, linux-arm-kernel, David Woodhouse, rjw,
	shameerali.kolothum.thodi, iommu, sudeep.holla, robin.murphy,
	linux-accelerators, Lu Baolu

On Wed, Jan 13, 2021 at 08:10:28AM +0000, Tian, Kevin wrote:
> > >> Is this only for SVA? We may see more scenarios of using IOPF. For
> > >> example, when passing through devices to user level, the user's pages
> > >> could be managed dynamically instead of being allocated and pinned
> > >> statically.
> > >
> > > Hm, isn't that precisely what SVA does?  I don't understand the
> > > difference. That said FEAT_IOPF doesn't have to be only for SVA. It could
> > > later be used as a prerequisite some another feature. For special cases
> > > device drivers can always use the iommu_register_device_fault_handler()
> > > API and handle faults themselves.
> > 
> >  From the perspective of IOMMU, there is a little difference between
> > these two. For SVA, the page table is from CPU side, so IOMMU only needs
> > to call handle_mm_fault(); For above pass-through case, the page table
> > is from IOMMU side, so the device driver (probably VFIO) needs to
> > register a fault handler and call iommu_map/unmap() to serve the page
> > faults.
> > 
> > If we think about the nested mode (or dual-stage translation), it's more
> > complicated since the kernel (probably VFIO) handles the second level
> > page faults, while the first level page faults need to be delivered to
> > user-level guest. Obviously, this hasn't been fully implemented in any
> > IOMMU driver.
> > 
> 
> Thinking more the confusion might come from the fact that we mixed
> hardware capability with software capability. IOMMU_FEAT describes
> the hardware capability. When FEAT_IOPF is set, it purely means whatever
> page faults that are enabled by the software are routed through the IOMMU.
> Nothing more. Then the software (IOMMU drivers) may choose to support
> only limited faulting scenarios and then evolve to support more complex 
> usages gradually. For example, the intel-iommu driver only supports 1st-level
> fault (thus SVA) for now, while FEAT_IOPF as a separate feature may give the
> impression that 2nd-level faults are also allowed. From this angle once we 
> start to separate page fault from SVA, we may also need a way to report 
> the software capability (e.g. a set of faulting categories) and also extend
> iommu_register_device_fault_handler to allow specifying which 
> category is enabled respectively. The example categories could be:
> 
> - IOPF_BIND, for page tables which are bound/linked to the IOMMU. 
> Apply to bare metal SVA and guest SVA case;

These don't seem to fit in the same software capability, since the action
to perform on incoming page faults is very different. In the first case
the fault handling is entirely contained within the IOMMU driver; in the
second case the IOMMU driver only tracks page requests, and offloads
handling to VFIO.

> - IOPF_MAP, for page tables which are managed through explicit IOMMU
> map interfaces. Apply to removing VFIO pinning restriction;

From the IOMMU perspective this is the same as guest SVA, no? VFIO
registering a fault handler and doing the bulk of the work itself.

> Both categories can be enabled together in nested translation, with 
> additional information provided to differentiate them in fault information.
> Using paging/staging level doesn't make much sense as it's IOMMU driver's 
> internal knowledge, e.g. VT-d driver plans to use 1st level for GPA if no 
> nesting and then turn to 2nd level when nesting is enabled.

I guess detailing what's needed for nested IOPF can help the discussion,
although I haven't seen any concrete plan about implementing it, and it
still seems a couple of years away. There are two important steps with
nested IOPF:

(1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
    comes with this information, but a PRI page request doesn't. The IOMMU
    driver has to first translate the IOVA to a GPA, injecting the fault
    into the guest if this translation fails by using the usual
    iommu_report_device_fault().

(2) Translating the faulting GPA to a HVA that can be fed to
    handle_mm_fault(). That requires help from KVM, so another interface -
    either KVM registering GPA->HVA translation tables or IOMMU driver
    querying each translation. Either way it should be reusable by device
    drivers that implement IOPF themselves.

(1) could be enabled with iommu_dev_enable_feature(). (2) requires a more
complex interface. (2) alone might also be desirable - demand-paging for
level 2 only, no SVA for level 1.
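
Roughly, the dispatch for a PRI-capable device could look like this (only
iommu_report_device_fault() and the fault event layout exist today; the ctx
and the stage1_translate()/gpa_to_hva()/fix_host_mm_fault() helpers are
invented for illustration):

/* iopf_ctx and all helpers except iommu_report_device_fault() are
 * hypothetical */
static int iopf_handle_nested(struct iopf_ctx *ctx,
			      struct iommu_fault_event *evt)
{
	u64 gpa;
	unsigned long hva;

	/* (1) walk the L1 tables; a failed walk is an L1 fault, forwarded
	 * to the guest through the usual reporting path */
	if (stage1_translate(ctx, evt->fault.prm.addr, &gpa))
		return iommu_report_device_fault(ctx->dev, evt);

	/* (2) GPA->HVA via KVM, then fix the L2 fault in the host mm */
	hva = gpa_to_hva(ctx->kvm, gpa);
	return fix_host_mm_fault(ctx->mm, hva);
}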

Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
receive incoming I/O page faults for this device and, when SVA is enabled,
feed them to the mm subsystem?  Enable that or return an error." I'm stuck
on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
above. IOMMU_DEV_FEAT_IOPF_FLAT?

That leaves space for the nested extensions. (1) above could be
IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with KVM (or
just an external fault handler) and could be used with either IOPF_FLAT or
IOPF_NESTED. We can figure out the details later. What do you think?
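
For reference, the naming would extend the current enum roughly like this (a
sketch of the proposal, not code from the series):

enum iommu_dev_features {
	IOMMU_DEV_FEAT_AUX,
	IOMMU_DEV_FEAT_SVA,
	IOMMU_DEV_FEAT_IOPF_FLAT,	/* faults fed to the host mm */
	IOMMU_DEV_FEAT_IOPF_NESTED,	/* L1 faults injected into the guest */
};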

Thanks,
Jean

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-14 16:41             ` Jean-Philippe Brucker
  (?)
@ 2021-01-16  3:54               ` Lu Baolu
  -1 siblings, 0 replies; 105+ messages in thread
From: Lu Baolu @ 2021-01-16  3:54 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Tian, Kevin
  Cc: baolu.lu, joro, will, lorenzo.pieralisi, robh+dt, guohanjun,
	sudeep.holla, rjw, lenb, robin.murphy, Jonathan.Cameron,
	eric.auger, iommu, devicetree, linux-acpi, linux-arm-kernel,
	linux-accelerators, vdumpa, zhangfei.gao,
	shameerali.kolothum.thodi, vivek.gautam, Arnd Bergmann,
	David Woodhouse, Greg Kroah-Hartman, Zhou Wang

Hi Jean,

On 2021/1/15 0:41, Jean-Philippe Brucker wrote:
> I guess detailing what's needed for nested IOPF can help the discussion,
> although I haven't seen any concrete plan about implementing it, and it
> still seems a couple of years away. There are two important steps with
> nested IOPF:
> 
> (1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
>      comes with this information, but a PRI page request doesn't. The IOMMU
>      driver has to first translate the IOVA to a GPA, injecting the fault
>      into the guest if this translation fails by using the usual
>      iommu_report_device_fault().
> 
> (2) Translating the faulting GPA to a HVA that can be fed to
>      handle_mm_fault(). That requires help from KVM, so another interface -
>      either KVM registering GPA->HVA translation tables or IOMMU driver
>      querying each translation. Either way it should be reusable by device
>      drivers that implement IOPF themselves.
> 
> (1) could be enabled with iommu_dev_enable_feature(). (2) requires a more
> complex interface. (2) alone might also be desirable - demand-paging for
> level 2 only, no SVA for level 1.
> 
> Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
> receive incoming I/O page faults for this device and, when SVA is enabled,
> feed them to the mm subsystem?  Enable that or return an error." I'm stuck
> on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
> is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
> above. IOMMU_DEV_FEAT_IOPF_FLAT?
> 
> That leaves space for the nested extensions. (1) above could be
> IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with KVM (or
> just an external fault handler) and could be used with either IOPF_FLAT or
> IOPF_NESTED. We can figure out the details later. What do you think?

I agree that we can define IOPF_ for current usage and leave space for
future extensions.

IOPF_FLAT represents IOPF on first-level translation. Currently, first-level
translation can be used in the cases below:

1) FL w/ internal Page Table: Kernel IOVA;
2) FL w/ external Page Table: VFIO passthrough;
3) FL w/ shared CPU page table: SVA

We don't need to support IOPF for case 1). Let's put it aside.

IOPF handling for 2) and 3) is different. Do we need to define different
names to distinguish these two cases?

Best regards,
baolu

^ permalink raw reply	[flat|nested] 105+ messages in thread

* RE: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-16  3:54               ` Lu Baolu
  (?)
@ 2021-01-18  6:54                 ` Tian, Kevin
  -1 siblings, 0 replies; 105+ messages in thread
From: Tian, Kevin @ 2021-01-18  6:54 UTC (permalink / raw)
  To: Lu Baolu, Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, Jonathan.Cameron, eric.auger, iommu,
	devicetree, linux-acpi, linux-arm-kernel, linux-accelerators,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Arnd Bergmann, David Woodhouse, Greg Kroah-Hartman, Zhou Wang

> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Saturday, January 16, 2021 11:54 AM
> 
> Hi Jean,
> 
> On 2021/1/15 0:41, Jean-Philippe Brucker wrote:
> > I guess detailing what's needed for nested IOPF can help the discussion,
> > although I haven't seen any concrete plan about implementing it, and it
> > still seems a couple of years away. There are two important steps with
> > nested IOPF:
> >
> > (1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
> >      comes with this information, but a PRI page request doesn't. The
> IOMMU
> >      driver has to first translate the IOVA to a GPA, injecting the fault
> >      into the guest if this translation fails by using the usual
> >      iommu_report_device_fault().

The IOMMU driver can walk the page tables to find out the level information.
If the walk terminates at the 1st level, inject the fault into the guest;
otherwise fix the mm fault at the 2nd level. It's not efficient compared to
hardware-provided info, but it's doable, and the actual overhead needs to be
measured (optimizations exist, e.g. having the fault client hint that no
2nd-level faults are expected when registering the fault handler in the
pinned case).
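
As a sketch (the walker is hypothetical; struct dmar_domain is VT-d's domain
type):

/* first_level_walk_succeeds() is hypothetical */
static int iopf_fault_level(struct dmar_domain *domain, u64 addr)
{
	/* walk the first-level tables; an early termination means the
	 * fault belongs to the 1st level and is injected into the guest */
	if (!first_level_walk_succeeds(domain, addr))
		return 1;

	/* otherwise the miss is at the 2nd level and the host fixes it */
	return 2;
}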

> >
> > (2) Translating the faulting GPA to a HVA that can be fed to
> >      handle_mm_fault(). That requires help from KVM, so another interface -
> >      either KVM registering GPA->HVA translation tables or IOMMU driver
> >      querying each translation. Either way it should be reusable by device
> >      drivers that implement IOPF themselves.

Or just leave it to the fault client (say VFIO here) to figure it out. VFIO
has the information about GPA->HPA and can then call handle_mm_fault() to fix
the received fault. The IOMMU driver just exports an interface through which
device drivers that implement IOPF themselves can report a fault, which is
then handled by the IOMMU core by reusing the same faulting path.
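
Once the client has a host virtual address in hand, fixing the fault boils
down to something like this (a sketch; handle_mm_fault() and the mmap lock
helpers are the real APIs, the wrapper itself is made up):

static int iopf_fix_user_fault(struct mm_struct *mm, unsigned long hva,
			       bool write)
{
	struct vm_area_struct *vma;
	unsigned int flags = FAULT_FLAG_USER | FAULT_FLAG_REMOTE;
	vm_fault_t ret = VM_FAULT_ERROR;

	if (write)
		flags |= FAULT_FLAG_WRITE;

	mmap_read_lock(mm);
	vma = find_vma(mm, hva);
	if (vma && hva >= vma->vm_start)
		ret = handle_mm_fault(vma, hva, flags, NULL);
	mmap_read_unlock(mm);

	return (ret & VM_FAULT_ERROR) ? -EFAULT : 0;
}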

> >
> > (1) could be enabled with iommu_dev_enable_feature(). (2) requires a
> more
> > complex interface. (2) alone might also be desirable - demand-paging for
> > level 2 only, no SVA for level 1.

Yes, this is what we want to point out. A general FEAT_IOPF implies more than
what this patch intended to address.

> >
> > Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
> > receive incoming I/O page faults for this device and, when SVA is enabled,
> > feed them to the mm subsystem?  Enable that or return an error." I'm stuck
> > on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
> > is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
> > above. IOMMU_DEV_FEAT_IOPF_FLAT?
> >
> > That leaves space for the nested extensions. (1) above could be
> > IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with
> KVM (or
> > just an external fault handler) and could be used with either IOPF_FLAT or
> > IOPF_NESTED. We can figure out the details later. What do you think?
> 
> I agree that we can define IOPF_ for current usage and leave space for
> future extensions.
> 
> IOPF_FLAT represents IOPF on first-level translation. Currently, first-level
> translation can be used in the cases below:
> 
> 1) FL w/ internal Page Table: Kernel IOVA;
> 2) FL w/ external Page Table: VFIO passthrough;
> 3) FL w/ shared CPU page table: SVA
> 
> We don't need to support IOPF for case 1). Let's put it aside.
> 
> IOPF handling for 2) and 3) is different. Do we need to define different
> names to distinguish these two cases?
> 

Defining feature names according to various use cases does not sound like a
clean approach. Ideally we should have just a general FEAT_IOPF, since
the hardware (at least VT-d) does support faults in 1st-level, 2nd-level or
nested configurations. We are running into this trouble just because it is
difficult for the software to evolve to enable the full hardware capability
in one batch. My last proposal was sort of keeping FEAT_IOPF as a general
capability for whether faults are delivered through the IOMMU or the ad-hoc
device, and then having a separate interface for whether IOPF reporting
is available under a specific configuration. The former is about the path
between the IOMMU and the device, while the latter is about the interface
between the IOMMU driver and its faulting client.

The reporting capability can be checked when the fault client is registering
its fault handler, and at this time the IOMMU driver knows how the related
mapping is configured (1st, 2nd, or nested) and whether fault reporting is
supported in such a configuration. We may introduce IOPF_REPORT_FLAT and
IOPF_REPORT_NESTED respectively. While IOPF_REPORT_FLAT detection is
straightforward (2 and 3 can be differentiated internally based on the
configured level), IOPF_REPORT_NESTED needs additional info to indicate which
level is concerned, since the vendor driver may not support fault reporting
at both levels, or the fault client may be interested in only one level
(e.g. with the 2nd level pinned).
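
e.g. something like below, purely to illustrate (today's
iommu_register_device_fault_handler() takes no flags argument; everything
here is hypothetical):

/* hypothetical reporting capabilities, checked at registration time */
#define IOPF_REPORT_FLAT	(1 << 0)
#define IOPF_REPORT_NESTED_L1	(1 << 1)	/* report 1st-level faults */
#define IOPF_REPORT_NESTED_L2	(1 << 2)	/* report 2nd-level faults */

int iommu_register_device_fault_handler(struct device *dev,
					iommu_dev_fault_handler_t handler,
					u32 flags, void *data);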

Thanks
Kevin

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-18  6:54                 ` Tian, Kevin
  (?)
@ 2021-01-19 10:16                   ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-19 10:16 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Lu Baolu, joro, will, lorenzo.pieralisi, robh+dt, guohanjun,
	sudeep.holla, rjw, lenb, robin.murphy, Jonathan.Cameron,
	eric.auger, iommu, devicetree, linux-acpi, linux-arm-kernel,
	linux-accelerators, vdumpa, zhangfei.gao,
	shameerali.kolothum.thodi, vivek.gautam, Arnd Bergmann,
	David Woodhouse, Greg Kroah-Hartman, Zhou Wang

On Mon, Jan 18, 2021 at 06:54:28AM +0000, Tian, Kevin wrote:
> > From: Lu Baolu <baolu.lu@linux.intel.com>
> > Sent: Saturday, January 16, 2021 11:54 AM
> > 
> > Hi Jean,
> > 
> > On 2021/1/15 0:41, Jean-Philippe Brucker wrote:
> > > I guess detailing what's needed for nested IOPF can help the discussion,
> > > although I haven't seen any concrete plan about implementing it, and it
> > > still seems a couple of years away. There are two important steps with
> > > nested IOPF:
> > >
> > > (1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
> > >      comes with this information, but a PRI page request doesn't. The
> > IOMMU
> > >      driver has to first translate the IOVA to a GPA, injecting the fault
> > >      into the guest if this translation fails by using the usual
> > >      iommu_report_device_fault().
> 
> The IOMMU driver can walk the page tables to find out the level information.
> If the walk terminates at the 1st level, inject the fault into the guest;
> otherwise fix the mm fault at the 2nd level. It's not efficient compared to
> hardware-provided info, but it's doable, and the actual overhead needs to be
> measured (optimizations exist, e.g. having the fault client hint that no
> 2nd-level faults are expected when registering the fault handler in the
> pinned case).
> 
> > >
> > > (2) Translating the faulting GPA to a HVA that can be fed to
> > >      handle_mm_fault(). That requires help from KVM, so another interface -
> > >      either KVM registering GPA->HVA translation tables or IOMMU driver
> > >      querying each translation. Either way it should be reusable by device
> > >      drivers that implement IOPF themselves.
> 
> Or just leave it to the fault client (say VFIO here) to figure it out. VFIO
> has the information about GPA->HPA and can then call handle_mm_fault() to fix
> the received fault. The IOMMU driver just exports an interface through which
> device drivers that implement IOPF themselves can report a fault, which is
> then handled by the IOMMU core by reusing the same faulting path.
> 
> > >
> > > (1) could be enabled with iommu_dev_enable_feature(). (2) requires a
> > more
> > > complex interface. (2) alone might also be desirable - demand-paging for
> > > level 2 only, no SVA for level 1.
> 
> Yes, this is what we want to point out. A general FEAT_IOPF implies more than
> what this patch intended to address.
> 
> > >
> > > Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
> > > receive incoming I/O page faults for this device and, when SVA is enabled,
> > > feed them to the mm subsystem?  Enable that or return an error." I'm stuck
> > > on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
> > > is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
> > > above. IOMMU_DEV_FEAT_IOPF_FLAT?
> > >
> > > That leaves space for the nested extensions. (1) above could be
> > > IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with
> > KVM (or
> > > just an external fault handler) and could be used with either IOPF_FLAT or
> > > IOPF_NESTED. We can figure out the details later. What do you think?
> > 
> > I agree that we can define IOPF_ for current usage and leave space for
> > future extensions.
> > 
> > IOPF_FLAT represents IOPF on first-level translation. Currently, first-level
> > translation can be used in the cases below:
> > 
> > 1) FL w/ internal Page Table: Kernel IOVA;
> > 2) FL w/ external Page Table: VFIO passthrough;
> > 3) FL w/ shared CPU page table: SVA
> > 
> > We don't need to support IOPF for case 1). Let's put it aside.
> > 
> > IOPF handling for 2) and 3) is different. Do we need to define different
> > names to distinguish these two cases?
> > 
> 
> Defining feature names according to various use cases does not sound like a
> clean approach. Ideally we should have just a general FEAT_IOPF, since
> the hardware (at least VT-d) does support faults in 1st-level, 2nd-level or
> nested configurations. We are running into this trouble just because it is
> difficult for the software to evolve to enable the full hardware capability
> in one batch. My last proposal was sort of keeping FEAT_IOPF as a general
> capability for whether faults are delivered through the IOMMU or the ad-hoc
> device, and then having a separate interface for whether IOPF reporting
> is available under a specific configuration. The former is about the path
> between the IOMMU and the device, while the latter is about the interface
> between the IOMMU driver and its faulting client.
> 
> The reporting capability can be checked when the fault client is registering
> its fault handler, and at this time the IOMMU driver knows how the related
> mapping is configured (1st, 2nd, or nested) and whether fault reporting is
> supported in such a configuration. We may introduce IOPF_REPORT_FLAT and
> IOPF_REPORT_NESTED respectively. While IOPF_REPORT_FLAT detection is
> straightforward (2 and 3 can be differentiated internally based on the
> configured level), IOPF_REPORT_NESTED needs additional info to indicate which
> level is concerned, since the vendor driver may not support fault reporting
> at both levels, or the fault client may be interested in only one level
> (e.g. with the 2nd level pinned).

I agree with this plan (provided I understood it correctly this time):
have IOMMU_DEV_FEAT_IOPF describing the IOPF interface between device and
IOMMU. Enabling it on its own doesn't do anything visible to the driver,
it just probes for capabilities and enables PRI if necessary. For host
SVA, since there is no additional communication between IOMMU and device
driver, enabling IOMMU_DEV_FEAT_SVA in addition to IOPF is sufficient.
Then, when implementing nested, we'll extend iommu_register_device_fault_handler()
with flags and parameters. That will also enable the advanced dispatching in (1).
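
So for host SVA the driver-facing sequence stays as simple as this (a sketch,
cleanup paths omitted; IOMMU_DEV_FEAT_IOPF is the feature added by this
series):

struct iommu_sva *handle;
int ret;

ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);
if (ret)
	return ret;

ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
if (ret)
	return ret;

handle = iommu_sva_bind_device(dev, current->mm, NULL);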

Will it be necessary to enable FEAT_IOPF when doing VFIO passthrough
(injecting to the guest or handling it with external page tables)?
I think that would be better. Currently a device driver registering a
fault handler doesn't know if it will get recoverable page faults or only
unrecoverable ones.

So I don't think this patch needs any change. Baolu, are you ok with
keeping this and patch 4?

Thanks,
Jean

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 01/10] iommu: Remove obsolete comment
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-19 11:11     ` Jonathan Cameron
  -1 siblings, 0 replies; 105+ messages in thread
From: Jonathan Cameron @ 2021-01-19 11:11 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam

On Fri, 8 Jan 2021 15:52:09 +0100
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> Commit 986d5ecc5699 ("iommu: Move fwspec->iommu_priv to struct
> dev_iommu") removed iommu_priv from fwspec. Update the struct doc.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Hi Jean-Philippe,

The flags parameter doesn't have any docs in this structure, and it should,
given that kernel-doc is meant to be complete.  It probably spits out a warning
for this if you build with W=1.
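
Something like the following one-liner would do (the exact wording is just a
suggestion):

 * @flags: IOMMU_FWSPEC_* flags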

Not sure if it makes sense to fix that in this same patch or in a different
one, as the patch responsible is a different one.
Looks like that came in:
Commit 5702ee24182f ("ACPI/IORT: Check ATS capability in root complex nodes")

Also, good to get this patch merged asap so we cut down on the noise in the
interesting part of this series!

FWIW
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Jonathan


> ---
>  include/linux/iommu.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b3f0e2018c62..26bcde5e7746 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -570,7 +570,6 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
>   * struct iommu_fwspec - per-device IOMMU instance data
>   * @ops: ops for this device's IOMMU
>   * @iommu_fwnode: firmware handle for this device's IOMMU
> - * @iommu_priv: IOMMU driver private data for this device
>   * @num_pasid_bits: number of PASID bits supported by this device
>   * @num_ids: number of associated device IDs
>   * @ids: IDs which this device may present to the IOMMU


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 02/10] iommu/arm-smmu-v3: Use device properties for pasid-num-bits
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-19 11:22     ` Jonathan Cameron
  -1 siblings, 0 replies; 105+ messages in thread
From: Jonathan Cameron @ 2021-01-19 11:22 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam

On Fri, 8 Jan 2021 15:52:10 +0100
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> The pasid-num-bits property shouldn't need a dedicated fwspec field,
> it's a job for device properties. Add properties for IORT, and access
> the number of PASID bits using device_property_read_u32().
> 
> Suggested-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Nice

Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Looks like we are fine not checking for a missing property, because
ssid_bits == 0 corresponds to PASID being off anyway.
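
For reference, roughly why that works (sketch; assuming master comes from
kzalloc() as in arm_smmu_probe_device()):

	/*
	 * master->ssid_bits starts out as 0. If "pasid-num-bits" is absent,
	 * device_property_read_u32() returns an error and leaves the output
	 * untouched, so the min() below yields 0, i.e. PASID disabled.
	 */
	device_property_read_u32(dev, "pasid-num-bits", &master->ssid_bits);
	master->ssid_bits = min(smmu->ssid_bits, master->ssid_bits);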


> ---
>  include/linux/iommu.h                       |  2 --
>  drivers/acpi/arm64/iort.c                   | 13 +++++++------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  3 ++-
>  drivers/iommu/of_iommu.c                    |  5 -----
>  4 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 26bcde5e7746..583c734b2e87 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -570,7 +570,6 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
>   * struct iommu_fwspec - per-device IOMMU instance data
>   * @ops: ops for this device's IOMMU
>   * @iommu_fwnode: firmware handle for this device's IOMMU
> - * @num_pasid_bits: number of PASID bits supported by this device
>   * @num_ids: number of associated device IDs
>   * @ids: IDs which this device may present to the IOMMU
>   */
> @@ -578,7 +577,6 @@ struct iommu_fwspec {
>  	const struct iommu_ops	*ops;
>  	struct fwnode_handle	*iommu_fwnode;
>  	u32			flags;
> -	u32			num_pasid_bits;
>  	unsigned int		num_ids;
>  	u32			ids[];
>  };
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index d4eac6d7e9fb..c9a8bbb74b09 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -968,15 +968,16 @@ static int iort_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>  static void iort_named_component_init(struct device *dev,
>  				      struct acpi_iort_node *node)
>  {
> +	struct property_entry props[2] = {};
>  	struct acpi_iort_named_component *nc;
> -	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> -
> -	if (!fwspec)
> -		return;
>  
>  	nc = (struct acpi_iort_named_component *)node->node_data;
> -	fwspec->num_pasid_bits = FIELD_GET(ACPI_IORT_NC_PASID_BITS,
> -					   nc->node_flags);
> +	props[0] = PROPERTY_ENTRY_U32("pasid-num-bits",
> +				      FIELD_GET(ACPI_IORT_NC_PASID_BITS,
> +						nc->node_flags));
> +
> +	if (device_add_properties(dev, props))
> +		dev_warn(dev, "Could not add device properties\n");
>  }
>  
>  static int iort_nc_iommu_map(struct device *dev, struct acpi_iort_node *node)
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 8ca7415d785d..6a53b4edf054 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2366,7 +2366,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>  		}
>  	}
>  
> -	master->ssid_bits = min(smmu->ssid_bits, fwspec->num_pasid_bits);
> +	device_property_read_u32(dev, "pasid-num-bits", &master->ssid_bits);
> +	master->ssid_bits = min(smmu->ssid_bits, master->ssid_bits);
>  
>  	/*
>  	 * Note that PASID must be enabled before, and disabled after ATS:
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index e505b9130a1c..a9d2df001149 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -210,11 +210,6 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
>  					     of_pci_iommu_init, &info);
>  	} else {
>  		err = of_iommu_configure_device(master_np, dev, id);
> -
> -		fwspec = dev_iommu_fwspec_get(dev);
> -		if (!err && fwspec)
> -			of_property_read_u32(master_np, "pasid-num-bits",
> -					     &fwspec->num_pasid_bits);
>  	}
>  
>  	/*


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-19 12:27     ` Jonathan Cameron
  -1 siblings, 0 replies; 105+ messages in thread
From: Jonathan Cameron @ 2021-01-19 12:27 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Arnd Bergmann, Greg Kroah-Hartman, Zhou Wang

On Fri, 8 Jan 2021 15:52:13 +0100
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> The IOPF (I/O Page Fault) feature is now enabled independently from the
> SVA feature, because some IOPF implementations are device-specific and
> do not require IOMMU support for PCIe PRI or Arm SMMU stall.
> 
> Enable IOPF unconditionally when enabling SVA for now. In the future, if
> a device driver implementing a uacce interface doesn't need IOPF
> support, it will need to tell the uacce module, for example with a new
> flag.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Hi Jean-Philippe,

A minor suggestion inline, but I'm not that bothered, so either way it
looks good to me.

> ---
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
> Cc: Zhou Wang <wangzhou1@hisilicon.com>
> ---
>  drivers/misc/uacce/uacce.c | 32 +++++++++++++++++++++++++-------
>  1 file changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index d07af4edfcac..41ef1eb62a14 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -385,6 +385,24 @@ static void uacce_release(struct device *dev)
>  	kfree(uacce);
>  }
>  
> +static unsigned int uacce_enable_sva(struct device *parent, unsigned int flags)
> +{
> +	if (!(flags & UACCE_DEV_SVA))
> +		return flags;
> +
> +	flags &= ~UACCE_DEV_SVA;
> +
> +	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
> +		return flags;
> +
> +	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
> +		iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
> +		return flags;
> +	}
> +
> +	return flags | UACCE_DEV_SVA;
> +}

I'm a great fan of paired enable/disable functions.  Whilst it would be
trivial, maybe it is worth introducing uacce_disable_sva()?  It could also
do the flags check internally, to match up with the enable path.
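
Something like this, perhaps (untested sketch):

static void uacce_disable_sva(struct device *parent, unsigned int flags)
{
	if (!(flags & UACCE_DEV_SVA))
		return;

	/* Mirror uacce_enable_sva(): drop SVA first, then IOPF */
	iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_SVA);
	iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
}

Both the err_with_uacce path and uacce_remove() would then collapse to a
single uacce_disable_sva() call.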


> +
>  /**
>   * uacce_alloc() - alloc an accelerator
>   * @parent: pointer of uacce parent device
> @@ -404,11 +422,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
>  	if (!uacce)
>  		return ERR_PTR(-ENOMEM);
>  
> -	if (flags & UACCE_DEV_SVA) {
> -		ret = iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA);
> -		if (ret)
> -			flags &= ~UACCE_DEV_SVA;
> -	}
> +	flags = uacce_enable_sva(parent, flags);
>  
>  	uacce->parent = parent;
>  	uacce->flags = flags;
> @@ -432,8 +446,10 @@ struct uacce_device *uacce_alloc(struct device *parent,
>  	return uacce;
>  
>  err_with_uacce:
> -	if (flags & UACCE_DEV_SVA)
> +	if (flags & UACCE_DEV_SVA) {
>  		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
> +		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
> +	}
>  	kfree(uacce);
>  	return ERR_PTR(ret);
>  }
> @@ -487,8 +503,10 @@ void uacce_remove(struct uacce_device *uacce)
>  	mutex_unlock(&uacce->queues_lock);
>  
>  	/* disable sva now since no opened queues */
> -	if (uacce->flags & UACCE_DEV_SVA)
> +	if (uacce->flags & UACCE_DEV_SVA) {
>  		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
> +		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
> +	}
>  
>  	if (uacce->cdev)
>  		cdev_device_del(uacce->cdev, &uacce->dev);


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 06/10] iommu: Add a page fault handler
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-19 13:38     ` Jonathan Cameron
  -1 siblings, 0 replies; 105+ messages in thread
From: Jonathan Cameron @ 2021-01-19 13:38 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam

On Fri, 8 Jan 2021 15:52:14 +0100
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> Some systems allow devices to handle I/O Page Faults in the core mm. For
> example systems implementing the PCIe PRI extension or Arm SMMU stall
> model. Infrastructure for reporting these recoverable page faults was
> added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device
> fault report API"). Add a page fault handler for host SVA.
> 
> IOMMU driver can now instantiate several fault workqueues and link them
> to IOPF-capable devices. Drivers can choose between a single global
> workqueue, one per IOMMU device, one per low-level fault queue, one per
> domain, etc.
> 
> When it receives a fault event, supposedly in an IRQ handler, the IOMMU

Why "supposedly"? Do you mean "most commonly" 

> driver reports the fault using iommu_report_device_fault(), which calls
> the registered handler. The page fault handler then calls the mm fault
> handler, and reports either success or failure with iommu_page_response().
> When the handler succeeds, the IOMMU retries the access.

For PRI that description is perhaps a bit misleading.  IIRC the IOMMU
will only retry when it gets a new ATS query from the device.

> 
> The iopf_param pointer could be embedded into iommu_fault_param. But
> putting iopf_param into the iommu_param structure allows us not to care
> about ordering between calls to iopf_queue_add_device() and
> iommu_register_device_fault_handler().
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

One really minor inconsistency inline that made me look twice.
With or without that tidied up, FWIW.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

...

> +/**
> + * iopf_queue_add_device - Add producer to the fault queue
> + * @queue: IOPF queue
> + * @dev: device to add
> + *
> + * Return: 0 on success and <0 on error.
> + */
> +int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
> +{
> +	int ret = -EBUSY;
> +	struct iopf_device_param *iopf_param;
> +	struct dev_iommu *param = dev->iommu;
> +
> +	if (!param)
> +		return -ENODEV;
> +
> +	iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL);
> +	if (!iopf_param)
> +		return -ENOMEM;
> +
> +	INIT_LIST_HEAD(&iopf_param->partial);
> +	iopf_param->queue = queue;
> +	iopf_param->dev = dev;
> +
> +	mutex_lock(&queue->lock);
> +	mutex_lock(&param->lock);
> +	if (!param->iopf_param) {
> +		list_add(&iopf_param->queue_list, &queue->devices);
> +		param->iopf_param = iopf_param;
> +		ret = 0;
> +	}
> +	mutex_unlock(&param->lock);
> +	mutex_unlock(&queue->lock);
> +
> +	if (ret)
> +		kfree(iopf_param);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iopf_queue_add_device);
> +
> +/**
> + * iopf_queue_remove_device - Remove producer from fault queue
> + * @queue: IOPF queue
> + * @dev: device to remove
> + *
> + * Caller makes sure that no more faults are reported for this device.
> + *
> + * Return: 0 on success and <0 on error.
> + */
> +int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
> +{
> +	int ret = 0;
I'm not that keen that the logic of ret here is basically the opposite of
that in the previous function: there we initialise it to an error and set
it to 0 on success, here we do the reverse.

Not that important either way, but right now it made me do a double take
whilst reading.
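
i.e. mirroring the add path, something like (untested):

	int ret = -EINVAL;
	...
	mutex_lock(&queue->lock);
	mutex_lock(&param->lock);
	iopf_param = param->iopf_param;
	if (iopf_param && iopf_param->queue == queue) {
		list_del(&iopf_param->queue_list);
		param->iopf_param = NULL;
		ret = 0;
	}
	mutex_unlock(&param->lock);
	mutex_unlock(&queue->lock);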

> +	struct iopf_fault *iopf, *next;
> +	struct iopf_device_param *iopf_param;
> +	struct dev_iommu *param = dev->iommu;
> +
> +	if (!param || !queue)
> +		return -EINVAL;
> +
> +	mutex_lock(&queue->lock);
> +	mutex_lock(&param->lock);
> +	iopf_param = param->iopf_param;
> +	if (iopf_param && iopf_param->queue == queue) {
> +		list_del(&iopf_param->queue_list);
> +		param->iopf_param = NULL;
> +	} else {
> +		ret = -EINVAL;
> +	}
> +	mutex_unlock(&param->lock);
> +	mutex_unlock(&queue->lock);
> +	if (ret)
> +		return ret;
> +
> +	/* Just in case some faults are still stuck */
> +	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list)
> +		kfree(iopf);
> +
> +	kfree(iopf_param);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(iopf_queue_remove_device);
> +

...


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 07/10] iommu/arm-smmu-v3: Maintain a SID->device structure
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-19 13:51     ` Jonathan Cameron
  -1 siblings, 0 replies; 105+ messages in thread
From: Jonathan Cameron @ 2021-01-19 13:51 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam

On Fri, 8 Jan 2021 15:52:15 +0100
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> When handling faults from the event or PRI queue, we need to find the
> struct device associated with a SID. Add a rb_tree to keep track of
> SIDs.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
One totally trivial point if you happen to be spinning again.

Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
with or without that.

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  13 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 161 ++++++++++++++++----
>  2 files changed, 144 insertions(+), 30 deletions(-)
> 

...

>  
> +static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
> +				  struct arm_smmu_master *master)
> +{
> +	int i;
> +	int ret = 0;
> +	struct arm_smmu_stream *new_stream, *cur_stream;
> +	struct rb_node **new_node, *parent_node = NULL;
> +	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
> +
> +	master->streams = kcalloc(fwspec->num_ids,
> +				  sizeof(struct arm_smmu_stream), GFP_KERNEL);
				  sizeof(*master->streams)

Nitpick :) Saves the reviewer going to check that master->streams is of the
type they expect.


> +	if (!master->streams)
> +		return -ENOMEM;
> +	master->num_streams = fwspec->num_ids;
> +
> +	mutex_lock(&smmu->streams_mutex);
> +	for (i = 0; i < fwspec->num_ids && !ret; i++) {
> +		u32 sid = fwspec->ids[i];
> +
> +		new_stream = &master->streams[i];
> +		new_stream->id = sid;
> +		new_stream->master = master;
> +
> +		/*
> +		 * Check the SIDs are in range of the SMMU and our stream table
> +		 */
> +		if (!arm_smmu_sid_in_range(smmu, sid)) {
> +			ret = -ERANGE;
> +			break;
> +		}
> +
> +		/* Ensure l2 strtab is initialised */
> +		if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> +			ret = arm_smmu_init_l2_strtab(smmu, sid);
> +			if (ret)
> +				break;
> +		}
> +
> +		/* Insert into SID tree */
> +		new_node = &(smmu->streams.rb_node);
> +		while (*new_node) {
> +			cur_stream = rb_entry(*new_node, struct arm_smmu_stream,
> +					      node);
> +			parent_node = *new_node;
> +			if (cur_stream->id > new_stream->id) {
> +				new_node = &((*new_node)->rb_left);
> +			} else if (cur_stream->id < new_stream->id) {
> +				new_node = &((*new_node)->rb_right);
> +			} else {
> +				dev_warn(master->dev,
> +					 "stream %u already in tree\n",
> +					 cur_stream->id);
> +				ret = -EINVAL;
> +				break;
> +			}
> +		}
> +
> +		if (!ret) {
> +			rb_link_node(&new_stream->node, parent_node, new_node);
> +			rb_insert_color(&new_stream->node, &smmu->streams);
> +		}
> +	}
> +
> +	if (ret) {
> +		for (; i > 0; i--)
> +			rb_erase(&master->streams[i].node, &smmu->streams);
> +		kfree(master->streams);
> +	}
> +	mutex_unlock(&smmu->streams_mutex);
> +
> +	return ret;
> +}
...

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 10/10] iommu/arm-smmu-v3: Add stall support for platform devices
  2021-01-08 14:52   ` Jean-Philippe Brucker
  (?)
@ 2021-01-19 17:28     ` Robin Murphy
  -1 siblings, 0 replies; 105+ messages in thread
From: Robin Murphy @ 2021-01-19 17:28 UTC (permalink / raw)
  To: Jean-Philippe Brucker, joro, will
  Cc: devicetree, linux-acpi, guohanjun, rjw, iommu, robh+dt,
	linux-accelerators, sudeep.holla, vivek.gautam, zhangfei.gao,
	linux-arm-kernel, lenb

On 2021-01-08 14:52, Jean-Philippe Brucker wrote:
> The SMMU provides a Stall model for handling page faults in platform
> devices. It is similar to PCIe PRI, but doesn't require devices to have
> their own translation cache. Instead, faulting transactions are parked
> and the OS is given a chance to fix the page tables and retry the
> transaction.
> 
> Enable stall for devices that support it (opt-in by firmware). When an
> event corresponds to a translation error, call the IOMMU fault handler.
> If the fault is recoverable, it will call us back to terminate or
> continue the stall.
> 
> To use stall device drivers need to enable IOMMU_DEV_FEAT_IOPF, which
> initializes the fault queue for the device.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> v9: Add IOMMU_DEV_FEAT_IOPF
> ---
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  61 ++++++
>   .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  70 ++++++-
>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 192 ++++++++++++++++--
>   3 files changed, 306 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 8ef6a1c48635..cb129870ef55 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -354,6 +354,13 @@
>   #define CMDQ_PRI_1_GRPID		GENMASK_ULL(8, 0)
>   #define CMDQ_PRI_1_RESP			GENMASK_ULL(13, 12)
>   
> +#define CMDQ_RESUME_0_SID		GENMASK_ULL(63, 32)
> +#define CMDQ_RESUME_0_RESP_TERM		0UL
> +#define CMDQ_RESUME_0_RESP_RETRY	1UL
> +#define CMDQ_RESUME_0_RESP_ABORT	2UL
> +#define CMDQ_RESUME_0_RESP		GENMASK_ULL(13, 12)

Nit: I think the SID field belongs here.
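
i.e., purely cosmetic, to keep the fields in ascending bit order like the
other commands:

#define CMDQ_RESUME_0_RESP_TERM		0UL
#define CMDQ_RESUME_0_RESP_RETRY	1UL
#define CMDQ_RESUME_0_RESP_ABORT	2UL
#define CMDQ_RESUME_0_RESP		GENMASK_ULL(13, 12)
#define CMDQ_RESUME_0_SID		GENMASK_ULL(63, 32)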

> +#define CMDQ_RESUME_1_STAG		GENMASK_ULL(15, 0)
> +
>   #define CMDQ_SYNC_0_CS			GENMASK_ULL(13, 12)
>   #define CMDQ_SYNC_0_CS_NONE		0
>   #define CMDQ_SYNC_0_CS_IRQ		1
> @@ -370,6 +377,25 @@
>   
>   #define EVTQ_0_ID			GENMASK_ULL(7, 0)
>   
> +#define EVT_ID_TRANSLATION_FAULT	0x10
> +#define EVT_ID_ADDR_SIZE_FAULT		0x11
> +#define EVT_ID_ACCESS_FAULT		0x12
> +#define EVT_ID_PERMISSION_FAULT		0x13
> +
> +#define EVTQ_0_SSV			(1UL << 11)
> +#define EVTQ_0_SSID			GENMASK_ULL(31, 12)
> +#define EVTQ_0_SID			GENMASK_ULL(63, 32)
> +#define EVTQ_1_STAG			GENMASK_ULL(15, 0)
> +#define EVTQ_1_STALL			(1UL << 31)
> +#define EVTQ_1_PRIV			(1UL << 33)
> +#define EVTQ_1_EXEC			(1UL << 34)
> +#define EVTQ_1_READ			(1UL << 35)

Nit: personally I'd find it a little clearer if these were named PnU, 
InD, and RnW to match the architecture, but quite possibly that's just 
me and those are gibberish to everyone else...
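
i.e. something like:

#define EVTQ_1_PnU			(1UL << 33)
#define EVTQ_1_InD			(1UL << 34)
#define EVTQ_1_RnW			(1UL << 35)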

> +#define EVTQ_1_S2			(1UL << 39)
> +#define EVTQ_1_CLASS			GENMASK_ULL(41, 40)
> +#define EVTQ_1_TT_READ			(1UL << 44)
> +#define EVTQ_2_ADDR			GENMASK_ULL(63, 0)
> +#define EVTQ_3_IPA			GENMASK_ULL(51, 12)
> +
>   /* PRI queue */
>   #define PRIQ_ENT_SZ_SHIFT		4
>   #define PRIQ_ENT_DWORDS			((1 << PRIQ_ENT_SZ_SHIFT) >> 3)
> @@ -462,6 +488,13 @@ struct arm_smmu_cmdq_ent {
>   			enum pri_resp		resp;
>   		} pri;
>   
> +		#define CMDQ_OP_RESUME		0x44
> +		struct {
> +			u32			sid;
> +			u16			stag;
> +			u8			resp;
> +		} resume;
> +
>   		#define CMDQ_OP_CMD_SYNC	0x46
>   		struct {
>   			u64			msiaddr;
> @@ -520,6 +553,7 @@ struct arm_smmu_cmdq_batch {
>   
>   struct arm_smmu_evtq {
>   	struct arm_smmu_queue		q;
> +	struct iopf_queue		*iopf;
>   	u32				max_stalls;
>   };
>   
> @@ -656,7 +690,9 @@ struct arm_smmu_master {
>   	struct arm_smmu_stream		*streams;
>   	unsigned int			num_streams;
>   	bool				ats_enabled;
> +	bool				stall_enabled;
>   	bool				sva_enabled;
> +	bool				iopf_enabled;
>   	struct list_head		bonds;
>   	unsigned int			ssid_bits;
>   };
> @@ -675,6 +711,7 @@ struct arm_smmu_domain {
>   
>   	struct io_pgtable_ops		*pgtbl_ops;
>   	bool				non_strict;
> +	bool				stall_enabled;
>   	atomic_t			nr_ats_masters;
>   
>   	enum arm_smmu_domain_stage	stage;
> @@ -713,6 +750,10 @@ bool arm_smmu_master_sva_supported(struct arm_smmu_master *master);
>   bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master);
>   int arm_smmu_master_enable_sva(struct arm_smmu_master *master);
>   int arm_smmu_master_disable_sva(struct arm_smmu_master *master);
> +bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master);
> +bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master);
> +int arm_smmu_master_enable_iopf(struct arm_smmu_master *master);
> +int arm_smmu_master_disable_iopf(struct arm_smmu_master *master);
>   struct iommu_sva *arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm,
>   				    void *drvdata);
>   void arm_smmu_sva_unbind(struct iommu_sva *handle);
> @@ -744,6 +785,26 @@ static inline int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
>   	return -ENODEV;
>   }
>   
> +static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master)
> +{
> +	return false;
> +}
> +
> +static inline bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master)
> +{
> +	return false;
> +}
> +
> +static inline int arm_smmu_master_enable_iopf(struct arm_smmu_master *master)
> +{
> +	return -ENODEV;
> +}
> +
> +static inline int arm_smmu_master_disable_iopf(struct arm_smmu_master *master)
> +{
> +	return -ENODEV;
> +}
> +
>   static inline struct iommu_sva *
>   arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
>   {
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index e13b092e6004..17acfee4f484 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -431,9 +431,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
>   	return true;
>   }
>   
> -static bool arm_smmu_iopf_supported(struct arm_smmu_master *master)
> +bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master)
>   {
> -	return false;
> +	return master->stall_enabled;
>   }
>   
>   bool arm_smmu_master_sva_supported(struct arm_smmu_master *master)
> @@ -441,8 +441,18 @@ bool arm_smmu_master_sva_supported(struct arm_smmu_master *master)
>   	if (!(master->smmu->features & ARM_SMMU_FEAT_SVA))
>   		return false;
>   
> -	/* SSID and IOPF support are mandatory for the moment */
> -	return master->ssid_bits && arm_smmu_iopf_supported(master);
> +	/* SSID support is mandatory for the moment */
> +	return master->ssid_bits;
> +}
> +
> +bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master)
> +{
> +	bool enabled;
> +
> +	mutex_lock(&sva_lock);
> +	enabled = master->iopf_enabled;
> +	mutex_unlock(&sva_lock);

Forgive me for being dim, but what's the locking synchronising against 
here? If we're expecting that master->iopf_enabled can change at any 
time, isn't whatever we've read potentially already invalid as soon as 
we've dropped the lock?
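
If it's just to make the read itself atomic, FWIW a plain READ_ONCE() 
would seem to do the same job - a sketch, assuming nothing else is 
computed under the lock here:

	/* single racy read; the lock adds nothing over this */
	return READ_ONCE(master->iopf_enabled);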

> +	return enabled;
>   }
>   
>   bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master)
> @@ -455,15 +465,67 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master)
>   	return enabled;
>   }
>   
> +int arm_smmu_master_enable_iopf(struct arm_smmu_master *master)
> +{
> +	int ret;
> +	struct device *dev = master->dev;
> +
> +	mutex_lock(&sva_lock);
> +	if (master->stall_enabled) {
> +		ret = iopf_queue_add_device(master->smmu->evtq.iopf, dev);
> +		if (ret)
> +			goto err_unlock;
> +	}
> +
> +	ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
> +	if (ret)
> +		goto err_remove_device;
> +	master->iopf_enabled = true;
> +	mutex_unlock(&sva_lock);
> +	return 0;
> +
> +err_remove_device:
> +	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
> +err_unlock:
> +	mutex_unlock(&sva_lock);
> +	return ret;
> +}
> +
>   int arm_smmu_master_enable_sva(struct arm_smmu_master *master)
>   {
>   	mutex_lock(&sva_lock);
> +	/*
> +	 * Drivers for devices supporting PRI or stall should enable IOPF first.
> +	 * Others have device-specific fault handlers and don't need IOPF, so
> +	 * this sanity check is a bit basic.
> +	 */
> +	if (arm_smmu_master_iopf_supported(master) && !master->iopf_enabled) {
> +		mutex_unlock(&sva_lock);
> +		return -EINVAL;
> +	}
>   	master->sva_enabled = true;
>   	mutex_unlock(&sva_lock);
>   
>   	return 0;
>   }
>   
> +int arm_smmu_master_disable_iopf(struct arm_smmu_master *master)
> +{
> +	struct device *dev = master->dev;
> +
> +	mutex_lock(&sva_lock);
> +	if (master->sva_enabled) {
> +		mutex_unlock(&sva_lock);
> +		return -EBUSY;
> +	}
> +
> +	iommu_unregister_device_fault_handler(dev);
> +	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
> +	master->iopf_enabled = false;
> +	mutex_unlock(&sva_lock);
> +	return 0;
> +}
> +
>   int arm_smmu_master_disable_sva(struct arm_smmu_master *master)
>   {
>   	mutex_lock(&sva_lock);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 2dbae2e6965d..1fea11d65cd3 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -32,6 +32,7 @@
>   #include <linux/amba/bus.h>
>   
>   #include "arm-smmu-v3.h"
> +#include "../../iommu-sva-lib.h"
>   
>   static bool disable_bypass = true;
>   module_param(disable_bypass, bool, 0444);
> @@ -319,6 +320,11 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>   		}
>   		cmd[1] |= FIELD_PREP(CMDQ_PRI_1_RESP, ent->pri.resp);
>   		break;
> +	case CMDQ_OP_RESUME:
> +		cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_SID, ent->resume.sid);
> +		cmd[0] |= FIELD_PREP(CMDQ_RESUME_0_RESP, ent->resume.resp);
> +		cmd[1] |= FIELD_PREP(CMDQ_RESUME_1_STAG, ent->resume.stag);
> +		break;
>   	case CMDQ_OP_CMD_SYNC:
>   		if (ent->sync.msiaddr) {
>   			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
> @@ -882,6 +888,44 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
>   	return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
>   }
>   
> +static int arm_smmu_page_response(struct device *dev,
> +				  struct iommu_fault_event *unused,
> +				  struct iommu_page_response *resp)
> +{
> +	struct arm_smmu_cmdq_ent cmd = {0};
> +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +	int sid = master->streams[0].id;

If that's going to be the case, should we explicitly prevent 
multi-stream devices from opting in to faults at all?
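
e.g. a minimal sketch of that, rejecting them at the support check 
(assuming num_streams is settled by the time anyone asks):

bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master)
{
	/* Only single-stream masters can be resumed unambiguously */
	return master->stall_enabled && master->num_streams == 1;
}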

> +	if (master->stall_enabled) {
> +		cmd.opcode		= CMDQ_OP_RESUME;
> +		cmd.resume.sid		= sid;
> +		cmd.resume.stag		= resp->grpid;
> +		switch (resp->code) {
> +		case IOMMU_PAGE_RESP_INVALID:
> +		case IOMMU_PAGE_RESP_FAILURE:
> +			cmd.resume.resp = CMDQ_RESUME_0_RESP_ABORT;
> +			break;
> +		case IOMMU_PAGE_RESP_SUCCESS:
> +			cmd.resume.resp = CMDQ_RESUME_0_RESP_RETRY;
> +			break;
> +		default:
> +			return -EINVAL;
> +		}
> +	} else {
> +		return -ENODEV;
> +	}
> +
> +	arm_smmu_cmdq_issue_cmd(master->smmu, &cmd);
> +	/*
> +	 * Don't send a SYNC, it doesn't do anything for RESUME or PRI_RESP.
> +	 * RESUME consumption guarantees that the stalled transaction will be
> +	 * terminated... at some point in the future. PRI_RESP is fire and
> +	 * forget.
> +	 */
> +
> +	return 0;
> +}
> +
>   /* Context descriptor manipulation functions */
>   void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
>   {
> @@ -991,7 +1035,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
>   	u64 val;
>   	bool cd_live;
>   	__le64 *cdptr;
> -	struct arm_smmu_device *smmu = smmu_domain->smmu;
>   
>   	if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
>   		return -E2BIG;
> @@ -1036,8 +1079,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
>   			FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) |
>   			CTXDESC_CD_0_V;
>   
> -		/* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */
> -		if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
> +		if (smmu_domain->stall_enabled)
>   			val |= CTXDESC_CD_0_S;
>   	}
>   
> @@ -1278,7 +1320,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>   			 FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1));
>   
>   		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
> -		   !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE))
> +		    !master->stall_enabled)
>   			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
>   
>   		val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> @@ -1355,7 +1397,6 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
>   	return 0;
>   }
>   
> -__maybe_unused
>   static struct arm_smmu_master *
>   arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
>   {
> @@ -1382,9 +1423,96 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid)
>   }
>   
>   /* IRQ and event handlers */
> +static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt)
> +{
> +	int ret;
> +	u32 perm = 0;
> +	struct arm_smmu_master *master;
> +	bool ssid_valid = evt[0] & EVTQ_0_SSV;
> +	u8 type = FIELD_GET(EVTQ_0_ID, evt[0]);
> +	u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]);
> +	struct iommu_fault_event fault_evt = { };
> +	struct iommu_fault *flt = &fault_evt.fault;
> +
> +	/* Stage-2 is always pinned at the moment */
> +	if (evt[1] & EVTQ_1_S2)
> +		return -EFAULT;
> +
> +	master = arm_smmu_find_master(smmu, sid);
> +	if (!master)
> +		return -EINVAL;
> +
> +	if (evt[1] & EVTQ_1_READ)
> +		perm |= IOMMU_FAULT_PERM_READ;
> +	else
> +		perm |= IOMMU_FAULT_PERM_WRITE;
> +
> +	if (evt[1] & EVTQ_1_EXEC)
> +		perm |= IOMMU_FAULT_PERM_EXEC;
> +
> +	if (evt[1] & EVTQ_1_PRIV)
> +		perm |= IOMMU_FAULT_PERM_PRIV;
> +
> +	if (evt[1] & EVTQ_1_STALL) {
> +		flt->type = IOMMU_FAULT_PAGE_REQ;
> +		flt->prm = (struct iommu_fault_page_request) {
> +			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
> +			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
> +			.perm = perm,
> +			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
> +		};
> +
> +		if (ssid_valid) {
> +			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
> +			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
> +		}

So if we get a bad ATS request with R=1, or a TLB/CFG conflict or any 
other imp-def event which happens to have bit 95 set, we might try to 
report it as something pageable? I would have thought we should look at 
the event code before *anything* else.
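
i.e. filter on the event ID up front, something like this sketch at the 
top of arm_smmu_handle_evt():

	switch (type) {
	case EVT_ID_TRANSLATION_FAULT:
	case EVT_ID_ADDR_SIZE_FAULT:
	case EVT_ID_ACCESS_FAULT:
	case EVT_ID_PERMISSION_FAULT:
		break;
	default:
		/* Not a fault we know how to report */
		return -EOPNOTSUPP;
	}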

> +	} else {
> +		flt->type = IOMMU_FAULT_DMA_UNRECOV;
> +		flt->event = (struct iommu_fault_unrecoverable) {
> +			.flags = IOMMU_FAULT_UNRECOV_ADDR_VALID |
> +				 IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID,
> +			.perm = perm,
> +			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
> +			.fetch_addr = FIELD_GET(EVTQ_3_IPA, evt[3]),
> +		};
> +
> +		if (ssid_valid) {
> +			flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID;
> +			flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
> +		}
> +
> +		switch (type) {
> +		case EVT_ID_TRANSLATION_FAULT:
> +		case EVT_ID_ADDR_SIZE_FAULT:
> +		case EVT_ID_ACCESS_FAULT:
> +			flt->event.reason = IOMMU_FAULT_REASON_PTE_FETCH;
> +			break;
> +		case EVT_ID_PERMISSION_FAULT:
> +			flt->event.reason = IOMMU_FAULT_REASON_PERMISSION;
> +			break;
> +		default:
> +			/* TODO: report other unrecoverable faults. */
> +			return -EFAULT;
> +		}
> +	}
> +
> +	ret = iommu_report_device_fault(master->dev, &fault_evt);
> +	if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
> +		/* Nobody cared, abort the access */
> +		struct iommu_page_response resp = {
> +			.pasid		= flt->prm.pasid,
> +			.grpid		= flt->prm.grpid,
> +			.code		= IOMMU_PAGE_RESP_FAILURE,
> +		};
> +		arm_smmu_page_response(master->dev, NULL, &resp);
> +	}
> +
> +	return ret;
> +}
> +
>   static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>   {
> -	int i;
> +	int i, ret;
>   	struct arm_smmu_device *smmu = dev;
>   	struct arm_smmu_queue *q = &smmu->evtq.q;
>   	struct arm_smmu_ll_queue *llq = &q->llq;
> @@ -1394,11 +1522,14 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>   		while (!queue_remove_raw(q, evt)) {
>   			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
>   
> -			dev_info(smmu->dev, "event 0x%02x received:\n", id);
> -			for (i = 0; i < ARRAY_SIZE(evt); ++i)
> -				dev_info(smmu->dev, "\t0x%016llx\n",
> -					 (unsigned long long)evt[i]);
> -
> +			ret = arm_smmu_handle_evt(smmu, evt);
> +			if (ret) {

Maybe make this an "if (!ret) continue;" to save the indentation from 
getting even more out of hand?
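
i.e. something like:

			ret = arm_smmu_handle_evt(smmu, evt);
			if (!ret)
				continue;

			dev_info(smmu->dev, "event 0x%02x received:\n", id);
			for (i = 0; i < ARRAY_SIZE(evt); ++i)
				dev_info(smmu->dev, "\t0x%016llx\n",
					 (unsigned long long)evt[i]);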

> +				dev_info(smmu->dev, "event 0x%02x received:\n",
> +					 id);
> +				for (i = 0; i < ARRAY_SIZE(evt); ++i)
> +					dev_info(smmu->dev, "\t0x%016llx\n",
> +						 (unsigned long long)evt[i]);
> +			}
>   		}
>   
>   		/*
> @@ -1903,6 +2034,8 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
>   
>   	cfg->s1cdmax = master->ssid_bits;
>   
> +	smmu_domain->stall_enabled = master->stall_enabled;
> +
>   	ret = arm_smmu_alloc_cd_tables(smmu_domain);
>   	if (ret)
>   		goto out_free_asid;
> @@ -2250,6 +2383,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>   			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
>   		ret = -EINVAL;
>   		goto out_unlock;
> +	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> +		   smmu_domain->stall_enabled != master->stall_enabled) {

I appreciate that it's probably a fair bit more complex, but it would be 
nice to at least plan for resolving this decision later, i.e. at a point 
where a caller shows an interest in actually using stalls. Obviously the 
first devices advertising stall capabilities will be the ones that do 
want to use it for their primary functionality, and those are what's 
driving the work here. However, once this all matures, firmwares may 
start annotating any stallable devices as such for completeness, rather 
than assuming any specific usage. At that point it would be a pain if, 
say, assigning two devices to the same VFIO domain for old-fashioned 
pinned DMA were suddenly prevented for irrelevant reasons, just because 
of a DT/IORT update.
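
One (purely hypothetical) way there might be to key the mismatch check 
off whether the master actually has IOPF enabled, rather than off the 
firmware property alone:

	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
		   master->iopf_enabled &&
		   smmu_domain->stall_enabled != master->stall_enabled) {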

> +		dev_err(dev, "cannot attach to stall-%s domain\n",
> +			smmu_domain->stall_enabled ? "enabled" : "disabled");
> +		ret = -EINVAL;
> +		goto out_unlock;
>   	}
>   
>   	master->domain = smmu_domain;
> @@ -2484,6 +2623,11 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>   		master->ssid_bits = min_t(u8, master->ssid_bits,
>   					  CTXDESC_LINEAR_CDMAX);
>   
> +	if ((smmu->features & ARM_SMMU_FEAT_STALLS &&
> +	     device_property_read_bool(dev, "dma-can-stall")) ||
> +	    smmu->features & ARM_SMMU_FEAT_STALL_FORCE)
> +		master->stall_enabled = true;
> +
>   	return &smmu->iommu;
>   
>   err_free_master:
> @@ -2502,6 +2646,7 @@ static void arm_smmu_release_device(struct device *dev)
>   
>   	master = dev_iommu_priv_get(dev);
>   	WARN_ON(arm_smmu_master_sva_enabled(master));
> +	iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
>   	arm_smmu_detach_dev(master);
>   	arm_smmu_disable_pasid(master);
>   	arm_smmu_remove_master(master);
> @@ -2629,6 +2774,8 @@ static bool arm_smmu_dev_has_feature(struct device *dev,
>   		return false;
>   
>   	switch (feat) {
> +	case IOMMU_DEV_FEAT_IOPF:
> +		return arm_smmu_master_iopf_supported(master);
>   	case IOMMU_DEV_FEAT_SVA:
>   		return arm_smmu_master_sva_supported(master);
>   	default:
> @@ -2645,6 +2792,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev,
>   		return false;
>   
>   	switch (feat) {
> +	case IOMMU_DEV_FEAT_IOPF:
> +		return arm_smmu_master_iopf_enabled(master);
>   	case IOMMU_DEV_FEAT_SVA:
>   		return arm_smmu_master_sva_enabled(master);
>   	default:
> @@ -2655,6 +2804,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev,
>   static int arm_smmu_dev_enable_feature(struct device *dev,
>   				       enum iommu_dev_features feat)
>   {
> +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
>   	if (!arm_smmu_dev_has_feature(dev, feat))
>   		return -ENODEV;
>   
> @@ -2662,8 +2813,10 @@ static int arm_smmu_dev_enable_feature(struct device *dev,
>   		return -EBUSY;
>   
>   	switch (feat) {
> +	case IOMMU_DEV_FEAT_IOPF:
> +		return arm_smmu_master_enable_iopf(master);
>   	case IOMMU_DEV_FEAT_SVA:
> -		return arm_smmu_master_enable_sva(dev_iommu_priv_get(dev));
> +		return arm_smmu_master_enable_sva(master);
>   	default:
>   		return -EINVAL;
>   	}
> @@ -2672,12 +2825,16 @@ static int arm_smmu_dev_enable_feature(struct device *dev,
>   static int arm_smmu_dev_disable_feature(struct device *dev,
>   					enum iommu_dev_features feat)
>   {
> +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
>   	if (!arm_smmu_dev_feature_enabled(dev, feat))
>   		return -EINVAL;
>   
>   	switch (feat) {
> +	case IOMMU_DEV_FEAT_IOPF:
> +		return arm_smmu_master_disable_iopf(master);
>   	case IOMMU_DEV_FEAT_SVA:
> -		return arm_smmu_master_disable_sva(dev_iommu_priv_get(dev));
> +		return arm_smmu_master_disable_sva(master);
>   	default:
>   		return -EINVAL;
>   	}
> @@ -2708,6 +2865,7 @@ static struct iommu_ops arm_smmu_ops = {
>   	.sva_bind		= arm_smmu_sva_bind,
>   	.sva_unbind		= arm_smmu_sva_unbind,
>   	.sva_get_pasid		= arm_smmu_sva_get_pasid,
> +	.page_response		= arm_smmu_page_response,
>   	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
>   };
>   
> @@ -2785,6 +2943,7 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
>   static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
>   {
>   	int ret;
> +	bool sva = arm_smmu_sva_supported(smmu);
>   
>   	/* cmdq */
>   	ret = arm_smmu_init_one_queue(smmu, &smmu->cmdq.q, ARM_SMMU_CMDQ_PROD,
> @@ -2804,6 +2963,12 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
>   	if (ret)
>   		return ret;
>   
> +	if (sva && smmu->features & ARM_SMMU_FEAT_STALLS) {

Surely you could just test for ARM_SMMU_FEAT_SVA by now rather than go 
through the whole of arm_smmu_sva_supported() again?
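
i.e. (if the feature bits are indeed all settled by this point) just:

	if (smmu->features & ARM_SMMU_FEAT_SVA &&
	    smmu->features & ARM_SMMU_FEAT_STALLS) {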

Robin.

> +		smmu->evtq.iopf = iopf_queue_alloc(dev_name(smmu->dev));
> +		if (!smmu->evtq.iopf)
> +			return -ENOMEM;
> +	}
> +
>   	/* priq */
>   	if (!(smmu->features & ARM_SMMU_FEAT_PRI))
>   		return 0;
> @@ -3718,6 +3883,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
>   	iommu_device_unregister(&smmu->iommu);
>   	iommu_device_sysfs_remove(&smmu->iommu);
>   	arm_smmu_device_disable(smmu);
> +	iopf_queue_free(smmu->evtq.iopf);
>   
>   	return 0;
>   }
> 

^ permalink raw reply	[flat|nested] 105+ messages in thread

>   	.sva_bind		= arm_smmu_sva_bind,
>   	.sva_unbind		= arm_smmu_sva_unbind,
>   	.sva_get_pasid		= arm_smmu_sva_get_pasid,
> +	.page_response		= arm_smmu_page_response,
>   	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
>   };
>   
> @@ -2785,6 +2943,7 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
>   static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
>   {
>   	int ret;
> +	bool sva = arm_smmu_sva_supported(smmu);
>   
>   	/* cmdq */
>   	ret = arm_smmu_init_one_queue(smmu, &smmu->cmdq.q, ARM_SMMU_CMDQ_PROD,
> @@ -2804,6 +2963,12 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
>   	if (ret)
>   		return ret;
>   
> +	if (sva && smmu->features & ARM_SMMU_FEAT_STALLS) {

Surely you could just test for ARM_SMMU_FEAT_SVA by now rather than go 
through the whole of arm_smmu_sva_supported() again?
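
i.e. just (assuming the SVA feature bit has already been set during
probe):

	if (smmu->features & ARM_SMMU_FEAT_SVA &&
	    smmu->features & ARM_SMMU_FEAT_STALLS) {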

Robin.

> +		smmu->evtq.iopf = iopf_queue_alloc(dev_name(smmu->dev));
> +		if (!smmu->evtq.iopf)
> +			return -ENOMEM;
> +	}
> +
>   	/* priq */
>   	if (!(smmu->features & ARM_SMMU_FEAT_PRI))
>   		return 0;
> @@ -3718,6 +3883,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
>   	iommu_device_unregister(&smmu->iommu);
>   	iommu_device_sysfs_remove(&smmu->iommu);
>   	arm_smmu_device_disable(smmu);
> +	iopf_queue_free(smmu->evtq.iopf);
>   
>   	return 0;
>   }
> 

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA
  2021-01-19 10:16                   ` Jean-Philippe Brucker
  (?)
@ 2021-01-20  1:57                     ` Lu Baolu
  -1 siblings, 0 replies; 105+ messages in thread
From: Lu Baolu @ 2021-01-20  1:57 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Tian, Kevin
  Cc: baolu.lu, joro, will, lorenzo.pieralisi, robh+dt, guohanjun,
	sudeep.holla, rjw, lenb, robin.murphy, Jonathan.Cameron,
	eric.auger, iommu, devicetree, linux-acpi, linux-arm-kernel,
	linux-accelerators, vdumpa, zhangfei.gao,
	shameerali.kolothum.thodi, vivek.gautam, Arnd Bergmann,
	David Woodhouse, Greg Kroah-Hartman, Zhou Wang

Hi Jean,

On 1/19/21 6:16 PM, Jean-Philippe Brucker wrote:
> On Mon, Jan 18, 2021 at 06:54:28AM +0000, Tian, Kevin wrote:
>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>> Sent: Saturday, January 16, 2021 11:54 AM
>>>
>>> Hi Jean,
>>>
>>> On 2021/1/15 0:41, Jean-Philippe Brucker wrote:
>>>> I guess detailing what's needed for nested IOPF can help the discussion,
>>>> although I haven't seen any concrete plan about implementing it, and it
>>>> still seems a couple of years away. There are two important steps with
>>>> nested IOPF:
>>>>
>>>> (1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event
>>>>       comes with this information, but a PRI page request doesn't. The
>>> IOMMU
>>>>       driver has to first translate the IOVA to a GPA, injecting the fault
>>>>       into the guest if this translation fails by using the usual
>>>>       iommu_report_device_fault().
>>
>> The IOMMU driver can walk the page tables to find out the level information.
>> If the walk terminates at the 1st level, inject to the guest. Otherwise fix the
>> mm fault at 2nd level. It's not efficient compared to hardware-provided info,
>> but it's doable, and the actual overhead needs to be measured (optimizations
>> exist, e.g. having the fault client hint that no 2nd-level faults are
>> expected when registering the fault handler, in the pinned case).
>>
>>>>
>>>> (2) Translating the faulting GPA to a HVA that can be fed to
>>>>       handle_mm_fault(). That requires help from KVM, so another interface -
>>>>       either KVM registering GPA->HVA translation tables or IOMMU driver
>>>>       querying each translation. Either way it should be reusable by device
>>>>       drivers that implement IOPF themselves.
>>
>> Or just leave it to the fault client (say VFIO here) to figure it out. VFIO has the
>> information about GPA->HPA and can then call handle_mm_fault to fix the
>> received fault. The IOMMU driver just exports an interface for the device drivers
>> which implement IOPF themselves to report a fault which is then handled by
>> the IOMMU core by reusing the same faulting path.
>>
>>>>
>>>> (1) could be enabled with iommu_dev_enable_feature(). (2) requires a
>>> more
>>>> complex interface. (2) alone might also be desirable - demand-paging for
>>>> level 2 only, no SVA for level 1.
>>
>> Yes, this is what we want to point out. A general FEAT_IOPF implies more than
>> what this patch intended to address.
>>
>>>>
>>>> Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
>>>> receive incoming I/O page faults for this device and, when SVA is enabled,
>>>> feed them to the mm subsystem?  Enable that or return an error." I'm stuck
>>>> on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
>>>> is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
>>>> above. IOMMU_DEV_FEAT_IOPF_FLAT?
>>>>
>>>> That leaves space for the nested extensions. (1) above could be
>>>> IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with
>>> KVM (or
>>>> just an external fault handler) and could be used with either IOPF_FLAT or
>>>> IOPF_NESTED. We can figure out the details later. What do you think?
>>>
>>> I agree that we can define IOPF_ for current usage and leave space for
>>> future extensions.
>>>
>>> IOPF_FLAT represents IOPF on first-level translation; currently, first-
>>> level translation can be used in the cases below.
>>>
>>> 1) FL w/ internal Page Table: Kernel IOVA;
>>> 2) FL w/ external Page Table: VFIO passthrough;
>>> 3) FL w/ shared CPU page table: SVA
>>>
>>> We don't need to support IOPF for case 1). Let's put it aside.
>>>
>>> IOPF handling of 2) and 3) is different. Do we need to define different
>>> names to distinguish these two cases?
>>>
>>
>> Defining feature names according to various use cases does not sound like
>> a clean way. Ideally we should have just a general FEAT_IOPF, since
>> the hardware (at least VT-d) does support faults in either 1st-level, 2nd-
>> level or nested configurations. We are entering this trouble just because
>> there is difficulty for the software evolving to enable full hardware cap
>> in one batch. My last proposal was sort of keeping FEAT_IOPF as a general
>> capability for whether delivering fault through the IOMMU or the ad-hoc
>> device, and then having a separate interface for whether IOPF reporting
>> is available under a specific configuration. The former is about the path
>> between the IOMMU and the device, while the latter is about the interface
>> between the IOMMU driver and its faulting client.
>>
>> The reporting capability can be checked when the fault client is registering
>> its fault handler, and at this time the IOMMU driver knows how the related
>> mapping is configured (1st, 2nd, or nested) and whether fault reporting is
>> supported in such configuration. We may introduce IOPF_REPORT_FLAT and
>> IOPF_REPORT_NESTED respectively. While IOPF_REPORT_FLAT detection is
>> straightforward (2 and 3 can be differentiated internally based on configured
>> level), IOPF_REPORT_NESTED needs additional info to indicate which level is
>> concerned since the vendor driver may not support fault reporting in both
>> levels or the fault client may be interested in only one level (e.g. with 2nd
>> level pinned).
> 
> I agree with this plan (provided I understood it correctly this time):
> have IOMMU_DEV_FEAT_IOPF describing the IOPF interface between device and
> IOMMU. Enabling it on its own doesn't do anything visible to the driver,
> it just probes for capabilities and enables PRI if necessary. For host
> SVA, since there is no additional communication between IOMMU and device
> driver, enabling IOMMU_DEV_FEAT_SVA in addition to IOPF is sufficient.
> Then when implementing nested we'll extend iommu_register_fault_handler()
> with flags and parameters. That will also enable advanced dispatching (1).
> 
> Will it be necessary to enable FEAT_IOPF when doing VFIO passthrough
> (injecting to the guest or handling it with external page tables)?
> I think that would be better. Currently a device driver registering a
> fault handler doesn't know if it will get recoverable page faults or only
> unrecoverable ones.
> 
> So I don't think this patch needs any change. Baolu, are you ok with
> keeping this and patch 4?

It sounds good to me. Keep FEAT_IOPF as the IOMMU capability for
generating I/O page faults, and differentiate between the different kinds
of I/O page fault by extending the fault handler registration interface.
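
For the record, a purely illustrative sketch of what that extension might
look like (none of these flags exist today, and the names are TBD):

	/* Hypothetical reporting-capability flags, per the discussion above */
	#define IOMMU_FAULT_REPORT_FLAT		(1 << 0)
	#define IOMMU_FAULT_REPORT_NESTED_L1	(1 << 1)
	#define IOMMU_FAULT_REPORT_NESTED_L2	(1 << 2)

	int iommu_register_device_fault_handler(struct device *dev,
						iommu_dev_fault_handler_t handler,
						unsigned int flags, void *data);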

> 
> Thanks,
> Jean
> 

Best regards,
baolu

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 01/10] iommu: Remove obsolete comment
  2021-01-19 11:11     ` Jonathan Cameron
  (?)
@ 2021-01-20 17:41       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-20 17:41 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam

On Tue, Jan 19, 2021 at 11:11:44AM +0000, Jonathan Cameron wrote:
> On Fri, 8 Jan 2021 15:52:09 +0100
> Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> 
> > Commit 986d5ecc5699 ("iommu: Move fwspec->iommu_priv to struct
> > dev_iommu") removed iommu_priv from fwspec. Update the struct doc.
> > 
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> 
> Hi Jean-Philippe,
> 
> The flags parameter doesn't have any docs in this structure, and it
> should, given that kernel-doc is supposed to be complete. It probably
> spits out a warning for this if you build with W=1.

Ah right, I had a patch removing the flags field locally, but I'm not
planning to upstream that one anymore. I don't mind fixing up the comment
in the next version.
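
(e.g. adding something along the lines of

	 * @flags: IOMMU_FWSPEC_* flags

to the struct iommu_fwspec kernel-doc.)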

Thanks,
Jean

> 
> Not sure if it makes sense to fix that in this same patch, or in a
> different one, as the responsible patch is a different one.
> Looks like that came in:
> Commit 5702ee24182f ("ACPI/IORT: Check ATS capability in root complex nodes")
> 
> Also, good to get this patch merged asap so we cut down on the noise in the
> interesting part of this series!
> 
> FWIW
> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> Jonathan

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
  2021-01-19 12:27     ` Jonathan Cameron
  (?)
@ 2021-01-20 17:42       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-20 17:42 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam,
	Arnd Bergmann, Greg Kroah-Hartman, Zhou Wang

On Tue, Jan 19, 2021 at 12:27:59PM +0000, Jonathan Cameron wrote:
> On Fri, 8 Jan 2021 15:52:13 +0100
> Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> 
> > The IOPF (I/O Page Fault) feature is now enabled independently from the
> > SVA feature, because some IOPF implementations are device-specific and
> > do not require IOMMU support for PCIe PRI or Arm SMMU stall.
> > 
> > Enable IOPF unconditionally when enabling SVA for now. In the future, if
> > a device driver implementing a uacce interface doesn't need IOPF
> > support, it will need to tell the uacce module, for example with a new
> > flag.
> > 
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Hi Jean-Philippe,
> 
> A minor suggestion inline but I'm not that bothered so either way
> looks good to me.

No problem, I'll add the disable function
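
Roughly (a sketch, assuming the existing uacce_device layout):

	static void uacce_disable_sva(struct uacce_device *uacce)
	{
		if (!(uacce->flags & UACCE_DEV_SVA))
			return;

		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
	}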

Thanks,
Jean

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 06/10] iommu: Add a page fault handler
  2021-01-19 13:38     ` Jonathan Cameron
  (?)
@ 2021-01-20 17:43       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-20 17:43 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: joro, will, lorenzo.pieralisi, robh+dt, guohanjun, sudeep.holla,
	rjw, lenb, robin.murphy, eric.auger, iommu, devicetree,
	linux-acpi, linux-arm-kernel, linux-accelerators, baolu.lu,
	vdumpa, zhangfei.gao, shameerali.kolothum.thodi, vivek.gautam

On Tue, Jan 19, 2021 at 01:38:19PM +0000, Jonathan Cameron wrote:
> On Fri, 8 Jan 2021 15:52:14 +0100
> Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> 
> > Some systems allow devices to handle I/O Page Faults in the core mm. For
> > example systems implementing the PCIe PRI extension or Arm SMMU stall
> > model. Infrastructure for reporting these recoverable page faults was
> > added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device
> > fault report API"). Add a page fault handler for host SVA.
> > 
> > The IOMMU driver can now instantiate several fault workqueues and link them
> > to IOPF-capable devices. Drivers can choose between a single global
> > workqueue, one per IOMMU device, one per low-level fault queue, one per
> > domain, etc.
> > 
> > When it receives a fault event, supposedly in an IRQ handler, the IOMMU
> 
> Why "supposedly"? Do you mean "most commonly" 
> 
> > driver reports the fault using iommu_report_device_fault(), which calls
> > the registered handler. The page fault handler then calls the mm fault
> > handler, and reports either success or failure with iommu_page_response().
> > When the handler succeeds, the IOMMU retries the access.
> 
> For PRI that description is perhaps a bit misleading. IIRC the IOMMU
> will only retry when it gets a new ATS query.
> 
> > 
> > The iopf_param pointer could be embedded into iommu_fault_param. But
> > putting iopf_param into the iommu_param structure allows us not to care
> > about ordering between calls to iopf_queue_add_device() and
> > iommu_register_device_fault_handler().
> > 
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> 
> One really minor inconsistency inline that made me look twice...
> With or without that tidied up, FWIW:
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks! I'll fix these up and resend
Jean


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 10/10] iommu/arm-smmu-v3: Add stall support for platform devices
  2021-01-19 17:28     ` Robin Murphy
  (?)
@ 2021-01-20 17:55       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 105+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-20 17:55 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will, devicetree, linux-acpi, guohanjun, rjw, iommu,
	robh+dt, linux-accelerators, sudeep.holla, vivek.gautam,
	zhangfei.gao, linux-arm-kernel, lenb

On Tue, Jan 19, 2021 at 05:28:21PM +0000, Robin Murphy wrote:
> On 2021-01-08 14:52, Jean-Philippe Brucker wrote:
> > +#define EVTQ_1_PRIV			(1UL << 33)
> > +#define EVTQ_1_EXEC			(1UL << 34)
> > +#define EVTQ_1_READ			(1UL << 35)
> 
> Nit: personally I'd find it a little clearer if these were named PnU, InD,
> and RnW to match the architecture, but quite possibly that's just me and
> those are gibberish to everyone else...

No problem, I think it's still decipherable without a spec

> > +bool arm_smmu_master_iopf_enabled(struct arm_smmu_master *master)
> > +{
> > +	bool enabled;
> > +
> > +	mutex_lock(&sva_lock);
> > +	enabled = master->iopf_enabled;
> > +	mutex_unlock(&sva_lock);
> 
> Forgive me for being dim, but what's the locking synchronising against here?
> If we're expecting that master->iopf_enabled can change at any time, isn't
> whatever we've read potentially already invalid as soon as we've dropped the
> lock?

Right, no reason to lock this. I doubt the lock in sva_enabled() is
necessary either; I could remove it in a separate patch.

> > +static int arm_smmu_page_response(struct device *dev,
> > +				  struct iommu_fault_event *unused,
> > +				  struct iommu_page_response *resp)
> > +{
> > +	struct arm_smmu_cmdq_ent cmd = {0};
> > +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > +	int sid = master->streams[0].id;
> 
> If that's going to be the case, should we explicitly prevent multi-stream
> devices from opting in to faults at all?

Sure, I'll add a check in iopf_supported(). Dealing with multi-stream
devices should be easy enough (record the incoming SID into
iommu_fault_event and fetch it back here); it just didn't seem necessary
for the moment.
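
Something like this, say (sketch only; EVTQ_0_SID and the sid member on
the event structure are hypothetical here):

	/* arm_smmu_handle_evt(): record which stream faulted */
	fault_evt.sid = FIELD_GET(EVTQ_0_SID, evt[0]);

	/* arm_smmu_page_response(): reply on the faulting stream */
	int sid = event ? event->sid : master->streams[0].id;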

> > +	if (evt[1] & EVTQ_1_STALL) {
> > +		flt->type = IOMMU_FAULT_PAGE_REQ;
> > +		flt->prm = (struct iommu_fault_page_request) {
> > +			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
> > +			.grpid = FIELD_GET(EVTQ_1_STAG, evt[1]),
> > +			.perm = perm,
> > +			.addr = FIELD_GET(EVTQ_2_ADDR, evt[2]),
> > +		};
> > +
> > +		if (ssid_valid) {
> > +			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
> > +			flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]);
> > +		}
> 
> So if we get a bad ATS request with R=1, or a TLB/CFG conflict or any other
> imp-def event which happens to have bit 95 set, we might try to report it as
> something pageable? I would have thought we should look at the event code
> before *anything* else.

Yes I definitely need to fix this

> > @@ -2250,6 +2383,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> >   			smmu_domain->s1_cfg.s1cdmax, master->ssid_bits);
> >   		ret = -EINVAL;
> >   		goto out_unlock;
> > +	} else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 &&
> > +		   smmu_domain->stall_enabled != master->stall_enabled) {
> 
> I appreciate that it's probably a fair bit more complex, but it would be
> nice to at least plan for resolving this decision later (i.e. at a point
> where a caller shows an interest in actually using stalls) in future.
> Obviously the first devices advertising stall capabilities will be the ones
> that do want to use it for their primary functionality, that are driving the
> work here. However once this all matures, firmwares may start annotating any
> stallable devices as such for completeness, rather than assuming any
> specific usage. At that point it would be a pain if, say, assigning two
> devices to the same VFIO domain for old-fashioned pinned DMA, was suddenly
> prevented for irrelevant reasons just because of a DT/IORT update.

It is more complex but possible. Device drivers signal their intent to use
stall by enabling IOMMU_DEV_FEAT_IOPF, so we can postpone setting CD.S
until then. We'll still need to make sure all devices attached to a domain
support it, and prevent attaching a device that can't handle stall to a
stall-enabled domain since it would inherit all CDs. Then there will be
drivers wanting to receive stall events for context #0 and handle them by
issuing iommu_map() calls (unpinned VFIO, mentioned by Baolu on patch
3). That requires setting and clearing CD.S live. So it is doable but I'd
rather leave it for later.

> > @@ -2785,6 +2943,7 @@ static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
> >   static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
> >   {
> >   	int ret;
> > +	bool sva = arm_smmu_sva_supported(smmu);
> >   	/* cmdq */
> >   	ret = arm_smmu_init_one_queue(smmu, &smmu->cmdq.q, ARM_SMMU_CMDQ_PROD,
> > @@ -2804,6 +2963,12 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
> >   	if (ret)
> >   		return ret;
> > +	if (sva && smmu->features & ARM_SMMU_FEAT_STALLS) {
> 
> Surely you could just test for ARM_SMMU_FEAT_SVA by now rather than go
> through the whole of arm_smmu_sva_supported() again?

Oh right, that was dumb

Thanks for the review
Jean

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
  2021-01-08 14:52   ` Jean-Philippe Brucker
@ 2021-01-20 20:47     ` Dave Jiang
  -1 siblings, 0 replies; 105+ messages in thread
From: Dave Jiang @ 2021-01-20 20:47 UTC (permalink / raw)
  To: Jean-Philippe Brucker, joro, will
  Cc: vivek.gautam, guohanjun, Zhou Wang, linux-acpi, zhangfei.gao,
	lenb, devicetree, Arnd Bergmann, eric.auger, vdumpa, robh+dt,
	linux-arm-kernel, Greg Kroah-Hartman, rjw,
	shameerali.kolothum.thodi, iommu, sudeep.holla, robin.murphy,
	linux-accelerators, baolu.lu, Dan Williams, Pan, Jacob jun


On 1/8/2021 7:52 AM, Jean-Philippe Brucker wrote:
> The IOPF (I/O Page Fault) feature is now enabled independently from the
> SVA feature, because some IOPF implementations are device-specific and
> do not require IOMMU support for PCIe PRI or Arm SMMU stall.
>
> Enable IOPF unconditionally when enabling SVA for now. In the future, if
> a device driver implementing a uacce interface doesn't need IOPF
> support, it will need to tell the uacce module, for example with a new
> flag.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
> Cc: Zhou Wang <wangzhou1@hisilicon.com>
> ---
>   drivers/misc/uacce/uacce.c | 32 +++++++++++++++++++++++++-------
>   1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index d07af4edfcac..41ef1eb62a14 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -385,6 +385,24 @@ static void uacce_release(struct device *dev)
>   	kfree(uacce);
>   }
>   
> +static unsigned int uacce_enable_sva(struct device *parent, unsigned int flags)
> +{
> +	if (!(flags & UACCE_DEV_SVA))
> +		return flags;
> +
> +	flags &= ~UACCE_DEV_SVA;
> +
> +	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
> +		return flags;
> +
> +	if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
> +		iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
> +		return flags;
> +	}

Sorry to jump in a bit late on this, and this isn't specifically about
the intent of this patch. But I'd like to start a discussion on whether
we want to push the IOMMU dev feature enabling into the device driver
itself rather than having UACCE control it. Maybe allow the device
driver to manage the feature bits and have UACCE only verify that they
are enabled?

 1. The device driver knows what platform it's on and what specific
    feature bits its devices support. Maybe in the future there will be
    feature bits that are needed on one platform and not on another?
 2. This allows the possibility of multiple uacce devices registered to
    one PCI dev, which is a desirable feature for a device with
    asymmetric queues (Intel DSA/idxd driver). The current setup forces
    a single uacce device per pdev. If additional uacce devs are
    registered, the first removal of a uacce device will disable the
    feature bit for the rest of the registered devices. With uacce
    managing the feature bit, uacce would need to add device context to
    the parent pdev plus reference counting. It may be cleaner to just
    let the device driver manage the feature bits, since the driver has
    all the information on when the feature needs to be turned on and
    off (see the sketch below).
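
To sketch what I mean (rough and untested; the probe-path fragment and
the iommu_dev_feature_enabled() verification in uacce_alloc() are just
assumptions for illustration, .name/.ops setup elided):

	/* In the device driver's probe path, before registering: */
	struct uacce_interface uacce_interface = { .flags = 0 };
	struct uacce_device *uacce;
	int rc;

	rc = iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
	if (!rc && iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA)) {
		iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
		rc = -ENODEV;
	}
	if (!rc)
		uacce_interface.flags |= UACCE_DEV_SVA;

	uacce = uacce_alloc(&pdev->dev, &uacce_interface);

	/* And uacce_alloc() would only verify, not enable: */
	if ((flags & UACCE_DEV_SVA) &&
	    !iommu_dev_feature_enabled(parent, IOMMU_DEV_FEAT_SVA))
		flags &= ~UACCE_DEV_SVA;

That way the driver owns the enable/disable lifetime across however many
uacce devices it registers against the same parent, and uacce itself
stays stateless.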

- DaveJ


> +
> +	return flags | UACCE_DEV_SVA;
> +}
> +
>   /**
>    * uacce_alloc() - alloc an accelerator
>    * @parent: pointer of uacce parent device
> @@ -404,11 +422,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
>   	if (!uacce)
>   		return ERR_PTR(-ENOMEM);
>   
> -	if (flags & UACCE_DEV_SVA) {
> -		ret = iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA);
> -		if (ret)
> -			flags &= ~UACCE_DEV_SVA;
> -	}
> +	flags = uacce_enable_sva(parent, flags);
>   
>   	uacce->parent = parent;
>   	uacce->flags = flags;
> @@ -432,8 +446,10 @@ struct uacce_device *uacce_alloc(struct device *parent,
>   	return uacce;
>   
>   err_with_uacce:
> -	if (flags & UACCE_DEV_SVA)
> +	if (flags & UACCE_DEV_SVA) {
>   		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
> +		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
> +	}
>   	kfree(uacce);
>   	return ERR_PTR(ret);
>   }
> @@ -487,8 +503,10 @@ void uacce_remove(struct uacce_device *uacce)
>   	mutex_unlock(&uacce->queues_lock);
>   
>   	/* disable sva now since no opened queues */
> -	if (uacce->flags & UACCE_DEV_SVA)
> +	if (uacce->flags & UACCE_DEV_SVA) {
>   		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
> +		iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
> +	}
>   
>   	if (uacce->cdev)
>   		cdev_device_del(uacce->cdev, &uacce->dev);

^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
  2021-01-20 20:47     ` Dave Jiang
@ 2021-01-22 11:53       ` Zhou Wang
  -1 siblings, 0 replies; 105+ messages in thread
From: Zhou Wang @ 2021-01-22 11:53 UTC (permalink / raw)
  To: Dave Jiang, Jean-Philippe Brucker, joro, will
  Cc: vivek.gautam, guohanjun, linux-acpi, zhangfei.gao, lenb,
	devicetree, Arnd Bergmann, eric.auger, vdumpa, robh+dt,
	linux-arm-kernel, Greg Kroah-Hartman, rjw,
	shameerali.kolothum.thodi, iommu, sudeep.holla, robin.murphy,
	linux-accelerators, baolu.lu, Dan Williams, Pan, Jacob jun

On 2021/1/21 4:47, Dave Jiang wrote:
> 
> On 1/8/2021 7:52 AM, Jean-Philippe Brucker wrote:
>> The IOPF (I/O Page Fault) feature is now enabled independently from the
>> SVA feature, because some IOPF implementations are device-specific and
>> do not require IOMMU support for PCIe PRI or Arm SMMU stall.
>>
>> Enable IOPF unconditionally when enabling SVA for now. In the future, if
>> a device driver implementing a uacce interface doesn't need IOPF
>> support, it will need to tell the uacce module, for example with a new
>> flag.
>>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> ---
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
>> Cc: Zhou Wang <wangzhou1@hisilicon.com>
>> ---
>>   drivers/misc/uacce/uacce.c | 32 +++++++++++++++++++++++++-------
>>   1 file changed, 25 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>> index d07af4edfcac..41ef1eb62a14 100644
>> --- a/drivers/misc/uacce/uacce.c
>> +++ b/drivers/misc/uacce/uacce.c
>> @@ -385,6 +385,24 @@ static void uacce_release(struct device *dev)
>>       kfree(uacce);
>>   }
>>   +static unsigned int uacce_enable_sva(struct device *parent, unsigned int flags)
>> +{
>> +    if (!(flags & UACCE_DEV_SVA))
>> +        return flags;
>> +
>> +    flags &= ~UACCE_DEV_SVA;
>> +
>> +    if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
>> +        return flags;
>> +
>> +    if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
>> +        iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
>> +        return flags;
>> +    }
> 
> Sorry to jump in a bit late on this and not specifically towards the
> intent of this patch. But I'd like to start a discussion on if we want
> to push the iommu dev feature enabling to the device driver itself rather
> than having UACCE control this? Maybe allow the device driver to manage
> the feature bits and UACCE only verify that they are enabled?
> 
> 1. The device driver knows what platform it's on and what specific
>    feature bits its devices supports. Maybe in the future if there are
>    feature bits that's needed on one platform and not on another?

Hi Dave,

From the discussion in this series, the meaning of IOMMU_DEV_FEAT_IOPF here
is the IOPF capability of the IOMMU device itself. So I think checking it in
UACCE will be fine.

> 2. This allows the possibility of multiple uacce device registered to 1
>    pci dev, which for a device with asymmetric queues (Intel DSA/idxd
>    driver) that is desirable feature. The current setup forces a single
>    uacce device per pdev. If additional uacce devs are registered, the
>    first removal of uacce device will disable the feature bit for the
>    rest of the registered devices. With uacce managing the feature bit,
>    it would need to add device context to the parent pdev and ref
>    counting. It may be cleaner to just allow device driver to manage
>    the feature bits and the driver should have all the information on
>    when the feature needs to be turned on and off.

Yes, we have this problem; however, it exists for IOMMU_DEV_FEAT_SVA
too. How about fixing it in a separate patch?
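
For example, such a patch could refcount the feature per parent device,
roughly like this (hypothetical sketch only, not part of this series;
the uacce_sva_ref structure and helpers are made up, and only SVA is
shown, IOPF would be handled the same way):

	/* uacce.c: hypothetical per-parent refcount for FEAT_SVA users */
	struct uacce_sva_ref {
		struct list_head node;
		struct device *parent;
		refcount_t users;
	};

	static LIST_HEAD(uacce_sva_refs);
	static DEFINE_MUTEX(uacce_sva_refs_lock);

	static int uacce_sva_get(struct device *parent)
	{
		struct uacce_sva_ref *ref;
		int ret = -ENOMEM;

		mutex_lock(&uacce_sva_refs_lock);
		list_for_each_entry(ref, &uacce_sva_refs, node) {
			if (ref->parent == parent) {
				/* Already enabled for this parent */
				refcount_inc(&ref->users);
				mutex_unlock(&uacce_sva_refs_lock);
				return 0;
			}
		}

		ref = kzalloc(sizeof(*ref), GFP_KERNEL);
		if (ref) {
			ret = iommu_dev_enable_feature(parent,
						       IOMMU_DEV_FEAT_SVA);
			if (ret) {
				kfree(ref);
			} else {
				ref->parent = parent;
				refcount_set(&ref->users, 1);
				list_add(&ref->node, &uacce_sva_refs);
			}
		}
		mutex_unlock(&uacce_sva_refs_lock);
		return ret;
	}

	static void uacce_sva_put(struct device *parent)
	{
		struct uacce_sva_ref *ref;

		mutex_lock(&uacce_sva_refs_lock);
		list_for_each_entry(ref, &uacce_sva_refs, node) {
			if (ref->parent != parent)
				continue;
			if (refcount_dec_and_test(&ref->users)) {
				/* Last user: disable for real */
				iommu_dev_disable_feature(parent,
							  IOMMU_DEV_FEAT_SVA);
				list_del(&ref->node);
				kfree(ref);
			}
			break;
		}
		mutex_unlock(&uacce_sva_refs_lock);
	}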

Best,
Zhou

> 
> - DaveJ
> 
> 
>> +
>> +    return flags | UACCE_DEV_SVA;
>> +}
>> +
>>   /**
>>    * uacce_alloc() - alloc an accelerator
>>    * @parent: pointer of uacce parent device
>> @@ -404,11 +422,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
>>       if (!uacce)
>>           return ERR_PTR(-ENOMEM);
>>   -    if (flags & UACCE_DEV_SVA) {
>> -        ret = iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA);
>> -        if (ret)
>> -            flags &= ~UACCE_DEV_SVA;
>> -    }
>> +    flags = uacce_enable_sva(parent, flags);
>>         uacce->parent = parent;
>>       uacce->flags = flags;
>> @@ -432,8 +446,10 @@ struct uacce_device *uacce_alloc(struct device *parent,
>>       return uacce;
>>     err_with_uacce:
>> -    if (flags & UACCE_DEV_SVA)
>> +    if (flags & UACCE_DEV_SVA) {
>>           iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
>> +        iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
>> +    }
>>       kfree(uacce);
>>       return ERR_PTR(ret);
>>   }
>> @@ -487,8 +503,10 @@ void uacce_remove(struct uacce_device *uacce)
>>       mutex_unlock(&uacce->queues_lock);
>>         /* disable sva now since no opened queues */
>> -    if (uacce->flags & UACCE_DEV_SVA)
>> +    if (uacce->flags & UACCE_DEV_SVA) {
>>           iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
>> +        iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
>> +    }
>>         if (uacce->cdev)
>>           cdev_device_del(uacce->cdev, &uacce->dev);
> 
> .
> 


^ permalink raw reply	[flat|nested] 105+ messages in thread

* Re: [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF
  2021-01-22 11:53       ` Zhou Wang
@ 2021-01-22 15:43         ` Dave Jiang
  -1 siblings, 0 replies; 105+ messages in thread
From: Dave Jiang @ 2021-01-22 15:43 UTC (permalink / raw)
  To: Zhou Wang, Jean-Philippe Brucker, joro, will
  Cc: vivek.gautam, guohanjun, linux-acpi, zhangfei.gao, lenb,
	devicetree, Arnd Bergmann, eric.auger, vdumpa, robh+dt,
	linux-arm-kernel, Greg Kroah-Hartman, rjw,
	shameerali.kolothum.thodi, iommu, sudeep.holla, robin.murphy,
	linux-accelerators, baolu.lu, Dan Williams, Pan, Jacob jun


On 1/22/2021 4:53 AM, Zhou Wang wrote:
> On 2021/1/21 4:47, Dave Jiang wrote:
>> On 1/8/2021 7:52 AM, Jean-Philippe Brucker wrote:
>>> The IOPF (I/O Page Fault) feature is now enabled independently from the
>>> SVA feature, because some IOPF implementations are device-specific and
>>> do not require IOMMU support for PCIe PRI or Arm SMMU stall.
>>>
>>> Enable IOPF unconditionally when enabling SVA for now. In the future, if
>>> a device driver implementing a uacce interface doesn't need IOPF
>>> support, it will need to tell the uacce module, for example with a new
>>> flag.
>>>
>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>> ---
>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>> Cc: Zhangfei Gao <zhangfei.gao@linaro.org>
>>> Cc: Zhou Wang <wangzhou1@hisilicon.com>
>>> ---
>>>    drivers/misc/uacce/uacce.c | 32 +++++++++++++++++++++++++-------
>>>    1 file changed, 25 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>>> index d07af4edfcac..41ef1eb62a14 100644
>>> --- a/drivers/misc/uacce/uacce.c
>>> +++ b/drivers/misc/uacce/uacce.c
>>> @@ -385,6 +385,24 @@ static void uacce_release(struct device *dev)
>>>        kfree(uacce);
>>>    }
>>>    +static unsigned int uacce_enable_sva(struct device *parent, unsigned int flags)
>>> +{
>>> +    if (!(flags & UACCE_DEV_SVA))
>>> +        return flags;
>>> +
>>> +    flags &= ~UACCE_DEV_SVA;
>>> +
>>> +    if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_IOPF))
>>> +        return flags;
>>> +
>>> +    if (iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA)) {
>>> +        iommu_dev_disable_feature(parent, IOMMU_DEV_FEAT_IOPF);
>>> +        return flags;
>>> +    }
>> Sorry to jump in a bit late on this and not specifically towards the
>> intent of this patch. But I'd like to start a discussion on if we want
>> to push the iommu dev feature enabling to the device driver itself rather
>> than having UACCE control this? Maybe allow the device driver to manage
>> the feature bits and UACCE only verify that they are enabled?
>>
>> 1. The device driver knows what platform it's on and what specific
>>     feature bits its devices supports. Maybe in the future if there are
>>     feature bits that's needed on one platform and not on another?
> Hi Dave,
>
>  From the discussion in this series, the meaning of IOMMU_DEV_FEAT_IOPF here
> is the IOPF capability of iommu device itself. So I think check it in UACCE
> will be fine.
>
>> 2. This allows the possibility of multiple uacce device registered to 1
>>     pci dev, which for a device with asymmetric queues (Intel DSA/idxd
>>     driver) that is desirable feature. The current setup forces a single
>>     uacce device per pdev. If additional uacce devs are registered, the
>>     first removal of uacce device will disable the feature bit for the
>>     rest of the registered devices. With uacce managing the feature bit,
>>     it would need to add device context to the parent pdev and ref
>>     counting. It may be cleaner to just allow device driver to manage
>>     the feature bits and the driver should have all the information on
>>     when the feature needs to be turned on and off.
> Yes, we have this problem, however, this problem exists for IOMMU_DEV_FEAT_SVA
> too. How about to fix it in another patch?

Hi Zhou,

Right, that's what I'm implying. I'm not pushing back on the IOPF feature
set; I'm just trying to survey people's opinions on moving the feature
setting into the actual drivers rather than having it in UACCE. I will
create some patches to show what I mean and gather comments.


>
> Best,
> Zhou
>
>> - DaveJ
>>
>>
>>> +
>>> +    return flags | UACCE_DEV_SVA;
>>> +}
>>> +
>>>    /**
>>>     * uacce_alloc() - alloc an accelerator
>>>     * @parent: pointer of uacce parent device
>>> @@ -404,11 +422,7 @@ struct uacce_device *uacce_alloc(struct device *parent,
>>>        if (!uacce)
>>>            return ERR_PTR(-ENOMEM);
>>>    -    if (flags & UACCE_DEV_SVA) {
>>> -        ret = iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA);
>>> -        if (ret)
>>> -            flags &= ~UACCE_DEV_SVA;
>>> -    }
>>> +    flags = uacce_enable_sva(parent, flags);
>>>          uacce->parent = parent;
>>>        uacce->flags = flags;
>>> @@ -432,8 +446,10 @@ struct uacce_device *uacce_alloc(struct device *parent,
>>>        return uacce;
>>>      err_with_uacce:
>>> -    if (flags & UACCE_DEV_SVA)
>>> +    if (flags & UACCE_DEV_SVA) {
>>>            iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
>>> +        iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
>>> +    }
>>>        kfree(uacce);
>>>        return ERR_PTR(ret);
>>>    }
>>> @@ -487,8 +503,10 @@ void uacce_remove(struct uacce_device *uacce)
>>>        mutex_unlock(&uacce->queues_lock);
>>>          /* disable sva now since no opened queues */
>>> -    if (uacce->flags & UACCE_DEV_SVA)
>>> +    if (uacce->flags & UACCE_DEV_SVA) {
>>>            iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_SVA);
>>> +        iommu_dev_disable_feature(uacce->parent, IOMMU_DEV_FEAT_IOPF);
>>> +    }
>>>          if (uacce->cdev)
>>>            cdev_device_del(uacce->cdev, &uacce->dev);
>> .
>>

^ permalink raw reply	[flat|nested] 105+ messages in thread


end of thread, other threads:[~2021-01-22 15:45 UTC | newest]

Thread overview: 105+ messages (download: mbox.gz / follow: Atom feed)
2021-01-08 14:52 [PATCH v9 00/10] iommu: I/O page faults for SMMUv3 Jean-Philippe Brucker
2021-01-08 14:52 ` [PATCH v9 01/10] iommu: Remove obsolete comment Jean-Philippe Brucker
2021-01-19 11:11   ` Jonathan Cameron
2021-01-20 17:41     ` Jean-Philippe Brucker
2021-01-08 14:52 ` [PATCH v9 02/10] iommu/arm-smmu-v3: Use device properties for pasid-num-bits Jean-Philippe Brucker
2021-01-19 11:22   ` Jonathan Cameron
2021-01-08 14:52 ` [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA Jean-Philippe Brucker
2021-01-12  4:31   ` Lu Baolu
2021-01-12  9:16     ` Jean-Philippe Brucker
2021-01-13  2:49       ` Lu Baolu
2021-01-13  8:10         ` Tian, Kevin
2021-01-14 16:41           ` Jean-Philippe Brucker
2021-01-16  3:54             ` Lu Baolu
2021-01-18  6:54               ` Tian, Kevin
2021-01-19 10:16                 ` Jean-Philippe Brucker
2021-01-20  1:57                   ` Lu Baolu
2021-01-08 14:52 ` [PATCH v9 04/10] iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF Jean-Philippe Brucker
2021-01-08 14:52 ` [PATCH v9 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF Jean-Philippe Brucker
2021-01-11  3:29   ` Zhangfei Gao
2021-01-19 12:27   ` Jonathan Cameron
2021-01-20 17:42     ` Jean-Philippe Brucker
2021-01-20 20:47   ` Dave Jiang
2021-01-22 11:53     ` Zhou Wang
2021-01-22 15:43       ` Dave Jiang
2021-01-08 14:52 ` [PATCH v9 06/10] iommu: Add a page fault handler Jean-Philippe Brucker
2021-01-19 13:38   ` Jonathan Cameron
2021-01-20 17:43     ` Jean-Philippe Brucker
2021-01-08 14:52 ` [PATCH v9 07/10] iommu/arm-smmu-v3: Maintain a SID->device structure Jean-Philippe Brucker
2021-01-19 13:51   ` Jonathan Cameron
2021-01-08 14:52 ` [PATCH v9 08/10] dt-bindings: document stall property for IOMMU masters Jean-Philippe Brucker
2021-01-08 14:52 ` [PATCH v9 09/10] ACPI/IORT: Enable stall support for platform devices Jean-Philippe Brucker
2021-01-08 14:52 ` [PATCH v9 10/10] iommu/arm-smmu-v3: Add " Jean-Philippe Brucker
2021-01-19 17:28   ` Robin Murphy
2021-01-20 17:55     ` Jean-Philippe Brucker
2021-01-11  3:26 ` [PATCH v9 00/10] iommu: I/O page faults for SMMUv3 Zhangfei Gao
