devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v10 0/5] NVIDIA ARM SMMU Implementation
@ 2020-07-08  5:00 Krishna Reddy
  2020-07-08  5:00 ` [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros Krishna Reddy
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-08  5:00 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka, Krishna Reddy

Changes in v10:
Perform SMMU base ioremap before calling implementation init.
Check for Global faults across both ARM MMU-500s during global interrupt.
Check for context faults across all contexts of both ARM MMU-500s during context fault interrupt.
Add new DT binding nvidia,smmu-500 for NVIDIA implementation.

v9 - https://lkml.org/lkml/2020/6/30/1282
v8 - https://lkml.org/lkml/2020/6/29/2385
v7 - https://lkml.org/lkml/2020/6/28/347
v6 - https://lkml.org/lkml/2020/6/4/1018
v5 - https://lkml.org/lkml/2020/5/21/1114
v4 - https://lkml.org/lkml/2019/10/30/1054
v3 - https://lkml.org/lkml/2019/10/18/1601
v2 - https://lkml.org/lkml/2019/9/2/980
v1 - https://lkml.org/lkml/2019/8/29/1588


Krishna Reddy (5):
  iommu/arm-smmu: move TLB timeout and spin count macros
  iommu/arm-smmu: ioremap smmu mmio region before implementation init
  iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage
  dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  iommu/arm-smmu: Add global/context fault implementation hooks

 .../devicetree/bindings/iommu/arm,smmu.yaml   |  18 ++
 MAINTAINERS                                   |   2 +
 drivers/iommu/Makefile                        |   2 +-
 drivers/iommu/arm-smmu-impl.c                 |   3 +
 drivers/iommu/arm-smmu-nvidia.c               | 278 ++++++++++++++++++
 drivers/iommu/arm-smmu.c                      |  29 +-
 drivers/iommu/arm-smmu.h                      |   6 +
 7 files changed, 328 insertions(+), 10 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-nvidia.c


base-commit: e5640f21b63d2a5d3e4e0c4111b2b38e99fe5164
-- 
2.26.2


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros
  2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
@ 2020-07-08  5:00 ` Krishna Reddy
  2020-07-08 12:05   ` Jon Hunter
  2020-07-08 20:37   ` Nicolin Chen
  2020-07-08  5:00 ` [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init Krishna Reddy
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-08  5:00 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka, Krishna Reddy

Move TLB timeout and spin count macros to header file to
allow using the same from vendor specific implementations.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 drivers/iommu/arm-smmu.c | 3 ---
 drivers/iommu/arm-smmu.h | 2 ++
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 243bc4cb2705..d2054178df35 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -52,9 +52,6 @@
  */
 #define QCOM_DUMMY_VAL -1
 
-#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
-#define TLB_SPIN_COUNT			10
-
 #define MSI_IOVA_BASE			0x8000000
 #define MSI_IOVA_LENGTH			0x100000
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index d172c024be61..c7d0122a7c6c 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -236,6 +236,8 @@ enum arm_smmu_cbar_type {
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS		128
 
+#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
+#define TLB_SPIN_COUNT			10
 
 /* Shared driver definitions */
 enum arm_smmu_arch_version {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init
  2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
  2020-07-08  5:00 ` [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros Krishna Reddy
@ 2020-07-08  5:00 ` Krishna Reddy
  2020-07-08 12:05   ` Jon Hunter
                     ` (2 more replies)
  2020-07-08  5:00 ` [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage Krishna Reddy
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-08  5:00 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka, Krishna Reddy

ioremap smmu mmio region before calling into implementation init.
This is necessary to allow mapped address available during vendor
specific implementation init.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 drivers/iommu/arm-smmu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d2054178df35..e03e873d3bca 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2120,10 +2120,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	if (err)
 		return err;
 
-	smmu = arm_smmu_impl_init(smmu);
-	if (IS_ERR(smmu))
-		return PTR_ERR(smmu);
-
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	ioaddr = res->start;
 	smmu->base = devm_ioremap_resource(dev, res);
@@ -2135,6 +2131,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	 */
 	smmu->numpage = resource_size(res);
 
+	smmu = arm_smmu_impl_init(smmu);
+	if (IS_ERR(smmu))
+		return PTR_ERR(smmu);
+
 	num_irqs = 0;
 	while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
 		num_irqs++;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage
  2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
  2020-07-08  5:00 ` [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros Krishna Reddy
  2020-07-08  5:00 ` [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init Krishna Reddy
@ 2020-07-08  5:00 ` Krishna Reddy
  2020-07-08 12:24   ` Jon Hunter
  2020-07-08 20:36   ` Nicolin Chen
  2020-07-08  5:00 ` [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU Krishna Reddy
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-08  5:00 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka, Krishna Reddy

NVIDIA's Tegra194 SoC has three ARM MMU-500 instances.
It uses two of the ARM MMU-500s together to interleave IOVA
accesses across them and must be programmed identically.
This implementation supports programming the two ARM MMU-500s
that must be programmed identically.

The third ARM MMU-500 instance is supported by standard
arm-smmu.c driver itself.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 MAINTAINERS                     |   2 +
 drivers/iommu/Makefile          |   2 +-
 drivers/iommu/arm-smmu-impl.c   |   3 +
 drivers/iommu/arm-smmu-nvidia.c | 179 ++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu.c        |   1 +
 drivers/iommu/arm-smmu.h        |   1 +
 6 files changed, 187 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/arm-smmu-nvidia.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c23352059a6b..534cedaf8e55 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16811,8 +16811,10 @@ F:	drivers/i2c/busses/i2c-tegra.c
 
 TEGRA IOMMU DRIVERS
 M:	Thierry Reding <thierry.reding@gmail.com>
+R:	Krishna Reddy <vdumpa@nvidia.com>
 L:	linux-tegra@vger.kernel.org
 S:	Supported
+F:	drivers/iommu/arm-smmu-nvidia.c
 F:	drivers/iommu/tegra*
 
 TEGRA KBC DRIVER
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 342190196dfb..2b8203db73ec 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -15,7 +15,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
-arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
+arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o arm-smmu-qcom.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += intel/dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel/iommu.o intel/pasid.o
diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index c75b9d957b70..f15571d05474 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -171,6 +171,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 	if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
 		smmu->impl = &calxeda_impl;
 
+	if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
+		return nvidia_smmu_impl_init(smmu);
+
 	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
 	    of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
 		return qcom_smmu_impl_init(smmu);
diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
new file mode 100644
index 000000000000..2f55e5793d34
--- /dev/null
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2019-2020 NVIDIA CORPORATION.  All rights reserved.
+
+#include <linux/bitfield.h>
+#include <linux/delay.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include "arm-smmu.h"
+
+/*
+ * Tegra194 has three ARM MMU-500 Instances.
+ * Two of them are used together and must be programmed identically for
+ * interleaved IOVA accesses across them and translates accesses from
+ * non-isochronous HW devices.
+ * Third one is used for translating accesses from isochronous HW devices.
+ * This implementation supports programming of the two instances that must
+ * be programmed identically.
+ * The third instance usage is through standard arm-smmu driver itself and
+ * is out of scope of this implementation.
+ */
+#define NUM_SMMU_INSTANCES 2
+
+struct nvidia_smmu {
+	struct arm_smmu_device	smmu;
+	void __iomem		*bases[NUM_SMMU_INSTANCES];
+};
+
+static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
+					     unsigned int inst, int page)
+{
+	struct nvidia_smmu *nvidia_smmu;
+
+	nvidia_smmu = container_of(smmu, struct nvidia_smmu, smmu);
+	return nvidia_smmu->bases[inst] + (page << smmu->pgshift);
+}
+
+static u32 nvidia_smmu_read_reg(struct arm_smmu_device *smmu,
+				int page, int offset)
+{
+	void __iomem *reg = nvidia_smmu_page(smmu, 0, page) + offset;
+
+	return readl_relaxed(reg);
+}
+
+static void nvidia_smmu_write_reg(struct arm_smmu_device *smmu,
+				  int page, int offset, u32 val)
+{
+	unsigned int i;
+
+	for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+		void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
+
+		writel_relaxed(val, reg);
+	}
+}
+
+static u64 nvidia_smmu_read_reg64(struct arm_smmu_device *smmu,
+				  int page, int offset)
+{
+	void __iomem *reg = nvidia_smmu_page(smmu, 0, page) + offset;
+
+	return readq_relaxed(reg);
+}
+
+static void nvidia_smmu_write_reg64(struct arm_smmu_device *smmu,
+				    int page, int offset, u64 val)
+{
+	unsigned int i;
+
+	for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+		void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
+
+		writeq_relaxed(val, reg);
+	}
+}
+
+static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
+				 int sync, int status)
+{
+	unsigned int delay;
+
+	arm_smmu_writel(smmu, page, sync, 0);
+
+	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
+		unsigned int spin_cnt;
+
+		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
+			u32 val = 0;
+			unsigned int i;
+
+			for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+				void __iomem *reg;
+
+				reg = nvidia_smmu_page(smmu, i, page) + status;
+				val |= readl_relaxed(reg);
+			}
+
+			if (!(val & ARM_SMMU_sTLBGSTATUS_GSACTIVE))
+				return;
+
+			cpu_relax();
+		}
+
+		udelay(delay);
+	}
+
+	dev_err_ratelimited(smmu->dev,
+			    "TLB sync timed out -- SMMU may be deadlocked\n");
+}
+
+static int nvidia_smmu_reset(struct arm_smmu_device *smmu)
+{
+	unsigned int i;
+
+	for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+		u32 val;
+		void __iomem *reg = nvidia_smmu_page(smmu, i, ARM_SMMU_GR0) +
+				    ARM_SMMU_GR0_sGFSR;
+
+		/* clear global FSR */
+		val = readl_relaxed(reg);
+		writel_relaxed(val, reg);
+	}
+
+	return 0;
+}
+
+static const struct arm_smmu_impl nvidia_smmu_impl = {
+	.read_reg = nvidia_smmu_read_reg,
+	.write_reg = nvidia_smmu_write_reg,
+	.read_reg64 = nvidia_smmu_read_reg64,
+	.write_reg64 = nvidia_smmu_write_reg64,
+	.reset = nvidia_smmu_reset,
+	.tlb_sync = nvidia_smmu_tlb_sync,
+};
+
+struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+	struct resource *res;
+	struct device *dev = smmu->dev;
+	struct nvidia_smmu *nvidia_smmu;
+	struct platform_device *pdev = to_platform_device(dev);
+
+	nvidia_smmu = devm_kzalloc(dev, sizeof(*nvidia_smmu), GFP_KERNEL);
+	if (!nvidia_smmu)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Copy the data from struct arm_smmu_device *smmu allocated in
+	 * arm-smmu.c. The smmu from struct nvidia_smmu replaces the smmu
+	 * pointer used in arm-smmu.c once this function returns.
+	 * This is necessary to derive nvidia_smmu from smmu pointer passed
+	 * through arm_smmu_impl function calls subsequently.
+	 */
+	nvidia_smmu->smmu = *smmu;
+	/* Instance 0 is ioremapped by arm-smmu.c. */
+	nvidia_smmu->bases[0] = smmu->base;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+	if (!res)
+		return ERR_PTR(-ENODEV);
+
+	nvidia_smmu->bases[1] = devm_ioremap_resource(dev, res);
+	if (IS_ERR(nvidia_smmu->bases[1]))
+		return ERR_CAST(nvidia_smmu->bases[1]);
+
+	nvidia_smmu->smmu.impl = &nvidia_smmu_impl;
+
+	/*
+	 * Free the struct arm_smmu_device *smmu allocated in arm-smmu.c.
+	 * Once this function returns, arm-smmu.c would use arm_smmu_device
+	 * allocated as part of struct nvidia_smmu.
+	 */
+	devm_kfree(dev, smmu);
+
+	return &nvidia_smmu->smmu;
+}
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e03e873d3bca..c123a5814f70 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1943,6 +1943,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
 	{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
 	{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+	{ .compatible = "nvidia,smmu-500", .data = &arm_mmu500 },
 	{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
 	{ },
 };
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index c7d0122a7c6c..fad63efa1a72 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -452,6 +452,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device *smmu, int page,
 	arm_smmu_writeq((s), ARM_SMMU_CB((s), (n)), (o), (v))
 
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
+struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
 struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
 
 int arm_mmu500_reset(struct arm_smmu_device *smmu);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
                   ` (2 preceding siblings ...)
  2020-07-08  5:00 ` [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage Krishna Reddy
@ 2020-07-08  5:00 ` Krishna Reddy
  2020-07-08 12:25   ` Jon Hunter
  2020-07-09 20:13   ` Rob Herring
  2020-07-08  5:00 ` [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
  2020-07-13 13:50 ` [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Will Deacon
  5 siblings, 2 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-08  5:00 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka, Krishna Reddy

Add binding for NVIDIA's Tegra194 SoC SMMU.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 .../devicetree/bindings/iommu/arm,smmu.yaml    | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index d7ceb4c34423..ac1f526c3424 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -38,6 +38,11 @@ properties:
               - qcom,sc7180-smmu-500
               - qcom,sdm845-smmu-500
           - const: arm,mmu-500
+      - description: NVIDIA SoCs that program two ARM MMU-500s identically
+        items:
+          - enum:
+              - nvidia,tegra194-smmu
+          - const: nvidia,smmu-500
       - items:
           - const: arm,mmu-500
           - const: arm,smmu-v2
@@ -138,6 +143,19 @@ required:
 
 additionalProperties: false
 
+allOf:
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - nvidia,tegra194-smmu
+    then:
+      properties:
+        reg:
+          minItems: 2
+          maxItems: 2
+
 examples:
   - |+
     /* SMMU with stream matching or stream indexing */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks
  2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
                   ` (3 preceding siblings ...)
  2020-07-08  5:00 ` [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU Krishna Reddy
@ 2020-07-08  5:00 ` Krishna Reddy
  2020-07-08 12:28   ` Jon Hunter
                     ` (2 more replies)
  2020-07-13 13:50 ` [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Will Deacon
  5 siblings, 3 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-08  5:00 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka, Krishna Reddy

Add global/context fault hooks to allow vendor specific implementations
override default fault interrupt handlers.

Update NVIDIA implementation to override the default global/context fault
interrupt handlers and handle interrupts across the two ARM MMU-500s that
are programmed identically.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 drivers/iommu/arm-smmu-nvidia.c | 99 +++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu.c        | 17 +++++-
 drivers/iommu/arm-smmu.h        |  3 +
 3 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
index 2f55e5793d34..31368057e9be 100644
--- a/drivers/iommu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -127,6 +127,103 @@ static int nvidia_smmu_reset(struct arm_smmu_device *smmu)
 	return 0;
 }
 
+static irqreturn_t nvidia_smmu_global_fault_inst(int irq,
+						 struct arm_smmu_device *smmu,
+						 int inst)
+{
+	u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
+	void __iomem *gr0_base = nvidia_smmu_page(smmu, inst, 0);
+
+	gfsr = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSR);
+	if (!gfsr)
+		return IRQ_NONE;
+
+	gfsynr0 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR0);
+	gfsynr1 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR1);
+	gfsynr2 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR2);
+
+	dev_err_ratelimited(smmu->dev,
+			    "Unexpected global fault, this could be serious\n");
+	dev_err_ratelimited(smmu->dev,
+			    "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
+			    gfsr, gfsynr0, gfsynr1, gfsynr2);
+
+	writel_relaxed(gfsr, gr0_base + ARM_SMMU_GR0_sGFSR);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t nvidia_smmu_global_fault(int irq, void *dev)
+{
+	unsigned int inst;
+	irqreturn_t ret = IRQ_NONE;
+	struct arm_smmu_device *smmu = dev;
+
+	for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
+		irqreturn_t irq_ret;
+
+		irq_ret = nvidia_smmu_global_fault_inst(irq, smmu, inst);
+		if (irq_ret == IRQ_HANDLED)
+			ret = IRQ_HANDLED;
+	}
+
+	return ret;
+}
+
+static irqreturn_t nvidia_smmu_context_fault_bank(int irq,
+						  struct arm_smmu_device *smmu,
+						  int idx, int inst)
+{
+	u32 fsr, fsynr, cbfrsynra;
+	unsigned long iova;
+	void __iomem *gr1_base = nvidia_smmu_page(smmu, inst, 1);
+	void __iomem *cb_base = nvidia_smmu_page(smmu, inst, smmu->numpage + idx);
+
+	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
+	if (!(fsr & ARM_SMMU_FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
+	iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+	cbfrsynra = readl_relaxed(gr1_base + ARM_SMMU_GR1_CBFRSYNRA(idx));
+
+	dev_err_ratelimited(smmu->dev,
+			    "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, cbfrsynra, idx);
+
+	writel_relaxed(fsr, cb_base + ARM_SMMU_CB_FSR);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t nvidia_smmu_context_fault(int irq, void *dev)
+{
+	int idx;
+	unsigned int inst;
+	irqreturn_t ret = IRQ_NONE;
+	struct arm_smmu_device *smmu;
+	struct iommu_domain *domain = dev;
+	struct arm_smmu_domain *smmu_domain;
+
+	smmu_domain = container_of(domain, struct arm_smmu_domain, domain);
+	smmu = smmu_domain->smmu;
+
+	for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
+		irqreturn_t irq_ret;
+
+		/*
+		 * Interrupt line is shared between all contexts.
+		 * Check for faults across all contexts.
+		 */
+		for (idx = 0; idx < smmu->num_context_banks; idx++) {
+			irq_ret = nvidia_smmu_context_fault_bank(irq, smmu,
+								 idx, inst);
+			if (irq_ret == IRQ_HANDLED)
+				ret = IRQ_HANDLED;
+		}
+	}
+
+	return ret;
+}
+
 static const struct arm_smmu_impl nvidia_smmu_impl = {
 	.read_reg = nvidia_smmu_read_reg,
 	.write_reg = nvidia_smmu_write_reg,
@@ -134,6 +231,8 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
 	.write_reg64 = nvidia_smmu_write_reg64,
 	.reset = nvidia_smmu_reset,
 	.tlb_sync = nvidia_smmu_tlb_sync,
+	.global_fault = nvidia_smmu_global_fault,
+	.context_fault = nvidia_smmu_context_fault,
 };
 
 struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c123a5814f70..020afddfaa0f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -670,6 +670,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	irqreturn_t (*context_fault)(int irq, void *dev);
 
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
@@ -832,7 +833,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 * handler seeing a half-initialised domain state.
 	 */
 	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
-	ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
+
+	if (smmu->impl && smmu->impl->context_fault)
+		context_fault = smmu->impl->context_fault;
+	else
+		context_fault = arm_smmu_context_fault;
+
+	ret = devm_request_irq(smmu->dev, irq, context_fault,
 			       IRQF_SHARED, "arm-smmu-context-fault", domain);
 	if (ret < 0) {
 		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
@@ -2105,6 +2112,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
 	int num_irqs, i, err;
+	irqreturn_t (*global_fault)(int irq, void *dev);
 
 	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
 	if (!smmu) {
@@ -2191,9 +2199,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 		smmu->num_context_irqs = smmu->num_context_banks;
 	}
 
+	if (smmu->impl && smmu->impl->global_fault)
+		global_fault = smmu->impl->global_fault;
+	else
+		global_fault = arm_smmu_global_fault;
+
 	for (i = 0; i < smmu->num_global_irqs; ++i) {
 		err = devm_request_irq(smmu->dev, smmu->irqs[i],
-				       arm_smmu_global_fault,
+				       global_fault,
 				       IRQF_SHARED,
 				       "arm-smmu global fault",
 				       smmu);
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index fad63efa1a72..d890a4a968e8 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -18,6 +18,7 @@
 #include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/io-pgtable.h>
 #include <linux/iommu.h>
+#include <linux/irqreturn.h>
 #include <linux/mutex.h>
 #include <linux/spinlock.h>
 #include <linux/types.h>
@@ -389,6 +390,8 @@ struct arm_smmu_impl {
 	void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
 			 int status);
 	int (*def_domain_type)(struct device *dev);
+	irqreturn_t (*global_fault)(int irq, void *dev);
+	irqreturn_t (*context_fault)(int irq, void *dev);
 };
 
 static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros
  2020-07-08  5:00 ` [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros Krishna Reddy
@ 2020-07-08 12:05   ` Jon Hunter
  2020-07-08 20:37   ` Nicolin Chen
  1 sibling, 0 replies; 25+ messages in thread
From: Jon Hunter @ 2020-07-08 12:05 UTC (permalink / raw)
  To: Krishna Reddy, joro, will, robin.murphy, robh+dt, treding
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka


On 08/07/2020 06:00, Krishna Reddy wrote:
> Move TLB timeout and spin count macros to header file to
> allow using the same from vendor specific implementations.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  drivers/iommu/arm-smmu.c | 3 ---
>  drivers/iommu/arm-smmu.h | 2 ++
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 243bc4cb2705..d2054178df35 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -52,9 +52,6 @@
>   */
>  #define QCOM_DUMMY_VAL -1
>  
> -#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
> -#define TLB_SPIN_COUNT			10
> -
>  #define MSI_IOVA_BASE			0x8000000
>  #define MSI_IOVA_LENGTH			0x100000
>  
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index d172c024be61..c7d0122a7c6c 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -236,6 +236,8 @@ enum arm_smmu_cbar_type {
>  /* Maximum number of context banks per SMMU */
>  #define ARM_SMMU_MAX_CBS		128
>  
> +#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
> +#define TLB_SPIN_COUNT			10
>  
>  /* Shared driver definitions */
>  enum arm_smmu_arch_version {
> 


Reviewed-by: Jon Hunter <jonathanh@nvidia.com>

Thanks!
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init
  2020-07-08  5:00 ` [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init Krishna Reddy
@ 2020-07-08 12:05   ` Jon Hunter
  2020-07-08 20:37   ` Nicolin Chen
  2020-07-13 14:02   ` Robin Murphy
  2 siblings, 0 replies; 25+ messages in thread
From: Jon Hunter @ 2020-07-08 12:05 UTC (permalink / raw)
  To: Krishna Reddy, joro, will, robin.murphy, robh+dt, treding
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka


On 08/07/2020 06:00, Krishna Reddy wrote:
> ioremap smmu mmio region before calling into implementation init.
> This is necessary to allow mapped address available during vendor
> specific implementation init.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  drivers/iommu/arm-smmu.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index d2054178df35..e03e873d3bca 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2120,10 +2120,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	if (err)
>  		return err;
>  
> -	smmu = arm_smmu_impl_init(smmu);
> -	if (IS_ERR(smmu))
> -		return PTR_ERR(smmu);
> -
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>  	ioaddr = res->start;
>  	smmu->base = devm_ioremap_resource(dev, res);
> @@ -2135,6 +2131,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	 */
>  	smmu->numpage = resource_size(res);
>  
> +	smmu = arm_smmu_impl_init(smmu);
> +	if (IS_ERR(smmu))
> +		return PTR_ERR(smmu);
> +
>  	num_irqs = 0;
>  	while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
>  		num_irqs++;
> 


Reviewed-by: Jon Hunter <jonathanh@nvidia.com>

Thanks!
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage
  2020-07-08  5:00 ` [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage Krishna Reddy
@ 2020-07-08 12:24   ` Jon Hunter
  2020-07-08 20:36   ` Nicolin Chen
  1 sibling, 0 replies; 25+ messages in thread
From: Jon Hunter @ 2020-07-08 12:24 UTC (permalink / raw)
  To: Krishna Reddy, joro, will, robin.murphy, robh+dt, treding
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka


On 08/07/2020 06:00, Krishna Reddy wrote:
> NVIDIA's Tegra194 SoC has three ARM MMU-500 instances.
> It uses two of the ARM MMU-500s together to interleave IOVA
> accesses across them and must be programmed identically.
> This implementation supports programming the two ARM MMU-500s
> that must be programmed identically.
> 
> The third ARM MMU-500 instance is supported by standard
> arm-smmu.c driver itself.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  MAINTAINERS                     |   2 +
>  drivers/iommu/Makefile          |   2 +-
>  drivers/iommu/arm-smmu-impl.c   |   3 +
>  drivers/iommu/arm-smmu-nvidia.c | 179 ++++++++++++++++++++++++++++++++
>  drivers/iommu/arm-smmu.c        |   1 +
>  drivers/iommu/arm-smmu.h        |   1 +
>  6 files changed, 187 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c23352059a6b..534cedaf8e55 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16811,8 +16811,10 @@ F:	drivers/i2c/busses/i2c-tegra.c
>  
>  TEGRA IOMMU DRIVERS
>  M:	Thierry Reding <thierry.reding@gmail.com>
> +R:	Krishna Reddy <vdumpa@nvidia.com>
>  L:	linux-tegra@vger.kernel.org
>  S:	Supported
> +F:	drivers/iommu/arm-smmu-nvidia.c
>  F:	drivers/iommu/tegra*
>  
>  TEGRA KBC DRIVER
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 342190196dfb..2b8203db73ec 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -15,7 +15,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o
>  obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
>  obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
>  obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
> -arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
> +arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o arm-smmu-qcom.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
>  obj-$(CONFIG_DMAR_TABLE) += intel/dmar.o
>  obj-$(CONFIG_INTEL_IOMMU) += intel/iommu.o intel/pasid.o
> diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> index c75b9d957b70..f15571d05474 100644
> --- a/drivers/iommu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm-smmu-impl.c
> @@ -171,6 +171,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
>  	if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
>  		smmu->impl = &calxeda_impl;
>  
> +	if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
> +		return nvidia_smmu_impl_init(smmu);
> +

I wonder if we should be matching nvidia,smmu-500 here as well because
any device that has that we want to call nvidia_smmu_impl_init(). I
understand that there is only tegra194 today, but seems funny to match
nvidia,smmu-500 below and then nvidia,tegra194-smmu here. That said ...

>  	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
>  	    of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
>  		return qcom_smmu_impl_init(smmu);
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> new file mode 100644
> index 000000000000..2f55e5793d34
> --- /dev/null
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -0,0 +1,179 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (C) 2019-2020 NVIDIA CORPORATION.  All rights reserved.
> +
> +#include <linux/bitfield.h>
> +#include <linux/delay.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#include "arm-smmu.h"
> +
> +/*
> + * Tegra194 has three ARM MMU-500 Instances.
> + * Two of them are used together and must be programmed identically for
> + * interleaved IOVA accesses across them and translates accesses from
> + * non-isochronous HW devices.
> + * Third one is used for translating accesses from isochronous HW devices.
> + * This implementation supports programming of the two instances that must
> + * be programmed identically.
> + * The third instance usage is through standard arm-smmu driver itself and
> + * is out of scope of this implementation.
> + */
> +#define NUM_SMMU_INSTANCES 2
> +
> +struct nvidia_smmu {
> +	struct arm_smmu_device	smmu;
> +	void __iomem		*bases[NUM_SMMU_INSTANCES];
> +};
> +
> +static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
> +					     unsigned int inst, int page)
> +{
> +	struct nvidia_smmu *nvidia_smmu;
> +
> +	nvidia_smmu = container_of(smmu, struct nvidia_smmu, smmu);
> +	return nvidia_smmu->bases[inst] + (page << smmu->pgshift);
> +}
> +
> +static u32 nvidia_smmu_read_reg(struct arm_smmu_device *smmu,
> +				int page, int offset)
> +{
> +	void __iomem *reg = nvidia_smmu_page(smmu, 0, page) + offset;
> +
> +	return readl_relaxed(reg);
> +}
> +
> +static void nvidia_smmu_write_reg(struct arm_smmu_device *smmu,
> +				  int page, int offset, u32 val)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
> +		void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
> +
> +		writel_relaxed(val, reg);
> +	}
> +}
> +
> +static u64 nvidia_smmu_read_reg64(struct arm_smmu_device *smmu,
> +				  int page, int offset)
> +{
> +	void __iomem *reg = nvidia_smmu_page(smmu, 0, page) + offset;
> +
> +	return readq_relaxed(reg);
> +}
> +
> +static void nvidia_smmu_write_reg64(struct arm_smmu_device *smmu,
> +				    int page, int offset, u64 val)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
> +		void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
> +
> +		writeq_relaxed(val, reg);
> +	}
> +}
> +
> +static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
> +				 int sync, int status)
> +{
> +	unsigned int delay;
> +
> +	arm_smmu_writel(smmu, page, sync, 0);
> +
> +	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
> +		unsigned int spin_cnt;
> +
> +		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> +			u32 val = 0;
> +			unsigned int i;
> +
> +			for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
> +				void __iomem *reg;
> +
> +				reg = nvidia_smmu_page(smmu, i, page) + status;
> +				val |= readl_relaxed(reg);
> +			}
> +
> +			if (!(val & ARM_SMMU_sTLBGSTATUS_GSACTIVE))
> +				return;
> +
> +			cpu_relax();
> +		}
> +
> +		udelay(delay);
> +	}
> +
> +	dev_err_ratelimited(smmu->dev,
> +			    "TLB sync timed out -- SMMU may be deadlocked\n");
> +}
> +
> +static int nvidia_smmu_reset(struct arm_smmu_device *smmu)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
> +		u32 val;
> +		void __iomem *reg = nvidia_smmu_page(smmu, i, ARM_SMMU_GR0) +
> +				    ARM_SMMU_GR0_sGFSR;
> +
> +		/* clear global FSR */
> +		val = readl_relaxed(reg);
> +		writel_relaxed(val, reg);
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct arm_smmu_impl nvidia_smmu_impl = {
> +	.read_reg = nvidia_smmu_read_reg,
> +	.write_reg = nvidia_smmu_write_reg,
> +	.read_reg64 = nvidia_smmu_read_reg64,
> +	.write_reg64 = nvidia_smmu_write_reg64,
> +	.reset = nvidia_smmu_reset,
> +	.tlb_sync = nvidia_smmu_tlb_sync,
> +};
> +
> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> +{
> +	struct resource *res;
> +	struct device *dev = smmu->dev;
> +	struct nvidia_smmu *nvidia_smmu;
> +	struct platform_device *pdev = to_platform_device(dev);
> +
> +	nvidia_smmu = devm_kzalloc(dev, sizeof(*nvidia_smmu), GFP_KERNEL);
> +	if (!nvidia_smmu)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/*
> +	 * Copy the data from struct arm_smmu_device *smmu allocated in
> +	 * arm-smmu.c. The smmu from struct nvidia_smmu replaces the smmu
> +	 * pointer used in arm-smmu.c once this function returns.
> +	 * This is necessary to derive nvidia_smmu from smmu pointer passed
> +	 * through arm_smmu_impl function calls subsequently.
> +	 */
> +	nvidia_smmu->smmu = *smmu;
> +	/* Instance 0 is ioremapped by arm-smmu.c. */
> +	nvidia_smmu->bases[0] = smmu->base;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
> +	if (!res)
> +		return ERR_PTR(-ENODEV);
> +
> +	nvidia_smmu->bases[1] = devm_ioremap_resource(dev, res);
> +	if (IS_ERR(nvidia_smmu->bases[1]))
> +		return ERR_CAST(nvidia_smmu->bases[1]);
> +
> +	nvidia_smmu->smmu.impl = &nvidia_smmu_impl;
> +
> +	/*
> +	 * Free the struct arm_smmu_device *smmu allocated in arm-smmu.c.
> +	 * Once this function returns, arm-smmu.c would use arm_smmu_device
> +	 * allocated as part of struct nvidia_smmu.
> +	 */
> +	devm_kfree(dev, smmu);
> +
> +	return &nvidia_smmu->smmu;
> +}
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index e03e873d3bca..c123a5814f70 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1943,6 +1943,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
>  	{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
>  	{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
>  	{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
> +	{ .compatible = "nvidia,smmu-500", .data = &arm_mmu500 },
>  	{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
>  	{ },
>  };
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c7d0122a7c6c..fad63efa1a72 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -452,6 +452,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device *smmu, int page,
>  	arm_smmu_writeq((s), ARM_SMMU_CB((s), (n)), (o), (v))
>  
>  struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
>  struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
>  
>  int arm_mmu500_reset(struct arm_smmu_device *smmu);
> 

Reviewed-by: Jon Hunter <jonathanh@nvidia.com>

Thanks
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  2020-07-08  5:00 ` [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU Krishna Reddy
@ 2020-07-08 12:25   ` Jon Hunter
  2020-07-09 20:13   ` Rob Herring
  1 sibling, 0 replies; 25+ messages in thread
From: Jon Hunter @ 2020-07-08 12:25 UTC (permalink / raw)
  To: Krishna Reddy, joro, will, robin.murphy, robh+dt, treding
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka


On 08/07/2020 06:00, Krishna Reddy wrote:
> Add binding for NVIDIA's Tegra194 SoC SMMU.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.yaml    | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index d7ceb4c34423..ac1f526c3424 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -38,6 +38,11 @@ properties:
>                - qcom,sc7180-smmu-500
>                - qcom,sdm845-smmu-500
>            - const: arm,mmu-500
> +      - description: NVIDIA SoCs that program two ARM MMU-500s identically
> +        items:
> +          - enum:
> +              - nvidia,tegra194-smmu
> +          - const: nvidia,smmu-500
>        - items:
>            - const: arm,mmu-500
>            - const: arm,smmu-v2
> @@ -138,6 +143,19 @@ required:
>  
>  additionalProperties: false
>  
> +allOf:
> +  - if:
> +      properties:
> +        compatible:
> +          contains:
> +            enum:
> +              - nvidia,tegra194-smmu
> +    then:
> +      properties:
> +        reg:
> +          minItems: 2
> +          maxItems: 2
> +
>  examples:
>    - |+
>      /* SMMU with stream matching or stream indexing */
> 

Reviewed-by: Jon Hunter <jonathanh@nvidia.com>

Thanks
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks
  2020-07-08  5:00 ` [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
@ 2020-07-08 12:28   ` Jon Hunter
  2020-07-08 20:36   ` Nicolin Chen
  2020-07-13 13:44   ` Will Deacon
  2 siblings, 0 replies; 25+ messages in thread
From: Jon Hunter @ 2020-07-08 12:28 UTC (permalink / raw)
  To: Krishna Reddy, joro, will, robin.murphy, robh+dt, treding
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka


On 08/07/2020 06:00, Krishna Reddy wrote:
> Add global/context fault hooks to allow vendor specific implementations
> override default fault interrupt handlers.
> 
> Update NVIDIA implementation to override the default global/context fault
> interrupt handlers and handle interrupts across the two ARM MMU-500s that
> are programmed identically.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 99 +++++++++++++++++++++++++++++++++
>  drivers/iommu/arm-smmu.c        | 17 +++++-
>  drivers/iommu/arm-smmu.h        |  3 +
>  3 files changed, 117 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index 2f55e5793d34..31368057e9be 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -127,6 +127,103 @@ static int nvidia_smmu_reset(struct arm_smmu_device *smmu)
>  	return 0;
>  }
>  
> +static irqreturn_t nvidia_smmu_global_fault_inst(int irq,
> +						 struct arm_smmu_device *smmu,
> +						 int inst)
> +{
> +	u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
> +	void __iomem *gr0_base = nvidia_smmu_page(smmu, inst, 0);
> +
> +	gfsr = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSR);
> +	if (!gfsr)
> +		return IRQ_NONE;
> +
> +	gfsynr0 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR0);
> +	gfsynr1 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR1);
> +	gfsynr2 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR2);
> +
> +	dev_err_ratelimited(smmu->dev,
> +			    "Unexpected global fault, this could be serious\n");
> +	dev_err_ratelimited(smmu->dev,
> +			    "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
> +			    gfsr, gfsynr0, gfsynr1, gfsynr2);
> +
> +	writel_relaxed(gfsr, gr0_base + ARM_SMMU_GR0_sGFSR);
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nvidia_smmu_global_fault(int irq, void *dev)
> +{
> +	unsigned int inst;
> +	irqreturn_t ret = IRQ_NONE;
> +	struct arm_smmu_device *smmu = dev;
> +
> +	for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
> +		irqreturn_t irq_ret;
> +
> +		irq_ret = nvidia_smmu_global_fault_inst(irq, smmu, inst);
> +		if (irq_ret == IRQ_HANDLED)
> +			ret = IRQ_HANDLED;
> +	}
> +
> +	return ret;
> +}
> +
> +static irqreturn_t nvidia_smmu_context_fault_bank(int irq,
> +						  struct arm_smmu_device *smmu,
> +						  int idx, int inst)
> +{
> +	u32 fsr, fsynr, cbfrsynra;
> +	unsigned long iova;
> +	void __iomem *gr1_base = nvidia_smmu_page(smmu, inst, 1);
> +	void __iomem *cb_base = nvidia_smmu_page(smmu, inst, smmu->numpage + idx);
> +
> +	fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
> +	if (!(fsr & ARM_SMMU_FSR_FAULT))
> +		return IRQ_NONE;
> +
> +	fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
> +	iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
> +	cbfrsynra = readl_relaxed(gr1_base + ARM_SMMU_GR1_CBFRSYNRA(idx));
> +
> +	dev_err_ratelimited(smmu->dev,
> +			    "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
> +			    fsr, iova, fsynr, cbfrsynra, idx);
> +
> +	writel_relaxed(fsr, cb_base + ARM_SMMU_CB_FSR);
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nvidia_smmu_context_fault(int irq, void *dev)
> +{
> +	int idx;
> +	unsigned int inst;
> +	irqreturn_t ret = IRQ_NONE;
> +	struct arm_smmu_device *smmu;
> +	struct iommu_domain *domain = dev;
> +	struct arm_smmu_domain *smmu_domain;
> +
> +	smmu_domain = container_of(domain, struct arm_smmu_domain, domain);
> +	smmu = smmu_domain->smmu;
> +
> +	for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
> +		irqreturn_t irq_ret;
> +
> +		/*
> +		 * Interrupt line is shared between all contexts.
> +		 * Check for faults across all contexts.
> +		 */
> +		for (idx = 0; idx < smmu->num_context_banks; idx++) {
> +			irq_ret = nvidia_smmu_context_fault_bank(irq, smmu,
> +								 idx, inst);
> +			if (irq_ret == IRQ_HANDLED)
> +				ret = IRQ_HANDLED;
> +		}
> +	}
> +
> +	return ret;
> +}
> +
>  static const struct arm_smmu_impl nvidia_smmu_impl = {
>  	.read_reg = nvidia_smmu_read_reg,
>  	.write_reg = nvidia_smmu_write_reg,
> @@ -134,6 +231,8 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
>  	.write_reg64 = nvidia_smmu_write_reg64,
>  	.reset = nvidia_smmu_reset,
>  	.tlb_sync = nvidia_smmu_tlb_sync,
> +	.global_fault = nvidia_smmu_global_fault,
> +	.context_fault = nvidia_smmu_context_fault,
>  };
>  
>  struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index c123a5814f70..020afddfaa0f 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -670,6 +670,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  	enum io_pgtable_fmt fmt;
>  	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	irqreturn_t (*context_fault)(int irq, void *dev);
>  
>  	mutex_lock(&smmu_domain->init_mutex);
>  	if (smmu_domain->smmu)
> @@ -832,7 +833,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  	 * handler seeing a half-initialised domain state.
>  	 */
>  	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> -	ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
> +
> +	if (smmu->impl && smmu->impl->context_fault)
> +		context_fault = smmu->impl->context_fault;
> +	else
> +		context_fault = arm_smmu_context_fault;
> +
> +	ret = devm_request_irq(smmu->dev, irq, context_fault,
>  			       IRQF_SHARED, "arm-smmu-context-fault", domain);
>  	if (ret < 0) {
>  		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
> @@ -2105,6 +2112,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  	struct arm_smmu_device *smmu;
>  	struct device *dev = &pdev->dev;
>  	int num_irqs, i, err;
> +	irqreturn_t (*global_fault)(int irq, void *dev);
>  
>  	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
>  	if (!smmu) {
> @@ -2191,9 +2199,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>  		smmu->num_context_irqs = smmu->num_context_banks;
>  	}
>  
> +	if (smmu->impl && smmu->impl->global_fault)
> +		global_fault = smmu->impl->global_fault;
> +	else
> +		global_fault = arm_smmu_global_fault;
> +
>  	for (i = 0; i < smmu->num_global_irqs; ++i) {
>  		err = devm_request_irq(smmu->dev, smmu->irqs[i],
> -				       arm_smmu_global_fault,
> +				       global_fault,
>  				       IRQF_SHARED,
>  				       "arm-smmu global fault",
>  				       smmu);
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index fad63efa1a72..d890a4a968e8 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -18,6 +18,7 @@
>  #include <linux/io-64-nonatomic-hi-lo.h>
>  #include <linux/io-pgtable.h>
>  #include <linux/iommu.h>
> +#include <linux/irqreturn.h>
>  #include <linux/mutex.h>
>  #include <linux/spinlock.h>
>  #include <linux/types.h>
> @@ -389,6 +390,8 @@ struct arm_smmu_impl {
>  	void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
>  			 int status);
>  	int (*def_domain_type)(struct device *dev);
> +	irqreturn_t (*global_fault)(int irq, void *dev);
> +	irqreturn_t (*context_fault)(int irq, void *dev);
>  };
>  
>  static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)
> 


Reviewed-by: Jon Hunter <jonathanh@nvidia.com>

Thanks!
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks
  2020-07-08  5:00 ` [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
  2020-07-08 12:28   ` Jon Hunter
@ 2020-07-08 20:36   ` Nicolin Chen
  2020-07-13 13:44   ` Will Deacon
  2 siblings, 0 replies; 25+ messages in thread
From: Nicolin Chen @ 2020-07-08 20:36 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, will, robin.murphy, robh+dt, treding, jonathanh,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman

On Tue, Jul 07, 2020 at 10:00:17PM -0700, Krishna Reddy wrote:
> Add global/context fault hooks to allow vendor specific implementations
> override default fault interrupt handlers.
> 
> Update NVIDIA implementation to override the default global/context fault
> interrupt handlers and handle interrupts across the two ARM MMU-500s that
> are programmed identically.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>

Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage
  2020-07-08  5:00 ` [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage Krishna Reddy
  2020-07-08 12:24   ` Jon Hunter
@ 2020-07-08 20:36   ` Nicolin Chen
  1 sibling, 0 replies; 25+ messages in thread
From: Nicolin Chen @ 2020-07-08 20:36 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, will, robin.murphy, robh+dt, treding, jonathanh,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman

On Tue, Jul 07, 2020 at 10:00:15PM -0700, Krishna Reddy wrote:
> NVIDIA's Tegra194 SoC has three ARM MMU-500 instances.
> It uses two of the ARM MMU-500s together to interleave IOVA
> accesses across them and must be programmed identically.
> This implementation supports programming the two ARM MMU-500s
> that must be programmed identically.
> 
> The third ARM MMU-500 instance is supported by standard
> arm-smmu.c driver itself.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>

Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros
  2020-07-08  5:00 ` [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros Krishna Reddy
  2020-07-08 12:05   ` Jon Hunter
@ 2020-07-08 20:37   ` Nicolin Chen
  1 sibling, 0 replies; 25+ messages in thread
From: Nicolin Chen @ 2020-07-08 20:37 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, will, robin.murphy, robh+dt, treding, jonathanh,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman

On Tue, Jul 07, 2020 at 10:00:13PM -0700, Krishna Reddy wrote:
> Move TLB timeout and spin count macros to header file to
> allow using the same from vendor specific implementations.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>

Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init
  2020-07-08  5:00 ` [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init Krishna Reddy
  2020-07-08 12:05   ` Jon Hunter
@ 2020-07-08 20:37   ` Nicolin Chen
  2020-07-13 14:02   ` Robin Murphy
  2 siblings, 0 replies; 25+ messages in thread
From: Nicolin Chen @ 2020-07-08 20:37 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, will, robin.murphy, robh+dt, treding, jonathanh,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman

On Tue, Jul 07, 2020 at 10:00:14PM -0700, Krishna Reddy wrote:
> ioremap smmu mmio region before calling into implementation init.
> This is necessary to allow mapped address available during vendor
> specific implementation init.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>

Reviewed-by: Nicolin Chen <nicoleotsuka@gmail.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  2020-07-08  5:00 ` [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU Krishna Reddy
  2020-07-08 12:25   ` Jon Hunter
@ 2020-07-09 20:13   ` Rob Herring
  2020-07-10 20:29     ` Krishna Reddy
  1 sibling, 1 reply; 25+ messages in thread
From: Rob Herring @ 2020-07-09 20:13 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, will, robin.murphy, treding, jonathanh, devicetree,
	linux-arm-kernel, iommu, linux-kernel, linux-tegra, yhsu, snikam,
	praithatha, talho, bbiswas, mperttunen, nicolinc, bhuntsman,
	nicoleotsuka

On Tue, Jul 07, 2020 at 10:00:16PM -0700, Krishna Reddy wrote:
> Add binding for NVIDIA's Tegra194 SoC SMMU.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.yaml    | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index d7ceb4c34423..ac1f526c3424 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -38,6 +38,11 @@ properties:
>                - qcom,sc7180-smmu-500
>                - qcom,sdm845-smmu-500
>            - const: arm,mmu-500
> +      - description: NVIDIA SoCs that program two ARM MMU-500s identically
> +        items:
> +          - enum:
> +              - nvidia,tegra194-smmu
> +          - const: nvidia,smmu-500
>        - items:
>            - const: arm,mmu-500
>            - const: arm,smmu-v2
> @@ -138,6 +143,19 @@ required:
>  
>  additionalProperties: false
>  
> +allOf:
> +  - if:
> +      properties:
> +        compatible:
> +          contains:
> +            enum:
> +              - nvidia,tegra194-smmu
> +    then:
> +      properties:
> +        reg:
> +          minItems: 2
> +          maxItems: 2

This doesn't work. The main part of the schema already said there's only 
1 reg region. This part is ANDed with that, not an override. You need to 
add an else clause with 'maxItems: 1' and change the base schema to 
{minItems: 1, maxItems: 2}.

Rob

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  2020-07-09 20:13   ` Rob Herring
@ 2020-07-10 20:29     ` Krishna Reddy
  2020-07-13 14:10       ` Robin Murphy
  0 siblings, 1 reply; 25+ messages in thread
From: Krishna Reddy @ 2020-07-10 20:29 UTC (permalink / raw)
  To: Rob Herring, Robin Murphy
  Cc: joro, will, robin.murphy, Thierry Reding, Jonathan Hunter,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	Yu-Huan Hsu, Sachin Nikam, Pritesh Raithatha, Timo Alho,
	Bitan Biswas, Mikko Perttunen, Nicolin Chen, Bryan Huntsman,
	nicoleotsuka

Thanks Rob. One question on setting "minItems: ". Please see below.

>> +allOf:
>> +  - if:
>> +      properties:
>> +        compatible:
>> +          contains:
>> +            enum:
>> +              - nvidia,tegra194-smmu
>> +    then:
>> +      properties:
>> +        reg:
>> +          minItems: 2
>> +          maxItems: 2

>This doesn't work. The main part of the schema already said there's only
>1 reg region. This part is ANDed with that, not an override. You need to add an else clause with 'maxItems: 1' and change the base schema to
>{minItems: 1, maxItems: 2}.

As the earlier version of base schema doesn't have "minItems: " set, should it be set to 0 for backward compatibility?  Or can it just be omitted setting in base schema as before?

"else" part to set "maxItems: 1" and setting "maxItems: 2" in base schema is clear to me.


-KR

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks
  2020-07-08  5:00 ` [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
  2020-07-08 12:28   ` Jon Hunter
  2020-07-08 20:36   ` Nicolin Chen
@ 2020-07-13 13:44   ` Will Deacon
  2020-07-17 11:58     ` Robin Murphy
  2 siblings, 1 reply; 25+ messages in thread
From: Will Deacon @ 2020-07-13 13:44 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, robin.murphy, robh+dt, treding, jonathanh, devicetree,
	linux-arm-kernel, iommu, linux-kernel, linux-tegra, yhsu, snikam,
	praithatha, talho, bbiswas, mperttunen, nicolinc, bhuntsman,
	nicoleotsuka

On Tue, Jul 07, 2020 at 10:00:17PM -0700, Krishna Reddy wrote:
> Add global/context fault hooks to allow vendor specific implementations
> override default fault interrupt handlers.
> 
> Update NVIDIA implementation to override the default global/context fault
> interrupt handlers and handle interrupts across the two ARM MMU-500s that
> are programmed identically.
> 
> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 99 +++++++++++++++++++++++++++++++++
>  drivers/iommu/arm-smmu.c        | 17 +++++-
>  drivers/iommu/arm-smmu.h        |  3 +
>  3 files changed, 117 insertions(+), 2 deletions(-)

Given that faults shouldn't occur during normal operation, is this patch
actually necessary?

Will

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 0/5] NVIDIA ARM SMMU Implementation
  2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
                   ` (4 preceding siblings ...)
  2020-07-08  5:00 ` [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
@ 2020-07-13 13:50 ` Will Deacon
  2020-07-17 10:03   ` Will Deacon
  5 siblings, 1 reply; 25+ messages in thread
From: Will Deacon @ 2020-07-13 13:50 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, robin.murphy, robh+dt, treding, jonathanh, devicetree,
	linux-arm-kernel, iommu, linux-kernel, linux-tegra, yhsu, snikam,
	praithatha, talho, bbiswas, mperttunen, nicolinc, bhuntsman,
	nicoleotsuka

On Tue, Jul 07, 2020 at 10:00:12PM -0700, Krishna Reddy wrote:
> Changes in v10:
> Perform SMMU base ioremap before calling implementation init.
> Check for Global faults across both ARM MMU-500s during global interrupt.
> Check for context faults across all contexts of both ARM MMU-500s during context fault interrupt.
> Add new DT binding nvidia,smmu-500 for NVIDIA implementation.

Please repost based on my SMMU queue, as this doesn't currently apply.

https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates

Will

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init
  2020-07-08  5:00 ` [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init Krishna Reddy
  2020-07-08 12:05   ` Jon Hunter
  2020-07-08 20:37   ` Nicolin Chen
@ 2020-07-13 14:02   ` Robin Murphy
  2 siblings, 0 replies; 25+ messages in thread
From: Robin Murphy @ 2020-07-13 14:02 UTC (permalink / raw)
  To: Krishna Reddy, joro, will, robh+dt, treding, jonathanh
  Cc: devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	yhsu, snikam, praithatha, talho, bbiswas, mperttunen, nicolinc,
	bhuntsman, nicoleotsuka

On 2020-07-08 06:00, Krishna Reddy wrote:
> ioremap smmu mmio region before calling into implementation init.
> This is necessary to allow mapped address available during vendor
> specific implementation init.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
> ---
>   drivers/iommu/arm-smmu.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index d2054178df35..e03e873d3bca 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2120,10 +2120,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>   	if (err)
>   		return err;
>   
> -	smmu = arm_smmu_impl_init(smmu);
> -	if (IS_ERR(smmu))
> -		return PTR_ERR(smmu);
> -
>   	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   	ioaddr = res->start;
>   	smmu->base = devm_ioremap_resource(dev, res);
> @@ -2135,6 +2131,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>   	 */
>   	smmu->numpage = resource_size(res);
>   
> +	smmu = arm_smmu_impl_init(smmu);
> +	if (IS_ERR(smmu))
> +		return PTR_ERR(smmu);
> +
>   	num_irqs = 0;
>   	while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
>   		num_irqs++;
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  2020-07-10 20:29     ` Krishna Reddy
@ 2020-07-13 14:10       ` Robin Murphy
  2020-07-14 14:22         ` Rob Herring
  0 siblings, 1 reply; 25+ messages in thread
From: Robin Murphy @ 2020-07-13 14:10 UTC (permalink / raw)
  To: Krishna Reddy, Rob Herring
  Cc: joro, will, Thierry Reding, Jonathan Hunter, devicetree,
	linux-arm-kernel, iommu, linux-kernel, linux-tegra, Yu-Huan Hsu,
	Sachin Nikam, Pritesh Raithatha, Timo Alho, Bitan Biswas,
	Mikko Perttunen, Nicolin Chen, Bryan Huntsman, nicoleotsuka

On 2020-07-10 21:29, Krishna Reddy wrote:
> Thanks Rob. One question on setting "minItems: ". Please see below.
> 
>>> +allOf:
>>> +  - if:
>>> +      properties:
>>> +        compatible:
>>> +          contains:
>>> +            enum:
>>> +              - nvidia,tegra194-smmu
>>> +    then:
>>> +      properties:
>>> +        reg:
>>> +          minItems: 2
>>> +          maxItems: 2
> 
>> This doesn't work. The main part of the schema already said there's only
>> 1 reg region. This part is ANDed with that, not an override. You need to add an else clause with 'maxItems: 1' and change the base schema to
>> {minItems: 1, maxItems: 2}.
> 
> As the earlier version of base schema doesn't have "minItems: " set, should it be set to 0 for backward compatibility?  Or can it just be omitted setting in base schema as before?

We've always needed at least 1 "reg" specifier in practice, so I don't 
think being backwards-compatible with broken DTs is a concern :)

Robin.

> 
> "else" part to set "maxItems: 1" and setting "maxItems: 2" in base schema is clear to me.
> 
> 
> -KR
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU
  2020-07-13 14:10       ` Robin Murphy
@ 2020-07-14 14:22         ` Rob Herring
  0 siblings, 0 replies; 25+ messages in thread
From: Rob Herring @ 2020-07-14 14:22 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Krishna Reddy, joro, will, Thierry Reding, Jonathan Hunter,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	Yu-Huan Hsu, Sachin Nikam, Pritesh Raithatha, Timo Alho,
	Bitan Biswas, Mikko Perttunen, Nicolin Chen, Bryan Huntsman,
	nicoleotsuka

On Mon, Jul 13, 2020 at 8:10 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-07-10 21:29, Krishna Reddy wrote:
> > Thanks Rob. One question on setting "minItems: ". Please see below.
> >
> >>> +allOf:
> >>> +  - if:
> >>> +      properties:
> >>> +        compatible:
> >>> +          contains:
> >>> +            enum:
> >>> +              - nvidia,tegra194-smmu
> >>> +    then:
> >>> +      properties:
> >>> +        reg:
> >>> +          minItems: 2
> >>> +          maxItems: 2
> >
> >> This doesn't work. The main part of the schema already said there's only
> >> 1 reg region. This part is ANDed with that, not an override. You need to add an else clause with 'maxItems: 1' and change the base schema to
> >> {minItems: 1, maxItems: 2}.
> >
> > As the earlier version of base schema doesn't have "minItems: " set, should it be set to 0 for backward compatibility?  Or can it just be omitted setting in base schema as before?
>
> We've always needed at least 1 "reg" specifier in practice, so I don't
> think being backwards-compatible with broken DTs is a concern :)

'minItems: 0' would be a boolean (e.g. "reg;") and I'm not sure that's
even really valid json-schema. What you'd want here is 'reg' not
present (i.e. not in 'required').

Rob

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 0/5] NVIDIA ARM SMMU Implementation
  2020-07-13 13:50 ` [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Will Deacon
@ 2020-07-17 10:03   ` Will Deacon
  2020-07-17 22:36     ` Krishna Reddy
  0 siblings, 1 reply; 25+ messages in thread
From: Will Deacon @ 2020-07-17 10:03 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: joro, robin.murphy, robh+dt, treding, jonathanh, devicetree,
	linux-arm-kernel, iommu, linux-kernel, linux-tegra, yhsu, snikam,
	praithatha, talho, bbiswas, mperttunen, nicolinc, bhuntsman,
	nicoleotsuka

On Mon, Jul 13, 2020 at 02:50:20PM +0100, Will Deacon wrote:
> On Tue, Jul 07, 2020 at 10:00:12PM -0700, Krishna Reddy wrote:
> > Changes in v10:
> > Perform SMMU base ioremap before calling implementation init.
> > Check for Global faults across both ARM MMU-500s during global interrupt.
> > Check for context faults across all contexts of both ARM MMU-500s during context fault interrupt.
> > Add new DT binding nvidia,smmu-500 for NVIDIA implementation.
> 
> Please repost based on my SMMU queue, as this doesn't currently apply.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates

Any update on this, please? It would be a shame to miss 5.9 now that the
patches look alright.

Will

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks
  2020-07-13 13:44   ` Will Deacon
@ 2020-07-17 11:58     ` Robin Murphy
  0 siblings, 0 replies; 25+ messages in thread
From: Robin Murphy @ 2020-07-17 11:58 UTC (permalink / raw)
  To: Will Deacon, Krishna Reddy
  Cc: snikam, devicetree, mperttunen, praithatha, bhuntsman,
	linux-kernel, jonathanh, talho, iommu, robh+dt, nicolinc,
	linux-tegra, yhsu, treding, linux-arm-kernel, bbiswas

On 2020-07-13 14:44, Will Deacon wrote:
> On Tue, Jul 07, 2020 at 10:00:17PM -0700, Krishna Reddy wrote:
>> Add global/context fault hooks to allow vendor specific implementations
>> override default fault interrupt handlers.
>>
>> Update NVIDIA implementation to override the default global/context fault
>> interrupt handlers and handle interrupts across the two ARM MMU-500s that
>> are programmed identically.
>>
>> Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
>> ---
>>   drivers/iommu/arm-smmu-nvidia.c | 99 +++++++++++++++++++++++++++++++++
>>   drivers/iommu/arm-smmu.c        | 17 +++++-
>>   drivers/iommu/arm-smmu.h        |  3 +
>>   3 files changed, 117 insertions(+), 2 deletions(-)
> 
> Given that faults shouldn't occur during normal operation, is this patch
> actually necessary?

Indeed they shouldn't, but if something *does* happen to go wrong then I 
think it's worth having proper handling in place, since the consequences 
otherwise include a screaming "spurious" fault or just silently losing 
some transactions and possibly locking up part of the system altogether 
(depending on HUPCF at least - I recall MMU-500 also behaving funnily 
WRT TLB maintenance while an IRQ is outstanding, but that was long 
enough ago that it might have been related to the old CFCFG behaviour).

Until we sort out the reserved memory regions thing (the new IORT spec 
is due Real Soon Now(TM)...) some systems are going to keep suffering 
transient context faults during boot - those may make the display 
unhappy until it gets reset, but we certainly don't want to invite the 
possibility of them wedging the SMMU itself.

Robin.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH v10 0/5] NVIDIA ARM SMMU Implementation
  2020-07-17 10:03   ` Will Deacon
@ 2020-07-17 22:36     ` Krishna Reddy
  0 siblings, 0 replies; 25+ messages in thread
From: Krishna Reddy @ 2020-07-17 22:36 UTC (permalink / raw)
  To: Will Deacon
  Cc: joro, robin.murphy, robh+dt, Thierry Reding, Jonathan Hunter,
	devicetree, linux-arm-kernel, iommu, linux-kernel, linux-tegra,
	Yu-Huan Hsu, Sachin Nikam, Pritesh Raithatha, Timo Alho,
	Bitan Biswas, Mikko Perttunen, Nicolin Chen, Bryan Huntsman,
	nicoleotsuka

>On Mon, Jul 13, 2020 at 02:50:20PM +0100, Will Deacon wrote:
>> On Tue, Jul 07, 2020 at 10:00:12PM -0700, Krishna Reddy wrote:
> >> Changes in v10:
>> > Perform SMMU base ioremap before calling implementation init.
>> > Check for Global faults across both ARM MMU-500s during global interrupt.
>> > Check for context faults across all contexts of both ARM MMU-500s during context fault interrupt.
> >> Add new DT binding nvidia,smmu-500 for NVIDIA implementation.
>>
>> Please repost based on my SMMU queue, as this doesn't currently apply.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=
>> for-joerg/arm-smmu/updates

>Any update on this, please? It would be a shame to miss 5.9 now that the patches look alright.

Thanks for pinging.
I have been out of the office this week. Just started working on reposting patches.

Will

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-07-17 22:36 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08  5:00 [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Krishna Reddy
2020-07-08  5:00 ` [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros Krishna Reddy
2020-07-08 12:05   ` Jon Hunter
2020-07-08 20:37   ` Nicolin Chen
2020-07-08  5:00 ` [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init Krishna Reddy
2020-07-08 12:05   ` Jon Hunter
2020-07-08 20:37   ` Nicolin Chen
2020-07-13 14:02   ` Robin Murphy
2020-07-08  5:00 ` [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage Krishna Reddy
2020-07-08 12:24   ` Jon Hunter
2020-07-08 20:36   ` Nicolin Chen
2020-07-08  5:00 ` [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU Krishna Reddy
2020-07-08 12:25   ` Jon Hunter
2020-07-09 20:13   ` Rob Herring
2020-07-10 20:29     ` Krishna Reddy
2020-07-13 14:10       ` Robin Murphy
2020-07-14 14:22         ` Rob Herring
2020-07-08  5:00 ` [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
2020-07-08 12:28   ` Jon Hunter
2020-07-08 20:36   ` Nicolin Chen
2020-07-13 13:44   ` Will Deacon
2020-07-17 11:58     ` Robin Murphy
2020-07-13 13:50 ` [PATCH v10 0/5] NVIDIA ARM SMMU Implementation Will Deacon
2020-07-17 10:03   ` Will Deacon
2020-07-17 22:36     ` Krishna Reddy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).