[PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation

linux-kernel.vger.kernel.org archive mirror

* [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation
@ 2020-05-21 23:31 Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 1/5] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage Krishna Reddy
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-21 23:31 UTC (permalink / raw)
  Cc: joro, will, robin.murphy, linux-arm-kernel, iommu, linux-kernel,
	linux-tegra, treding, yhsu, snikam, praithatha, talho, bbiswas,
	mperttunen, nicolinc, bhuntsman, Krishna Reddy

Changes in v5:
Rebased on top of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next

v4 - https://lkml.org/lkml/2019/10/30/1054
v3 - https://lkml.org/lkml/2019/10/18/1601
v2 - https://lkml.org/lkml/2019/9/2/980
v1 - https://lkml.org/lkml/2019/8/29/1588

Krishna Reddy (5):
  iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage
  dt-bindings: arm-smmu: Add binding for Tegra194 SMMU
  iommu/arm-smmu: Add global/context fault implementation hooks
  arm64: tegra: Add DT node for T194 SMMU
  arm64: tegra: enable SMMU for SDHCI and EQOS on T194

 .../devicetree/bindings/iommu/arm,smmu.yaml   |   5 +
 MAINTAINERS                                   |   2 +
 arch/arm64/boot/dts/nvidia/tegra194.dtsi      |  81 ++++++
 drivers/iommu/Makefile                        |   2 +-
 drivers/iommu/arm-smmu-impl.c                 |   3 +
 drivers/iommu/arm-smmu-nvidia.c               | 261 ++++++++++++++++++
 drivers/iommu/arm-smmu.c                      |  11 +-
 drivers/iommu/arm-smmu.h                      |   4 +
 8 files changed, 366 insertions(+), 3 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-nvidia.c


base-commit: 365f8d504da50feaebf826d180113529c9383670
-- 
2.26.2



* [PATCH v5 1/5] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage
  2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
@ 2020-05-21 23:31 ` Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 2/5] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU Krishna Reddy
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-21 23:31 UTC (permalink / raw)
  Cc: joro, will, robin.murphy, linux-arm-kernel, iommu, linux-kernel,
	linux-tegra, treding, yhsu, snikam, praithatha, talho, bbiswas,
	mperttunen, nicolinc, bhuntsman, Krishna Reddy

NVIDIA's Tegra194 SoC uses two ARM MMU-500s together to interleave
IOVA accesses across them.
Add an NVIDIA implementation for dual ARM MMU-500s and add a new
compatible string for the Tegra194 SoC.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 MAINTAINERS                     |   2 +
 drivers/iommu/Makefile          |   2 +-
 drivers/iommu/arm-smmu-impl.c   |   3 +
 drivers/iommu/arm-smmu-nvidia.c | 161 ++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu.h        |   1 +
 5 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/arm-smmu-nvidia.c
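
Note on the register access model: the approach taken by the new file can
be sketched in plain user-space C. This is an illustration only, not
driver code. The names fake_smmu, fake_write_reg, fake_read_reg and
fake_sync_pending are hypothetical, and the register banks are ordinary
arrays rather than MMIO. Writes are mirrored to every instance, reads come
from instance 0, and the sync status is OR-ed across instances so that a
poll only completes once all of them are idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_INST   3      /* mirrors MAX_SMMU_INSTANCES in the patch */
#define NUM_REGS   16     /* arbitrary bank size for this sketch */
#define GSACTIVE   0x1u   /* stand-in for ARM_SMMU_sTLBGSTATUS_GSACTIVE */

/* Hypothetical model: each instance is a small array of 32-bit registers. */
struct fake_smmu {
	unsigned int num_inst;
	uint32_t regs[MAX_INST][NUM_REGS];
};

/* Mirror a write to every instance so all stay programmed identically. */
static inline void fake_write_reg(struct fake_smmu *s, unsigned int off,
				  uint32_t val)
{
	unsigned int i;

	assert(off < NUM_REGS && s->num_inst <= MAX_INST);
	for (i = 0; i < s->num_inst; i++)
		s->regs[i][off] = val;
}

/* Configuration reads only need instance 0, since all instances match. */
static inline uint32_t fake_read_reg(const struct fake_smmu *s,
				     unsigned int off)
{
	assert(off < NUM_REGS);
	return s->regs[0][off];
}

/* A sync is still pending if any instance reports GSACTIVE. */
static inline bool fake_sync_pending(const struct fake_smmu *s,
				     unsigned int status_off)
{
	uint32_t reg = 0;
	unsigned int i;

	assert(status_off < NUM_REGS);
	for (i = 0; i < s->num_inst; i++)
		reg |= s->regs[i][status_off];

	return (reg & GSACTIVE) != 0;
}
```

Reading only instance 0 is sufficient for configuration registers, but not
for per-instance status such as fault or sync state, which is why the
status check walks every instance.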

diff --git a/MAINTAINERS b/MAINTAINERS
index ecc0749810b0..0d8c966ecf17 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16560,9 +16560,11 @@ F:	drivers/i2c/busses/i2c-tegra.c
 
 TEGRA IOMMU DRIVERS
 M:	Thierry Reding <thierry.reding@gmail.com>
+R:	Krishna Reddy <vdumpa@nvidia.com>
 L:	linux-tegra@vger.kernel.org
 S:	Supported
 F:	drivers/iommu/tegra*
+F:	drivers/iommu/arm-smmu-nvidia.c
 
 TEGRA KBC DRIVER
 M:	Laxman Dewangan <ldewangan@nvidia.com>
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 57cf4ba5e27c..35542df00da7 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -15,7 +15,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o amd_iommu_quirks.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
-arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
+arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o arm-smmu-nvidia.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 74d97a886e93..dcdd513323aa 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -158,6 +158,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 	 */
 	switch (smmu->model) {
 	case ARM_MMU500:
+		if (of_device_is_compatible(smmu->dev->of_node,
+					    "nvidia,tegra194-smmu-500"))
+			return nvidia_smmu_impl_init(smmu);
 		smmu->impl = &arm_mmu500_impl;
 		break;
 	case CAVIUM_SMMUV2:
diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
new file mode 100644
index 000000000000..dafc293a4521
--- /dev/null
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Nvidia ARM SMMU v2 implementation quirks
+// Copyright (C) 2019 NVIDIA CORPORATION.  All rights reserved.
+
+#define pr_fmt(fmt) "nvidia-smmu: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/delay.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include "arm-smmu.h"
+
+/* Tegra194 has three ARM MMU-500 instances.
+ * Two of them are used together for interleaved IOVA accesses and
+ * are used by non-isochronous HW devices for SMMU translations.
+ * The third one is used for SMMU translations from isochronous HW devices.
+ * It is possible to use this implementation to program either
+ * all three or two of the instances identically, as desired, through
+ * the DT node.
+ *
+ * Programming all three instances identically comes with redundant TLB
+ * invalidations, as a given HW device never needs all three invalidated.
+ *
+ * When the Linux kernel supports multiple SMMU devices, the SMMU device used
+ * for isochronous HW devices should be added as a separate ARM MMU-500 device
+ * in DT and be programmed independently for efficient TLB invalidations.
+ *
+ */
+#define MAX_SMMU_INSTANCES 3
+
+#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
+#define TLB_SPIN_COUNT			10
+
+struct nvidia_smmu {
+	struct arm_smmu_device	smmu;
+	unsigned int		num_inst;
+	void __iomem		*bases[MAX_SMMU_INSTANCES];
+};
+
+#define to_nvidia_smmu(s) container_of(s, struct nvidia_smmu, smmu)
+
+#define nsmmu_page(smmu, inst, page) \
+	(((inst) ? to_nvidia_smmu(smmu)->bases[(inst)] : (smmu)->base) + \
+	((page) << (smmu)->pgshift))
+
+static u32 nsmmu_read_reg(struct arm_smmu_device *smmu,
+			      int page, int offset)
+{
+	return readl_relaxed(nsmmu_page(smmu, 0, page) + offset);
+}
+
+static void nsmmu_write_reg(struct arm_smmu_device *smmu,
+			    int page, int offset, u32 val)
+{
+	unsigned int i;
+
+	for (i = 0; i < to_nvidia_smmu(smmu)->num_inst; i++)
+		writel_relaxed(val, nsmmu_page(smmu, i, page) + offset);
+}
+
+static u64 nsmmu_read_reg64(struct arm_smmu_device *smmu,
+				int page, int offset)
+{
+	return readq_relaxed(nsmmu_page(smmu, 0, page) + offset);
+}
+
+static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
+				  int page, int offset, u64 val)
+{
+	unsigned int i;
+
+	for (i = 0; i < to_nvidia_smmu(smmu)->num_inst; i++)
+		writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
+}
+
+static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
+			   int sync, int status)
+{
+	u32 reg;
+	unsigned int i;
+	unsigned int spin_cnt, delay;
+
+	arm_smmu_writel(smmu, page, sync, 0);
+
+	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
+		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
+			reg = 0;
+			for (i = 0; i < to_nvidia_smmu(smmu)->num_inst; i++) {
+				reg |= readl_relaxed(
+					nsmmu_page(smmu, i, page) + status);
+			}
+			if (!(reg & ARM_SMMU_sTLBGSTATUS_GSACTIVE))
+				return;
+			cpu_relax();
+		}
+		udelay(delay);
+	}
+	dev_err_ratelimited(smmu->dev,
+			    "TLB sync timed out -- SMMU may be deadlocked\n");
+}
+
+static int nsmmu_reset(struct arm_smmu_device *smmu)
+{
+	u32 reg;
+	unsigned int i;
+
+	for (i = 0; i < to_nvidia_smmu(smmu)->num_inst; i++) {
+		/* clear global FSR */
+		reg = readl_relaxed(nsmmu_page(smmu, i, ARM_SMMU_GR0) +
+				    ARM_SMMU_GR0_sGFSR);
+		writel_relaxed(reg, nsmmu_page(smmu, i, ARM_SMMU_GR0) +
+				    ARM_SMMU_GR0_sGFSR);
+	}
+
+	return 0;
+}
+
+static const struct arm_smmu_impl nvidia_smmu_impl = {
+	.read_reg = nsmmu_read_reg,
+	.write_reg = nsmmu_write_reg,
+	.read_reg64 = nsmmu_read_reg64,
+	.write_reg64 = nsmmu_write_reg64,
+	.reset = nsmmu_reset,
+	.tlb_sync = nsmmu_tlb_sync,
+};
+
+struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
+{
+	unsigned int i;
+	struct nvidia_smmu *nsmmu;
+	struct resource *res;
+	struct device *dev = smmu->dev;
+	struct platform_device *pdev = to_platform_device(smmu->dev);
+
+	nsmmu = devm_kzalloc(smmu->dev, sizeof(*nsmmu), GFP_KERNEL);
+	if (!nsmmu)
+		return ERR_PTR(-ENOMEM);
+
+	nsmmu->smmu = *smmu;
+	/* Instance 0 is ioremapped by arm-smmu.c */
+	nsmmu->num_inst = 1;
+
+	for (i = 1; i < MAX_SMMU_INSTANCES; i++) {
+		res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+		if (!res)
+			break;
+		nsmmu->bases[i] = devm_ioremap_resource(dev, res);
+		if (IS_ERR(nsmmu->bases[i]))
+			return ERR_CAST(nsmmu->bases[i]);
+		nsmmu->num_inst++;
+	}
+
+	nsmmu->smmu.impl = &nvidia_smmu_impl;
+	devm_kfree(smmu->dev, smmu);
+	pr_info("NVIDIA ARM SMMU Implementation, Instances=%u\n",
+		nsmmu->num_inst);
+
+	return &nsmmu->smmu;
+}
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 8d1cd54d82a6..67c3c6f5c49e 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -450,6 +450,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device *smmu, int page,
 
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
 struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
+struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
 
 int arm_mmu500_reset(struct arm_smmu_device *smmu);
 
-- 
2.26.2



* [PATCH v5 2/5] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU
  2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 1/5] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage Krishna Reddy
@ 2020-05-21 23:31 ` Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 3/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-21 23:31 UTC (permalink / raw)
  Cc: joro, will, robin.murphy, linux-arm-kernel, iommu, linux-kernel,
	linux-tegra, treding, yhsu, snikam, praithatha, talho, bbiswas,
	mperttunen, nicolinc, bhuntsman, Krishna Reddy

Add a binding for NVIDIA's Tegra194 SoC SMMU, which is based
on ARM MMU-500.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 6515dbe47508..78aba7dd5a61 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -37,6 +37,11 @@ properties:
               - qcom,sc7180-smmu-500
               - qcom,sdm845-smmu-500
           - const: arm,mmu-500
+      - description: NVIDIA SoCs that use more than one "arm,mmu-500"
+        items:
+          - enum:
+              - nvidia,tegra194-smmu-500
+          - const: arm,mmu-500
       - items:
           - const: arm,mmu-500
           - const: arm,smmu-v2
-- 
2.26.2



* [PATCH v5 3/5] iommu/arm-smmu: Add global/context fault implementation hooks
  2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 1/5] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 2/5] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU Krishna Reddy
@ 2020-05-21 23:31 ` Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 4/5] arm64: tegra: Add DT node for T194 SMMU Krishna Reddy
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-21 23:31 UTC (permalink / raw)
  Cc: joro, will, robin.murphy, linux-arm-kernel, iommu, linux-kernel,
	linux-tegra, treding, yhsu, snikam, praithatha, talho, bbiswas,
	mperttunen, nicolinc, bhuntsman, Krishna Reddy

Add global/context fault hooks to allow the NVIDIA SMMU implementation
to handle faults across multiple SMMUs.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 drivers/iommu/arm-smmu-nvidia.c | 100 ++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu.c        |  11 +++-
 drivers/iommu/arm-smmu.h        |   3 +
 3 files changed, 112 insertions(+), 2 deletions(-)
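
Note on the hook selection: the change to arm-smmu.c amounts to "use the
implementation's fault hook if there is one, otherwise fall back to the
generic handler". A minimal user-space sketch of that pattern follows. All
names in it (fake_impl, fake_irqreturn, generic_fault, vendor_fault,
pick_handler, pick_global_fault) are hypothetical, and the return type is
a simplified stand-in for irqreturn_t.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's irqreturn_t. */
enum fake_irqreturn { FAKE_IRQ_NONE = 0, FAKE_IRQ_HANDLED = 1 };

typedef enum fake_irqreturn (*fault_fn)(int irq, void *dev);

/* Hypothetical implementation hooks; either pointer may be NULL. */
struct fake_impl {
	fault_fn global_fault;
	fault_fn context_fault;
};

/* Generic handler used when no implementation hook is provided. */
static inline enum fake_irqreturn generic_fault(int irq, void *dev)
{
	(void)irq;
	(void)dev;
	return FAKE_IRQ_NONE;
}

/* Example of an implementation-specific handler. */
static inline enum fake_irqreturn vendor_fault(int irq, void *dev)
{
	(void)irq;
	(void)dev;
	return FAKE_IRQ_HANDLED;
}

/* Prefer the hook when it exists, else use the fallback. */
static inline fault_fn pick_handler(fault_fn hook, fault_fn fallback)
{
	return hook ? hook : fallback;
}

/* Select the global fault handler for a device with an optional impl. */
static inline fault_fn pick_global_fault(const struct fake_impl *impl)
{
	return pick_handler(impl ? impl->global_fault : NULL, generic_fault);
}
```

Resolving the handler once, before the IRQ is requested, keeps the
interrupt path free of an extra indirection through the impl structure.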

diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
index dafc293a4521..5999b6a77099 100644
--- a/drivers/iommu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -117,6 +117,104 @@ static int nsmmu_reset(struct arm_smmu_device *smmu)
 	return 0;
 }
 
+static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct arm_smmu_domain, domain);
+}
+
+static irqreturn_t nsmmu_global_fault_inst(int irq,
+					       struct arm_smmu_device *smmu,
+					       int inst)
+{
+	u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
+
+	gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
+	gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+				ARM_SMMU_GR0_sGFSYNR0);
+	gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+				ARM_SMMU_GR0_sGFSYNR1);
+	gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+				ARM_SMMU_GR0_sGFSYNR2);
+
+	if (!gfsr)
+		return IRQ_NONE;
+
+	dev_err_ratelimited(smmu->dev,
+		"Unexpected global fault, this could be serious\n");
+	dev_err_ratelimited(smmu->dev,
+		"\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n",
+		gfsr, gfsynr0, gfsynr1, gfsynr2);
+
+	writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t nsmmu_global_fault(int irq, void *dev)
+{
+	int inst;
+	irqreturn_t irq_ret = IRQ_NONE;
+	struct arm_smmu_device *smmu = dev;
+
+	for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
+		irq_ret = nsmmu_global_fault_inst(irq, smmu, inst);
+		if (irq_ret == IRQ_HANDLED)
+			return irq_ret;
+	}
+
+	return irq_ret;
+}
+
+static irqreturn_t nsmmu_context_fault_bank(int irq,
+					    struct arm_smmu_device *smmu,
+					    int idx, int inst)
+{
+	u32 fsr, fsynr, cbfrsynra;
+	unsigned long iova;
+
+	fsr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) + ARM_SMMU_CB_FSR);
+	if (!(fsr & ARM_SMMU_FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
+			      ARM_SMMU_CB_FSYNR0);
+	iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
+			     ARM_SMMU_CB_FAR);
+	cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) +
+				  ARM_SMMU_GR1_CBFRSYNRA(idx));
+
+	dev_err_ratelimited(smmu->dev,
+	"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, cbfrsynra, idx);
+
+	writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) +
+			    ARM_SMMU_CB_FSR);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t nsmmu_context_fault(int irq, void *dev)
+{
+	int inst, idx;
+	irqreturn_t irq_ret = IRQ_NONE;
+	struct iommu_domain *domain = dev;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
+		/* Interrupt line shared between all context faults.
+		 * Check for faults across all contexts.
+		 */
+		for (idx = 0; idx < smmu->num_context_banks; idx++) {
+			irq_ret = nsmmu_context_fault_bank(irq, smmu,
+							   idx, inst);
+
+			if (irq_ret == IRQ_HANDLED)
+				return irq_ret;
+		}
+	}
+
+	return irq_ret;
+}
+
 static const struct arm_smmu_impl nvidia_smmu_impl = {
 	.read_reg = nsmmu_read_reg,
 	.write_reg = nsmmu_write_reg,
@@ -124,6 +222,8 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
 	.write_reg64 = nsmmu_write_reg64,
 	.reset = nsmmu_reset,
 	.tlb_sync = nsmmu_tlb_sync,
+	.global_fault = nsmmu_global_fault,
+	.context_fault = nsmmu_context_fault,
 };
 
 struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e622f4e33379..975faa57b659 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -673,6 +673,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	irqreturn_t (*context_fault)(int irq, void *dev);
 
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
@@ -835,7 +836,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 * handler seeing a half-initialised domain state.
 	 */
 	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
-	ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
+	context_fault = (smmu->impl && smmu->impl->context_fault) ?
+			 smmu->impl->context_fault : arm_smmu_context_fault;
+	ret = devm_request_irq(smmu->dev, irq, context_fault,
 			       IRQF_SHARED, "arm-smmu-context-fault", domain);
 	if (ret < 0) {
 		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
@@ -2095,6 +2098,7 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
 	int num_irqs, i, err;
+	irqreturn_t (*global_fault)(int irq, void *dev);
 
 	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
 	if (!smmu) {
@@ -2181,9 +2185,12 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 		smmu->num_context_irqs = smmu->num_context_banks;
 	}
 
+	global_fault = (smmu->impl && smmu->impl->global_fault) ?
+			smmu->impl->global_fault : arm_smmu_global_fault;
+
 	for (i = 0; i < smmu->num_global_irqs; ++i) {
 		err = devm_request_irq(smmu->dev, smmu->irqs[i],
-				       arm_smmu_global_fault,
+				       global_fault,
 				       IRQF_SHARED,
 				       "arm-smmu global fault",
 				       smmu);
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 67c3c6f5c49e..27d786afc56a 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -18,6 +18,7 @@
 #include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/io-pgtable.h>
 #include <linux/iommu.h>
+#include <linux/irqreturn.h>
 #include <linux/mutex.h>
 #include <linux/spinlock.h>
 #include <linux/types.h>
@@ -386,6 +387,8 @@ struct arm_smmu_impl {
 	int (*init_context)(struct arm_smmu_domain *smmu_domain);
 	void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
 			 int status);
+	irqreturn_t (*global_fault)(int irq, void *dev);
+	irqreturn_t (*context_fault)(int irq, void *dev);
 };
 
 static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)
-- 
2.26.2



* [PATCH v5 4/5] arm64: tegra: Add DT node for T194 SMMU
  2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
                   ` (2 preceding siblings ...)
  2020-05-21 23:31 ` [PATCH v5 3/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
@ 2020-05-21 23:31 ` Krishna Reddy
  2020-05-21 23:31 ` [PATCH v5 5/5] arm64: tegra: enable SMMU for SDHCI and EQOS on T194 Krishna Reddy
  2020-05-22 15:14 ` [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Thierry Reding
  5 siblings, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-21 23:31 UTC (permalink / raw)
  Cc: joro, will, robin.murphy, linux-arm-kernel, iommu, linux-kernel,
	linux-tegra, treding, yhsu, snikam, praithatha, talho, bbiswas,
	mperttunen, nicolinc, bhuntsman, Krishna Reddy

Add DT node for T194 SMMU to enable SMMU support.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 77 ++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index f4ede86e32b4..f7c4399afb55 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1620,6 +1620,83 @@ pcie@141a0000 {
 			  0x82000000 0x0  0x40000000 0x1f 0x40000000 0x0 0xc0000000>; /* non-prefetchable memory (3GB) */
 	};
 
+	smmu: iommu@12000000 {
+		compatible = "nvidia,tegra194-smmu-500", "arm,mmu-500";
+		reg = <0 0x12000000 0 0x800000>,
+		      <0 0x11000000 0 0x800000>,
+		      <0 0x10000000 0 0x800000>;
+		interrupts = <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 232 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 240 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 232 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 240 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+			     <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>;
+		stream-match-mask = <0x7f80>;
+		#global-interrupts = <3>;
+		#iommu-cells = <1>;
+	};
+
 	pcie_ep@14160000 {
 		compatible = "nvidia,tegra194-pcie-ep", "snps,dw-pcie-ep";
 		power-domains = <&bpmp TEGRA194_POWER_DOMAIN_PCIEX4A>;
-- 
2.26.2



* [PATCH v5 5/5] arm64: tegra: enable SMMU for SDHCI and EQOS on T194
  2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
                   ` (3 preceding siblings ...)
  2020-05-21 23:31 ` [PATCH v5 4/5] arm64: tegra: Add DT node for T194 SMMU Krishna Reddy
@ 2020-05-21 23:31 ` Krishna Reddy
  2020-05-22 15:14 ` [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Thierry Reding
  5 siblings, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-21 23:31 UTC (permalink / raw)
  Cc: joro, will, robin.murphy, linux-arm-kernel, iommu, linux-kernel,
	linux-tegra, treding, yhsu, snikam, praithatha, talho, bbiswas,
	mperttunen, nicolinc, bhuntsman, Krishna Reddy

Enable SMMU translations for SDHCI and EQOS transactions on T194.

Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index f7c4399afb55..706bbb439dcd 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -59,6 +59,7 @@ ethernet@2490000 {
 			clock-names = "master_bus", "slave_bus", "rx", "tx", "ptp_ref";
 			resets = <&bpmp TEGRA194_RESET_EQOS>;
 			reset-names = "eqos";
+			iommus = <&smmu TEGRA194_SID_EQOS>;
 			status = "disabled";
 
 			snps,write-requests = <1>;
@@ -457,6 +458,7 @@ sdmmc1: sdhci@3400000 {
 			clock-names = "sdhci";
 			resets = <&bpmp TEGRA194_RESET_SDMMC1>;
 			reset-names = "sdhci";
+			iommus = <&smmu TEGRA194_SID_SDMMC1>;
 			nvidia,pad-autocal-pull-up-offset-3v3-timeout =
 									<0x07>;
 			nvidia,pad-autocal-pull-down-offset-3v3-timeout =
@@ -479,6 +481,7 @@ sdmmc3: sdhci@3440000 {
 			clock-names = "sdhci";
 			resets = <&bpmp TEGRA194_RESET_SDMMC3>;
 			reset-names = "sdhci";
+			iommus = <&smmu TEGRA194_SID_SDMMC3>;
 			nvidia,pad-autocal-pull-up-offset-1v8 = <0x00>;
 			nvidia,pad-autocal-pull-down-offset-1v8 = <0x7a>;
 			nvidia,pad-autocal-pull-up-offset-3v3-timeout = <0x07>;
@@ -506,6 +509,7 @@ sdmmc4: sdhci@3460000 {
 					  <&bpmp TEGRA194_CLK_PLLC4>;
 			resets = <&bpmp TEGRA194_RESET_SDMMC4>;
 			reset-names = "sdhci";
+			iommus = <&smmu TEGRA194_SID_SDMMC4>;
 			nvidia,pad-autocal-pull-up-offset-hs400 = <0x00>;
 			nvidia,pad-autocal-pull-down-offset-hs400 = <0x00>;
 			nvidia,pad-autocal-pull-up-offset-1v8-timeout = <0x0a>;
-- 
2.26.2



* Re: [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation
  2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
                   ` (4 preceding siblings ...)
  2020-05-21 23:31 ` [PATCH v5 5/5] arm64: tegra: enable SMMU for SDHCI and EQOS on T194 Krishna Reddy
@ 2020-05-22 15:14 ` Thierry Reding
  2020-05-22 18:10   ` Krishna Reddy
  5 siblings, 1 reply; 8+ messages in thread
From: Thierry Reding @ 2020-05-22 15:14 UTC (permalink / raw)
  To: Krishna Reddy
  Cc: snikam, mperttunen, bhuntsman, will, joro, linux-kernel,
	praithatha, talho, iommu, nicolinc, linux-tegra, yhsu, treding,
	robin.murphy, linux-arm-kernel, bbiswas


On Thu, May 21, 2020 at 04:31:02PM -0700, Krishna Reddy wrote:
> Changes in v5:
> Rebased on top of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
> 
> v4 - https://lkml.org/lkml/2019/10/30/1054
> v3 - https://lkml.org/lkml/2019/10/18/1601
> v2 - https://lkml.org/lkml/2019/9/2/980
> v1 - https://lkml.org/lkml/2019/8/29/1588
> 
> Krishna Reddy (5):
>   iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage
>   dt-bindings: arm-smmu: Add binding for Tegra194 SMMU
>   iommu/arm-smmu: Add global/context fault implementation hooks

For the record: I don't think we should apply these because we don't
have a good way of testing them. We currently have three problems that
prevent us from enabling SMMU on Tegra194:

  1) If we enable SMMU support, then the DMA API will automatically try
     to use SMMU domains for allocations. This means that translations
     will happen as soon as a device's IOMMU operations are initialized
     and that is typically a long time (in kernel time at least) before
     a driver is bound and has a chance of configuring the device.

     This causes problems for non-quiesced devices like display
     controllers that the bootloader might have set up to scan out a
     boot splash.

     What we're missing here is a way to:

     a) advertise reserved memory regions for boot splash framebuffers
     b) map reserved memory regions early during SMMU setup

     Patches have been floating on the public mailing lists for b) but
     a) requires changes to the bootloader (both proprietary ones and
     U-Boot for SoCs prior to Tegra194).

  2) Even if we don't enable SMMU for a given device (by not hooking up
     the iommus property), with a default kernel configuration we get a
     bunch of faults during boot because the ARM SMMU driver faults by
     default (rather than bypass) for masters which aren't hooked up to
     the SMMU.

     We could work around that by changing the default configuration or
     overriding it on the command-line, but that's not really an option
     because it decreases security and means that Tegra194 won't work
     out-of-the-box.

  3) We don't properly describe the DMA hierarchy, which causes the DMA
     masks to be improperly set. As a bit of background: Tegra194 has a
     special address bit (bit 39) that causes some swizzling to happen
     within the memory controller. As a result, any I/O virtual address
     that has bit 39 set will cause this swizzling to happen on access.
     The DMA/IOMMU allocator always starts allocating from the top of
     the IOVA space, which means that the first couple of gigabytes of
     allocations will cause most devices to fail because of the
     undesired swizzling that occurs.

     We had an initial patch for SDHCI merged that hard-codes the DMA
     mask to DMA_BIT_MASK(39) on Tegra194 to work around that. However,
     the devices all do support addressing 40 bits and the restriction
     on bit 39 is really a property of the bus rather than a capability
     of the device. This means that we would have to work around this
     for every device driver by adding similar hacks. A better option is
     to properly describe the DMA hierarchy (using dma-ranges) because
     that will then automatically be applied as a constraint on each
     device's DMA mask.

     I have been working on patches to address this, but they are fairly
     involved because they require device tree bindings changes and so
     on.
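
To make the arithmetic in 3) concrete, here is a small user-space sketch.
It assumes the kernel's usual DMA_BIT_MASK() definition, reproduced
locally so the example stands alone. SWIZZLE_BIT, iova_has_swizzle_bit
and top_of_iova_space are hypothetical names used only for illustration.
With a 40-bit mask, an allocation at the top of the IOVA space has bit 39
set, while a 39-bit mask can never produce such an address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel macro; redefined so the sketch stands alone. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Bit 39 is the address bit described above as triggering swizzling. */
#define SWIZZLE_BIT 39

/* Return true if this I/O virtual address would trigger the swizzling. */
static inline bool iova_has_swizzle_bit(uint64_t iova)
{
	return (iova >> SWIZZLE_BIT) & 1;
}

/* Highest address reachable under a mask; a top-down allocator starts here. */
static inline uint64_t top_of_iova_space(uint64_t dma_mask)
{
	return dma_mask;
}
```

The two masks differ in exactly one bit, bit 39, which is why constraining
the mask (per driver or through dma-ranges) is enough to keep allocations
out of the swizzled range.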

So before we solve all of the above issues we can't really enable SMMU
on Tegra194 and hence won't be able to test it. As such we don't know if
these patches even work, nor can we validate that they continue to work.

I therefore don't think there's any use in applying these patches
upstream since they will be effectively dead code until all of the above
issues are resolved.

>   arm64: tegra: Add DT node for T194 SMMU
>   arm64: tegra: enable SMMU for SDHCI and EQOS on T194

This one is going to cause EQOS to break because of 3) above. It might
work for SDHCI because of the workaround we currently have in that
driver. However, I do have a local patch that reverts the workaround
and replaces it with the proper fix, which uses dma-ranges as mentioned
above.

That said, I have tested earlier versions of this patchset on top of my
local branch with fixes for the above and they do seem to work as
expected.

So I'll leave it up to the IOMMU maintainers whether they're willing to
merge the driver patches as is. But I want to clarify that I won't be
applying the DTS patches until we've solved all of the above issues and
therefore it should be clear that these won't be runtime tested until
then.

I expect it will take at least until v5.9-rc1 before we have all the
changes merged that would allow us to enable SMMU support.

Thierry

>  .../devicetree/bindings/iommu/arm,smmu.yaml   |   5 +
>  MAINTAINERS                                   |   2 +
>  arch/arm64/boot/dts/nvidia/tegra194.dtsi      |  81 ++++++
>  drivers/iommu/Makefile                        |   2 +-
>  drivers/iommu/arm-smmu-impl.c                 |   3 +
>  drivers/iommu/arm-smmu-nvidia.c               | 261 ++++++++++++++++++
>  drivers/iommu/arm-smmu.c                      |  11 +-
>  drivers/iommu/arm-smmu.h                      |   4 +
>  8 files changed, 366 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c
> 
> 
> base-commit: 365f8d504da50feaebf826d180113529c9383670
> -- 
> 2.26.2
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* RE: [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation
  2020-05-22 15:14 ` [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Thierry Reding
@ 2020-05-22 18:10   ` Krishna Reddy
  0 siblings, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2020-05-22 18:10 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Sachin Nikam, Mikko Perttunen, Bryan Huntsman, will, joro,
	linux-kernel, Pritesh Raithatha, Timo Alho, iommu, Nicolin Chen,
	linux-tegra, Yu-Huan Hsu, Thierry Reding, robin.murphy,
	linux-arm-kernel, Bitan Biswas

>For the record: I don't think we should apply these because we don't have a good way of testing them. We currently have three problems that prevent us from enabling SMMU on Tegra194:

Of the three issues pointed out here, I see only issue 2) as a real blocker for enabling the SMMU HW by default upstream.

>That said, I have tested earlier versions of this patchset on top of my local branch with fixes for the above and they do seem to work as expected.
>So I'll leave it up to the IOMMU maintainers whether they're willing to merge the driver patches as is.
> But I want to clarify that I won't be applying the DTS patches until we've solved all of the above issues and therefore it should be clear that these won't be runtime tested until then.

The SMMU driver patches as such are complete and can be used by NVIDIA with a local config change (CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n) to turn off disable_bypass.
Having them upstream also protects the driver patches against kernel changes. This config setting has already been tested by Nicolin Chen and me.

Robin/Will, can you comment on whether the SMMU driver patches alone (patches 1, 2 and 3 of 5) can be merged without the DT enable patches? Is it reasonable to merge the driver patches alone?

>1) If we enable SMMU support, then the DMA API will automatically try
>     to use SMMU domains for allocations. This means that translations
>     will happen as soon as a device's IOMMU operations are initialized
>     and that is typically a long time (in kernel time at least) before
>     a driver is bound and has a chance of configuring the device.

>     This causes problems for non-quiesced devices like display
>     controllers that the bootloader might have set up to scan out a
>     boot splash.

>     What we're missing here is a way to:

>     a) advertise reserved memory regions for boot splash framebuffers
>     b) map reserved memory regions early during SMMU setup

>     Patches have been floating on the public mailing lists for b) but
>     a) requires changes to the bootloader (both proprietary ones and
>     U-Boot for SoCs prior to Tegra194).

This happens if SMMU translation is enabled for display before the reserved
memory regions issue is fixed. This issue is not a real blocker for enabling SMMU.


>  2) Even if we don't enable SMMU for a given device (by not hooking up
>     the iommus property), with a default kernel configuration we get a
>     bunch of faults during boot because the ARM SMMU driver faults by
>     default (rather than bypass) for masters which aren't hooked up to
>     the SMMU.

>     We could work around that by changing the default configuration or
>     overriding it on the command-line, but that's not really an option
>     because it decreases security and means that Tegra194 won't work
>     out-of-the-box.

This is the real issue that blocks enabling SMMU. The USF faults for devices
that don't have SMMU translations enabled should be fixed or worked around before
SMMU can be enabled. We should look at keeping the SID as 0x7F for the devices
that can't have SMMU enabled yet. SID 0x7F bypasses the SMMU externally.
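
A toy model of the behaviour discussed in 2) may help. It is a sketch under assumptions, not driver code: `struct smmu_model`, `route_transaction()` and the enum names are hypothetical, and the only values taken from this thread are the 0x7F stream ID and the fault-by-default versus bypass behaviour for masters that are not hooked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stream ID that bypasses the SMMU externally, per the message above. */
#define SID_EXTERNAL_BYPASS 0x7f

enum txn_result { TXN_TRANSLATED, TXN_BYPASSED, TXN_USF_FAULT };

/* Hypothetical model: which SIDs have a mapping, and the bypass policy. */
struct smmu_model {
	bool disable_bypass;		/* true: unmatched streams fault */
	const uint16_t *mapped_sids;	/* SIDs hooked up via "iommus" */
	unsigned int num_mapped;
};

/* Decide what happens to a transaction that carries the given SID. */
enum txn_result route_transaction(const struct smmu_model *smmu, uint16_t sid)
{
	unsigned int i;

	/* Bypassed outside the SMMU, so the bypass policy never applies. */
	if (sid == SID_EXTERNAL_BYPASS)
		return TXN_BYPASSED;

	for (i = 0; i < smmu->num_mapped; i++)
		if (smmu->mapped_sids[i] == sid)
			return TXN_TRANSLATED;

	/* Unmatched stream: fault (USF) or bypass, depending on policy. */
	return smmu->disable_bypass ? TXN_USF_FAULT : TXN_BYPASSED;
}

/* Self-check using made-up SIDs 0x01 (mapped) and 0x02 (not mapped). */
void check_routing_model(void)
{
	static const uint16_t mapped[] = { 0x01 };
	struct smmu_model smmu = { true, mapped, 1 };

	assert(route_transaction(&smmu, 0x01) == TXN_TRANSLATED);
	assert(route_transaction(&smmu, 0x02) == TXN_USF_FAULT);
	assert(route_transaction(&smmu, SID_EXTERNAL_BYPASS) == TXN_BYPASSED);
}
```

In this model, a device left at SID 0x7F never produces a USF fault regardless of the disable_bypass setting, which is the property the workaround relies on.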

>  3) We don't properly describe the DMA hierarchy, which causes the DMA
>     masks to be improperly set. As a bit of background: Tegra194 has a
>     special address bit (bit 39) that causes some swizzling to happen
>     within the memory controller. As a result, any I/O virtual address
>     that has bit 39 set will cause this swizzling to happen on access.
>     The DMA/IOMMU allocator always starts allocating from the top of
>     the IOVA space, which means that the first couple of gigabytes of
>     allocations will cause most devices to fail because of the
>     undesired swizzling that occurs.

>     We had an initial patch for SDHCI merged that hard-codes the DMA
>     mask to DMA_BIT_MASK(39) on Tegra194 to work around that. However,
>     the devices all do support addressing 40 bits and the restriction
>     on bit 39 is really a property of the bus rather than a capability
>     of the device. This means that we would have to work around this
>     for every device driver by adding similar hacks. A better option is
>     to properly describe the DMA hierarchy (using dma-ranges) because
>     that will then automatically be applied as a constraint on each
>     device's DMA mask.

>     I have been working on patches to address this, but they are fairly
>     involved because they require device tree bindings changes and so
>     on.

The DMA mask issue is again outside the SMMU driver, and as long as the clients
with the DMA mask issue don't have SMMU enabled, it is fine.
SDHCI can have SMMU enabled upstream as soon as issue 2 is taken care of.

>So before we solve all of the above issues we can't really enable SMMU on Tegra194 and hence won't be able to test it. As such we don't know if these patches even work, nor can we validate that they continue to work.
>As such, I don't think there's any use in applying these patches upstream since they will be effectively dead code until all of the above issues are resolved.
>   arm64: tegra: Add DT node for T194 SMMU
>   arm64: tegra: enable SMMU for SDHCI and EQOS on T194
>This one is going to cause EQOS to break because of 3) above. It might work for SDHCI because of the workaround we currently have in that driver. However, I do have a local patch that reverts the workaround and replaces it with the proper fix, which uses dma-ranges as mentioned above.

The DT patches can't be merged as of now. The enable patches can follow up later after issue 2 is fixed.

>I expect it will take at least until v5.9-rc1 before we have all the changes merged that would allow us to enable SMMU support.

>Thierry

>  .../devicetree/bindings/iommu/arm,smmu.yaml   |   5 +
>  MAINTAINERS                                   |   2 +
>  arch/arm64/boot/dts/nvidia/tegra194.dtsi      |  81 ++++++
>  drivers/iommu/Makefile                        |   2 +-
>  drivers/iommu/arm-smmu-impl.c                 |   3 +
>  drivers/iommu/arm-smmu-nvidia.c               | 261 ++++++++++++++++++
>  drivers/iommu/arm-smmu.c                      |  11 +-
>  drivers/iommu/arm-smmu.h                      |   4 +
>  8 files changed, 366 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c
> 
> 
> base-commit: 365f8d504da50feaebf826d180113529c9383670
> --
> 2.26.2


end of thread, other threads:[~2020-05-22 18:10 UTC | newest]

Thread overview: 8+ messages
2020-05-21 23:31 [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Krishna Reddy
2020-05-21 23:31 ` [PATCH v5 1/5] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage Krishna Reddy
2020-05-21 23:31 ` [PATCH v5 2/5] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU Krishna Reddy
2020-05-21 23:31 ` [PATCH v5 3/5] iommu/arm-smmu: Add global/context fault implementation hooks Krishna Reddy
2020-05-21 23:31 ` [PATCH v5 4/5] arm64: tegra: Add DT node for T194 SMMU Krishna Reddy
2020-05-21 23:31 ` [PATCH v5 5/5] arm64: tegra: enable SMMU for SDHCI and EQOS on T194 Krishna Reddy
2020-05-22 15:14 ` [PATCH v5 0/5] Nvidia Arm SMMUv2 Implementation Thierry Reding
2020-05-22 18:10   ` Krishna Reddy
