All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Finish off the split page table prep work
@ 2020-01-10 15:28 ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

Hi all,

Last merge window, I merged most of the split page table prep work from Robin
[1], but there were a few patches left pending some rework. I think Robin was
hoping to get that done for 5.5, but what with the holidays falling like they
did and other committments, I've ended up picked up the bits that were left
over.

I'm pretty limited with regards to SMMU hardware on which I can test this lot,
so I might have broken something.

Applies against for-joerg/arm-smmu/updates.

Will

[1] https://lore.kernel.org/lkml/20191104202012.GN24909@willie-the-truck

Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jordan Crouse <jcrouse@codeaurora.org>

--->8

Robin Murphy (3):
  iommu/io-pgtable-arm: Rationalise TTBRn handling
  iommu/io-pgtable-arm: Rationalise TCR handling
  iommu/io-pgtable-arm: Prepare for TTBR1 usage

Will Deacon (5):
  iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
  iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer
    Shareable
  iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
  iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
  iommu/io-pgtable-arm: Rationalise VTCR handling

 drivers/iommu/arm-smmu-impl.c      |   2 +-
 drivers/iommu/arm-smmu-v3.c        |  60 ++++----
 drivers/iommu/arm-smmu.c           | 171 ++++++++++++----------
 drivers/iommu/arm-smmu.h           | 228 ++++++++++++++++++-----------
 drivers/iommu/io-pgtable-arm-v7s.c |  23 ++-
 drivers/iommu/io-pgtable-arm.c     | 155 +++++++++-----------
 drivers/iommu/io-pgtable.c         |   2 +-
 drivers/iommu/ipmmu-vmsa.c         |   2 +-
 drivers/iommu/msm_iommu.c          |   4 +-
 drivers/iommu/mtk_iommu.c          |   4 +-
 drivers/iommu/qcom_iommu.c         |  25 ++--
 include/linux/io-pgtable.h         |  27 +++-
 12 files changed, 381 insertions(+), 322 deletions(-)

-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/8] Finish off the split page table prep work
@ 2020-01-10 15:28 ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

Hi all,

Last merge window, I merged most of the split page table prep work from Robin
[1], but there were a few patches left pending some rework. I think Robin was
hoping to get that done for 5.5, but what with the holidays falling like they
did and other committments, I've ended up picked up the bits that were left
over.

I'm pretty limited with regards to SMMU hardware on which I can test this lot,
so I might have broken something.

Applies against for-joerg/arm-smmu/updates.

Will

[1] https://lore.kernel.org/lkml/20191104202012.GN24909@willie-the-truck

Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jordan Crouse <jcrouse@codeaurora.org>

--->8

Robin Murphy (3):
  iommu/io-pgtable-arm: Rationalise TTBRn handling
  iommu/io-pgtable-arm: Rationalise TCR handling
  iommu/io-pgtable-arm: Prepare for TTBR1 usage

Will Deacon (5):
  iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
  iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer
    Shareable
  iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
  iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
  iommu/io-pgtable-arm: Rationalise VTCR handling

 drivers/iommu/arm-smmu-impl.c      |   2 +-
 drivers/iommu/arm-smmu-v3.c        |  60 ++++----
 drivers/iommu/arm-smmu.c           | 171 ++++++++++++----------
 drivers/iommu/arm-smmu.h           | 228 ++++++++++++++++++-----------
 drivers/iommu/io-pgtable-arm-v7s.c |  23 ++-
 drivers/iommu/io-pgtable-arm.c     | 155 +++++++++-----------
 drivers/iommu/io-pgtable.c         |   2 +-
 drivers/iommu/ipmmu-vmsa.c         |   2 +-
 drivers/iommu/msm_iommu.c          |   4 +-
 drivers/iommu/mtk_iommu.c          |   4 +-
 drivers/iommu/qcom_iommu.c         |  25 ++--
 include/linux/io-pgtable.h         |  27 +++-
 12 files changed, 381 insertions(+), 322 deletions(-)

-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/8] iommu/io-pgtable-arm: Rationalise TTBRn handling
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

From: Robin Murphy <robin.murphy@arm.com>

TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly desirable
or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users to
support split address spaces by simply allocating a table and dealing
with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Keep v7s non-coherent mappings explicitly outer-shareable]
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-v3.c        |  2 +-
 drivers/iommu/arm-smmu.c           |  9 ++++-----
 drivers/iommu/io-pgtable-arm-v7s.c | 17 ++++++++---------
 drivers/iommu/io-pgtable-arm.c     |  5 ++---
 drivers/iommu/ipmmu-vmsa.c         |  2 +-
 drivers/iommu/msm_iommu.c          |  4 ++--
 drivers/iommu/mtk_iommu.c          |  4 ++--
 drivers/iommu/qcom_iommu.c         |  3 +--
 include/linux/io-pgtable.h         |  4 ++--
 9 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index aa7e53023585..cf2ae065a6c2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2166,7 +2166,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	}
 
 	cfg->cd.asid	= (u16)asid;
-	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 	cfg->cd.tcr	= pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 	cfg->cd.mair	= pgtbl_cfg->arm_lpae_s1_cfg.mair;
 	return 0;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 46b87740d708..72640e045268 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -553,13 +553,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 	/* TTBRs */
 	if (stage1) {
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
-			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
-			cb->ttbr[1] = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
+			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
+			cb->ttbr[1] = 0;
 		} else {
-			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 			cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
-			cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
-			cb->ttbr[1] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
+			cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 7c3bd2c3cdca..714cb406b0a8 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -822,15 +822,14 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
-	/* TTBRs */
-	cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) |
-				   ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
-				   (cfg->coherent_walk ?
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
-				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
-				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
-	cfg->arm_v7s_cfg.ttbr[1] = 0;
+	/* TTBR */
+	cfg->arm_v7s_cfg.ttbr = virt_to_phys(data->pgd) |
+				ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
+				(cfg->coherent_walk ?
+				 (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
+				  ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
+				 (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
+				  ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
 	return &data->iop;
 
 out_free_data:
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index bdf47f745268..7b422b9fe05b 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -872,9 +872,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
-	/* TTBRs */
-	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
-	cfg->arm_lpae_s1_cfg.ttbr[1] = 0;
+	/* TTBR */
+	cfg->arm_lpae_s1_cfg.ttbr = virt_to_phys(data->pgd);
 	return &data->iop;
 
 out_free_data:
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index d02edd2751f3..ecb3f9464dd5 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -374,7 +374,7 @@ static void ipmmu_domain_setup_context(struct ipmmu_vmsa_domain *domain)
 	u32 tmp;
 
 	/* TTBR0 */
-	ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr[0];
+	ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr;
 	ipmmu_ctx_write_root(domain, IMTTLBR0, ttbr);
 	ipmmu_ctx_write_root(domain, IMTTUBR0, ttbr >> 32);
 
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 93f14bca26ee..94a6df1bddd6 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -279,8 +279,8 @@ static void __program_context(void __iomem *base, int ctx,
 	SET_V2PCFG(base, ctx, 0x3);
 
 	SET_TTBCR(base, ctx, priv->cfg.arm_v7s_cfg.tcr);
-	SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[0]);
-	SET_TTBR1(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[1]);
+	SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr);
+	SET_TTBR1(base, ctx, 0);
 
 	/* Set prrr and nmrr */
 	SET_PRRR(base, ctx, priv->cfg.arm_v7s_cfg.prrr);
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fc1f5ecf91e..95945f467c03 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -367,7 +367,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
 	/* Update the pgtable base address register of the M4U HW */
 	if (!data->m4u_dom) {
 		data->m4u_dom = dom;
-		writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
+		writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
 		       data->base + REG_MMU_PT_BASE_ADDR);
 	}
 
@@ -765,7 +765,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 	writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
 	writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
 	if (m4u_dom)
-		writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
+		writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
 		       base + REG_MMU_PT_BASE_ADDR);
 	return 0;
 }
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 52f38292df5b..c200bc066257 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -269,10 +269,9 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 
 		/* TTBRs */
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
-				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
 				FIELD_PREP(TTBRn_ASID, ctx->asid));
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
-				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
 				FIELD_PREP(TTBRn_ASID, ctx->asid));
 
 		/* TCR */
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ee21eedafe98..53bca5343f52 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -100,7 +100,7 @@ struct io_pgtable_cfg {
 	/* Low-level data specific to the table format */
 	union {
 		struct {
-			u64	ttbr[2];
+			u64	ttbr;
 			u64	tcr;
 			u64	mair;
 		} arm_lpae_s1_cfg;
@@ -111,7 +111,7 @@ struct io_pgtable_cfg {
 		} arm_lpae_s2_cfg;
 
 		struct {
-			u32	ttbr[2];
+			u32	ttbr;
 			u32	tcr;
 			u32	nmrr;
 			u32	prrr;
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/8] iommu/io-pgtable-arm: Rationalise TTBRn handling
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

From: Robin Murphy <robin.murphy@arm.com>

TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly desirable
or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users to
support split address spaces by simply allocating a table and dealing
with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Keep v7s non-coherent mappings explicitly outer-shareable]
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-v3.c        |  2 +-
 drivers/iommu/arm-smmu.c           |  9 ++++-----
 drivers/iommu/io-pgtable-arm-v7s.c | 17 ++++++++---------
 drivers/iommu/io-pgtable-arm.c     |  5 ++---
 drivers/iommu/ipmmu-vmsa.c         |  2 +-
 drivers/iommu/msm_iommu.c          |  4 ++--
 drivers/iommu/mtk_iommu.c          |  4 ++--
 drivers/iommu/qcom_iommu.c         |  3 +--
 include/linux/io-pgtable.h         |  4 ++--
 9 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index aa7e53023585..cf2ae065a6c2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2166,7 +2166,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	}
 
 	cfg->cd.asid	= (u16)asid;
-	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 	cfg->cd.tcr	= pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 	cfg->cd.mair	= pgtbl_cfg->arm_lpae_s1_cfg.mair;
 	return 0;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 46b87740d708..72640e045268 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -553,13 +553,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 	/* TTBRs */
 	if (stage1) {
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
-			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
-			cb->ttbr[1] = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
+			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
+			cb->ttbr[1] = 0;
 		} else {
-			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 			cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
-			cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
-			cb->ttbr[1] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
+			cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 7c3bd2c3cdca..714cb406b0a8 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -822,15 +822,14 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
-	/* TTBRs */
-	cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) |
-				   ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
-				   (cfg->coherent_walk ?
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
-				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
-				   (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
-				    ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
-	cfg->arm_v7s_cfg.ttbr[1] = 0;
+	/* TTBR */
+	cfg->arm_v7s_cfg.ttbr = virt_to_phys(data->pgd) |
+				ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
+				(cfg->coherent_walk ?
+				 (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
+				  ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
+				 (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
+				  ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
 	return &data->iop;
 
 out_free_data:
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index bdf47f745268..7b422b9fe05b 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -872,9 +872,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
-	/* TTBRs */
-	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
-	cfg->arm_lpae_s1_cfg.ttbr[1] = 0;
+	/* TTBR */
+	cfg->arm_lpae_s1_cfg.ttbr = virt_to_phys(data->pgd);
 	return &data->iop;
 
 out_free_data:
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index d02edd2751f3..ecb3f9464dd5 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -374,7 +374,7 @@ static void ipmmu_domain_setup_context(struct ipmmu_vmsa_domain *domain)
 	u32 tmp;
 
 	/* TTBR0 */
-	ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr[0];
+	ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr;
 	ipmmu_ctx_write_root(domain, IMTTLBR0, ttbr);
 	ipmmu_ctx_write_root(domain, IMTTUBR0, ttbr >> 32);
 
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 93f14bca26ee..94a6df1bddd6 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -279,8 +279,8 @@ static void __program_context(void __iomem *base, int ctx,
 	SET_V2PCFG(base, ctx, 0x3);
 
 	SET_TTBCR(base, ctx, priv->cfg.arm_v7s_cfg.tcr);
-	SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[0]);
-	SET_TTBR1(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[1]);
+	SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr);
+	SET_TTBR1(base, ctx, 0);
 
 	/* Set prrr and nmrr */
 	SET_PRRR(base, ctx, priv->cfg.arm_v7s_cfg.prrr);
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fc1f5ecf91e..95945f467c03 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -367,7 +367,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
 	/* Update the pgtable base address register of the M4U HW */
 	if (!data->m4u_dom) {
 		data->m4u_dom = dom;
-		writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
+		writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
 		       data->base + REG_MMU_PT_BASE_ADDR);
 	}
 
@@ -765,7 +765,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 	writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
 	writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
 	if (m4u_dom)
-		writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
+		writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
 		       base + REG_MMU_PT_BASE_ADDR);
 	return 0;
 }
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 52f38292df5b..c200bc066257 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -269,10 +269,9 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 
 		/* TTBRs */
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
-				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
 				FIELD_PREP(TTBRn_ASID, ctx->asid));
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
-				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
 				FIELD_PREP(TTBRn_ASID, ctx->asid));
 
 		/* TCR */
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index ee21eedafe98..53bca5343f52 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -100,7 +100,7 @@ struct io_pgtable_cfg {
 	/* Low-level data specific to the table format */
 	union {
 		struct {
-			u64	ttbr[2];
+			u64	ttbr;
 			u64	tcr;
 			u64	mair;
 		} arm_lpae_s1_cfg;
@@ -111,7 +111,7 @@ struct io_pgtable_cfg {
 		} arm_lpae_s2_cfg;
 
 		struct {
-			u32	ttbr[2];
+			u32	ttbr;
 			u32	tcr;
 			u32	nmrr;
 			u32	prrr;
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/8] iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: kernel-team, Robin Murphy, Bjorn Andersson, Will Deacon,
	linux-arm-kernel

Commit 9e6ea59f3ff3 ("iommu/io-pgtable: Support non-coherent page tables")
added support for non-coherent page-table walks to the Arm IOMMU page-table
backends. Unfortunately, it left the stage-2 allocator unchanged, so let's
hook that up in the same way.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 7b422b9fe05b..ab440b52a5f4 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -910,10 +910,16 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_S2_TCR_RES1 |
-	     (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-	     (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-	     (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+	reg = ARM_64_LPAE_S2_TCR_RES1;
+	if (cfg->coherent_walk) {
+		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+	} else {
+		reg |= (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+	}
 
 	sl = data->start_level;
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/8] iommu/io-pgtable-arm: Support non-coherent stage-2 page tables
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: kernel-team, Robin Murphy, Bjorn Andersson, Jordan Crouse,
	Will Deacon, linux-arm-kernel

Commit 9e6ea59f3ff3 ("iommu/io-pgtable: Support non-coherent page tables")
added support for non-coherent page-table walks to the Arm IOMMU page-table
backends. Unfortunately, it left the stage-2 allocator unchanged, so let's
hook that up in the same way.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 7b422b9fe05b..ab440b52a5f4 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -910,10 +910,16 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_S2_TCR_RES1 |
-	     (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-	     (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-	     (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+	reg = ARM_64_LPAE_S2_TCR_RES1;
+	if (cfg->coherent_walk) {
+		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+	} else {
+		reg |= (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
+		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+	}
 
 	sl = data->start_level;
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 3/8] iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer Shareable
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

The Armv7 ARM states that Normal, Non-cacheable mappings must explicitly
be marked as Outer Shareable in order to avoid UNPREDICTABLE behaviour:

  | Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c)
  |
  | A memory region with a resultant memory type attribute of Normal, and
  | a resultant cacheability attribute of Inner Non-cacheable, Outer
  | Non-cacheable, must have a resultant shareability attribute of Outer
  | Shareable, otherwise shareability is UNPREDICTABLE

Although this requirement doesn't appear to exist in the Armv8 docs,
where the 'SH' field is simply ignored in this case, it's
straightforward enough to set it here.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index ab440b52a5f4..4b437ead2300 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -303,9 +303,8 @@ static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 
 	if (data->iop.fmt != ARM_MALI_LPAE)
 		pte |= ARM_LPAE_PTE_AF;
-	pte |= ARM_LPAE_PTE_SH_IS;
-	pte |= paddr_to_iopte(paddr, data);
 
+	pte |= paddr_to_iopte(paddr, data);
 	__arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
 }
 
@@ -463,6 +462,11 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 	if (prot & IOMMU_NOEXEC)
 		pte |= ARM_LPAE_PTE_XN;
 
+	if (prot & IOMMU_CACHE)
+		pte |= ARM_LPAE_PTE_SH_IS;
+	else
+		pte |= ARM_LPAE_PTE_SH_OS;
+
 	return pte;
 }
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 3/8] iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer Shareable
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

The Armv7 ARM states that Normal, Non-cacheable mappings must explicitly
be marked as Outer Shareable in order to avoid UNPREDICTABLE behaviour:

  | Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c)
  |
  | A memory region with a resultant memory type attribute of Normal, and
  | a resultant cacheability attribute of Inner Non-cacheable, Outer
  | Non-cacheable, must have a resultant shareability attribute of Outer
  | Shareable, otherwise shareability is UNPREDICTABLE

Although this requirement doesn't appear to exist in the Armv8 docs,
where the 'SH' field is simply ignored in this case, it's
straightforward enough to set it here.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index ab440b52a5f4..4b437ead2300 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -303,9 +303,8 @@ static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 
 	if (data->iop.fmt != ARM_MALI_LPAE)
 		pte |= ARM_LPAE_PTE_AF;
-	pte |= ARM_LPAE_PTE_SH_IS;
-	pte |= paddr_to_iopte(paddr, data);
 
+	pte |= paddr_to_iopte(paddr, data);
 	__arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
 }
 
@@ -463,6 +462,11 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 	if (prot & IOMMU_NOEXEC)
 		pte |= ARM_LPAE_PTE_XN;
 
+	if (prot & IOMMU_CACHE)
+		pte |= ARM_LPAE_PTE_SH_IS;
+	else
+		pte |= ARM_LPAE_PTE_SH_OS;
+
 	return pte;
 }
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 4/8] iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

ARM_64_LPAE_S2_TCR_RES1 is intended to map to bit 31 of the VTCR register,
which is required to be set to 1 by the architecture. Unfortunately, we
accidentally treat this as a signed quantity which means we also set the
upper 32 bits of the VTCR to one, and they are required to be zero.

Treat ARM_64_LPAE_S2_TCR_RES1 as unsigned to avoid the unwanted
sign-extension up to 64 bits.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4b437ead2300..89216bf37282 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -101,7 +101,7 @@
 
 /* Register bits */
 #define ARM_32_LPAE_TCR_EAE		(1 << 31)
-#define ARM_64_LPAE_S2_TCR_RES1		(1 << 31)
+#define ARM_64_LPAE_S2_TCR_RES1		(1U << 31)
 
 #define ARM_LPAE_TCR_EPD1		(1 << 23)
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 4/8] iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

ARM_64_LPAE_S2_TCR_RES1 is intended to map to bit 31 of the VTCR register,
which is required to be set to 1 by the architecture. Unfortunately, we
accidentally treat this as a signed quantity which means we also set the
upper 32 bits of the VTCR to one, and they are required to be zero.

Treat ARM_64_LPAE_S2_TCR_RES1 as unsigned to avoid the unwanted
sign-extension up to 64 bits.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4b437ead2300..89216bf37282 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -101,7 +101,7 @@
 
 /* Register bits */
 #define ARM_32_LPAE_TCR_EAE		(1 << 31)
-#define ARM_64_LPAE_S2_TCR_RES1		(1 << 31)
+#define ARM_64_LPAE_S2_TCR_RES1		(1U << 31)
 
 #define ARM_LPAE_TCR_EPD1		(1 << 23)
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 5/8] iommu/io-pgtable-arm: Rationalise TCR handling
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

From: Robin Murphy <robin.murphy@arm.com>

Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-v3.c        | 41 +++----------
 drivers/iommu/arm-smmu.c           |  7 ++-
 drivers/iommu/arm-smmu.h           | 28 +++++++++
 drivers/iommu/io-pgtable-arm-v7s.c |  6 +-
 drivers/iommu/io-pgtable-arm.c     | 98 ++++++++++++------------------
 drivers/iommu/io-pgtable.c         |  2 +-
 drivers/iommu/qcom_iommu.c         |  8 +--
 include/linux/io-pgtable.h         |  9 ++-
 8 files changed, 95 insertions(+), 104 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index cf2ae065a6c2..d127974afdb7 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -260,27 +260,18 @@
 /* Context descriptor (stage-1 only) */
 #define CTXDESC_CD_DWORDS		8
 #define CTXDESC_CD_0_TCR_T0SZ		GENMASK_ULL(5, 0)
-#define ARM64_TCR_T0SZ			GENMASK_ULL(5, 0)
 #define CTXDESC_CD_0_TCR_TG0		GENMASK_ULL(7, 6)
-#define ARM64_TCR_TG0			GENMASK_ULL(15, 14)
 #define CTXDESC_CD_0_TCR_IRGN0		GENMASK_ULL(9, 8)
-#define ARM64_TCR_IRGN0			GENMASK_ULL(9, 8)
 #define CTXDESC_CD_0_TCR_ORGN0		GENMASK_ULL(11, 10)
-#define ARM64_TCR_ORGN0			GENMASK_ULL(11, 10)
 #define CTXDESC_CD_0_TCR_SH0		GENMASK_ULL(13, 12)
-#define ARM64_TCR_SH0			GENMASK_ULL(13, 12)
 #define CTXDESC_CD_0_TCR_EPD0		(1ULL << 14)
-#define ARM64_TCR_EPD0			(1ULL << 7)
 #define CTXDESC_CD_0_TCR_EPD1		(1ULL << 30)
-#define ARM64_TCR_EPD1			(1ULL << 23)
 
 #define CTXDESC_CD_0_ENDI		(1UL << 15)
 #define CTXDESC_CD_0_V			(1UL << 31)
 
 #define CTXDESC_CD_0_TCR_IPS		GENMASK_ULL(34, 32)
-#define ARM64_TCR_IPS			GENMASK_ULL(34, 32)
 #define CTXDESC_CD_0_TCR_TBI0		(1ULL << 38)
-#define ARM64_TCR_TBI0			(1ULL << 37)
 
 #define CTXDESC_CD_0_AA64		(1UL << 41)
 #define CTXDESC_CD_0_S			(1UL << 44)
@@ -291,10 +282,6 @@
 
 #define CTXDESC_CD_1_TTB0_MASK		GENMASK_ULL(51, 4)
 
-/* Convert between AArch64 (CPU) TCR format and SMMU CD format */
-#define ARM_SMMU_TCR2CD(tcr, fld)	FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \
-					FIELD_GET(ARM64_TCR_##fld, tcr))
-
 /* Command queue */
 #define CMDQ_ENT_SZ_SHIFT		4
 #define CMDQ_ENT_DWORDS			((1 << CMDQ_ENT_SZ_SHIFT) >> 3)
@@ -1439,23 +1426,6 @@ static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
 }
 
 /* Context descriptor manipulation functions */
-static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
-{
-	u64 val = 0;
-
-	/* Repack the TCR. Just care about TTBR0 for now */
-	val |= ARM_SMMU_TCR2CD(tcr, T0SZ);
-	val |= ARM_SMMU_TCR2CD(tcr, TG0);
-	val |= ARM_SMMU_TCR2CD(tcr, IRGN0);
-	val |= ARM_SMMU_TCR2CD(tcr, ORGN0);
-	val |= ARM_SMMU_TCR2CD(tcr, SH0);
-	val |= ARM_SMMU_TCR2CD(tcr, EPD0);
-	val |= ARM_SMMU_TCR2CD(tcr, EPD1);
-	val |= ARM_SMMU_TCR2CD(tcr, IPS);
-
-	return val;
-}
-
 static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
 				    struct arm_smmu_s1_cfg *cfg)
 {
@@ -1465,7 +1435,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
 	 * We don't need to issue any invalidation here, as we'll invalidate
 	 * the STE when installing the new entry anyway.
 	 */
-	val = arm_smmu_cpu_tcr_to_cd(cfg->cd.tcr) |
+	val = cfg->cd.tcr |
 #ifdef __BIG_ENDIAN
 	      CTXDESC_CD_0_ENDI |
 #endif
@@ -2151,6 +2121,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	int asid;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
+	typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 
 	asid = arm_smmu_bitmap_alloc(smmu->asid_map, smmu->asid_bits);
 	if (asid < 0)
@@ -2167,7 +2138,13 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 
 	cfg->cd.asid	= (u16)asid;
 	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-	cfg->cd.tcr	= pgtbl_cfg->arm_lpae_s1_cfg.tcr;
+	cfg->cd.tcr	= FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) |
+			  CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
 	cfg->cd.mair	= pgtbl_cfg->arm_lpae_s1_cfg.mair;
 	return 0;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 72640e045268..e65eb60812ec 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -540,11 +540,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
 			cb->tcr[0] = pgtbl_cfg->arm_v7s_cfg.tcr;
 		} else {
-			cb->tcr[0] = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
-			cb->tcr[1] = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
-			cb->tcr[1] |= FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM);
+			cb->tcr[0] = arm_smmu_lpae_tcr(pgtbl_cfg);
+			cb->tcr[1] = arm_smmu_lpae_tcr2(pgtbl_cfg);
 			if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
 				cb->tcr[1] |= TCR2_AS;
+			else
+				cb->tcr[0] |= TCR_EAE;
 		}
 	} else {
 		cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 62b9f0cec49b..aade2b0ae175 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -11,6 +11,7 @@
 #define _ARM_SMMU_H
 
 #include <linux/atomic.h>
+#include <linux/bitfield.h>
 #include <linux/bits.h>
 #include <linux/clk.h>
 #include <linux/device.h>
@@ -158,12 +159,24 @@ enum arm_smmu_cbar_type {
 #define TCR2_SEP			GENMASK(17, 15)
 #define TCR2_SEP_UPSTREAM		0x7
 #define TCR2_AS				BIT(4)
+#define TCR2_PASIZE			GENMASK(3, 0)
 
 #define ARM_SMMU_CB_TTBR0		0x20
 #define ARM_SMMU_CB_TTBR1		0x28
 #define TTBRn_ASID			GENMASK_ULL(63, 48)
 
+/* arm64 headers leak this somehow :( */
+#undef TCR_T0SZ
+
 #define ARM_SMMU_CB_TCR			0x30
+#define TCR_EAE				BIT(31)
+#define TCR_EPD1			BIT(23)
+#define TCR_TG0				GENMASK(15, 14)
+#define TCR_SH0				GENMASK(13, 12)
+#define TCR_ORGN0			GENMASK(11, 10)
+#define TCR_IRGN0			GENMASK(9, 8)
+#define TCR_T0SZ			GENMASK(5, 0)
+
 #define ARM_SMMU_CB_CONTEXTIDR		0x34
 #define ARM_SMMU_CB_S1_MAIR0		0x38
 #define ARM_SMMU_CB_S1_MAIR1		0x3c
@@ -318,6 +331,21 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 };
 
+static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
+{
+	return TCR_EPD1 |
+	       FIELD_PREP(TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+	       FIELD_PREP(TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+	       FIELD_PREP(TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+	       FIELD_PREP(TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+	       FIELD_PREP(TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+}
+
+static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
+{
+	return FIELD_PREP(TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
+	       FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM);
+}
 
 /* Implementation details, yay! */
 struct arm_smmu_impl {
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 714cb406b0a8..ccedf50e5998 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -149,8 +149,6 @@
 #define ARM_V7S_TTBR_IRGN_ATTR(attr)					\
 	((((attr) & 0x1) << 6) | (((attr) & 0x2) >> 1))
 
-#define ARM_V7S_TCR_PD1			BIT(5)
-
 #ifdef CONFIG_ZONE_DMA32
 #define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
 #define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
@@ -798,8 +796,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	 */
 	cfg->pgsize_bitmap &= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
 
-	/* TCR: T0SZ=0, disable TTBR1 */
-	cfg->arm_v7s_cfg.tcr = ARM_V7S_TCR_PD1;
+	/* TCR: T0SZ=0, EAE=0 (if applicable) */
+	cfg->arm_v7s_cfg.tcr = 0;
 
 	/*
 	 * TEX remap: the indices used map to the closest equivalent types
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 89216bf37282..78ec0ee5c59c 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -100,40 +100,32 @@
 #define ARM_LPAE_PTE_MEMATTR_DEV	(((arm_lpae_iopte)0x1) << 2)
 
 /* Register bits */
-#define ARM_32_LPAE_TCR_EAE		(1 << 31)
-#define ARM_64_LPAE_S2_TCR_RES1		(1U << 31)
+#define ARM_64_LPAE_VTCR_RES1		(1U << 31)
 
-#define ARM_LPAE_TCR_EPD1		(1 << 23)
-
-#define ARM_LPAE_TCR_TG0_4K		(0 << 14)
-#define ARM_LPAE_TCR_TG0_64K		(1 << 14)
-#define ARM_LPAE_TCR_TG0_16K		(2 << 14)
+#define ARM_LPAE_VTCR_TG0_SHIFT		14
+#define ARM_LPAE_TCR_TG0_4K		0
+#define ARM_LPAE_TCR_TG0_64K		1
+#define ARM_LPAE_TCR_TG0_16K		2
 
 #define ARM_LPAE_TCR_SH0_SHIFT		12
-#define ARM_LPAE_TCR_SH0_MASK		0x3
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
 
 #define ARM_LPAE_TCR_ORGN0_SHIFT	10
 #define ARM_LPAE_TCR_IRGN0_SHIFT	8
-#define ARM_LPAE_TCR_RGN_MASK		0x3
 #define ARM_LPAE_TCR_RGN_NC		0
 #define ARM_LPAE_TCR_RGN_WBWA		1
 #define ARM_LPAE_TCR_RGN_WT		2
 #define ARM_LPAE_TCR_RGN_WB		3
 
-#define ARM_LPAE_TCR_SL0_SHIFT		6
-#define ARM_LPAE_TCR_SL0_MASK		0x3
+#define ARM_LPAE_VTCR_SL0_SHIFT		6
+#define ARM_LPAE_VTCR_SL0_MASK		0x3
 
 #define ARM_LPAE_TCR_T0SZ_SHIFT		0
-#define ARM_LPAE_TCR_SZ_MASK		0xf
-
-#define ARM_LPAE_TCR_PS_SHIFT		16
-#define ARM_LPAE_TCR_PS_MASK		0x7
 
-#define ARM_LPAE_TCR_IPS_SHIFT		32
-#define ARM_LPAE_TCR_IPS_MASK		0x7
+#define ARM_LPAE_VTCR_PS_SHIFT		16
+#define ARM_LPAE_VTCR_PS_MASK		0x7
 
 #define ARM_LPAE_TCR_PS_32_BIT		0x0ULL
 #define ARM_LPAE_TCR_PS_36_BIT		0x1ULL
@@ -791,6 +783,7 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 {
 	u64 reg;
 	struct arm_lpae_io_pgtable *data;
+	typeof(&cfg->arm_lpae_s1_cfg.tcr) tcr = &cfg->arm_lpae_s1_cfg.tcr;
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
 			    IO_PGTABLE_QUIRK_NON_STRICT))
@@ -802,58 +795,54 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 
 	/* TCR */
 	if (cfg->coherent_walk) {
-		reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+		tcr->sh = ARM_LPAE_TCR_SH_IS;
+		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
+		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
 	} else {
-		reg = (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+		tcr->sh = ARM_LPAE_TCR_SH_OS;
+		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
+		tcr->orgn = ARM_LPAE_TCR_RGN_NC;
 	}
 
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		reg |= ARM_LPAE_TCR_TG0_4K;
+		tcr->tg = ARM_LPAE_TCR_TG0_4K;
 		break;
 	case SZ_16K:
-		reg |= ARM_LPAE_TCR_TG0_16K;
+		tcr->tg = ARM_LPAE_TCR_TG0_16K;
 		break;
 	case SZ_64K:
-		reg |= ARM_LPAE_TCR_TG0_64K;
+		tcr->tg = ARM_LPAE_TCR_TG0_64K;
 		break;
 	}
 
 	switch (cfg->oas) {
 	case 32:
-		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_32_BIT;
 		break;
 	case 36:
-		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_36_BIT;
 		break;
 	case 40:
-		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_40_BIT;
 		break;
 	case 42:
-		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_42_BIT;
 		break;
 	case 44:
-		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_44_BIT;
 		break;
 	case 48:
-		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_48_BIT;
 		break;
 	case 52:
-		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_52_BIT;
 		break;
 	default:
 		goto out_free_data;
 	}
 
-	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
-
-	/* Disable speculative walks through TTBR1 */
-	reg |= ARM_LPAE_TCR_EPD1;
-	cfg->arm_lpae_s1_cfg.tcr = reg;
+	tcr->tsz = 64ULL - cfg->ias;
 
 	/* MAIRs */
 	reg = (ARM_LPAE_MAIR_ATTR_NC
@@ -914,7 +903,7 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_S2_TCR_RES1;
+	reg = ARM_64_LPAE_VTCR_RES1;
 	if (cfg->coherent_walk) {
 		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
 		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
@@ -929,45 +918,45 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		reg |= ARM_LPAE_TCR_TG0_4K;
+		reg |= (ARM_LPAE_TCR_TG0_4K << ARM_LPAE_VTCR_TG0_SHIFT);
 		sl++; /* SL0 format is different for 4K granule size */
 		break;
 	case SZ_16K:
-		reg |= ARM_LPAE_TCR_TG0_16K;
+		reg |= (ARM_LPAE_TCR_TG0_16K << ARM_LPAE_VTCR_TG0_SHIFT);
 		break;
 	case SZ_64K:
-		reg |= ARM_LPAE_TCR_TG0_64K;
+		reg |= (ARM_LPAE_TCR_TG0_64K << ARM_LPAE_VTCR_TG0_SHIFT);
 		break;
 	}
 
 	switch (cfg->oas) {
 	case 32:
-		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 36:
-		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 40:
-		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 42:
-		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 44:
-		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 48:
-		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 52:
-		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	default:
 		goto out_free_data;
 	}
 
 	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
-	reg |= (~sl & ARM_LPAE_TCR_SL0_MASK) << ARM_LPAE_TCR_SL0_SHIFT;
+	reg |= (~sl & ARM_LPAE_VTCR_SL0_MASK) << ARM_LPAE_VTCR_SL0_SHIFT;
 	cfg->arm_lpae_s2_cfg.vtcr = reg;
 
 	/* Allocate pgd pages */
@@ -991,19 +980,12 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 static struct io_pgtable *
 arm_32_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 {
-	struct io_pgtable *iop;
-
 	if (cfg->ias > 32 || cfg->oas > 40)
 		return NULL;
 
 	cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
-	iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
-	if (iop) {
-		cfg->arm_lpae_s1_cfg.tcr |= ARM_32_LPAE_TCR_EAE;
-		cfg->arm_lpae_s1_cfg.tcr &= 0xffffffff;
-	}
 
-	return iop;
+	return arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
 }
 
 static struct io_pgtable *
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index ced53e5b72b5..94394c81468f 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -63,7 +63,7 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops)
 	if (!ops)
 		return;
 
-	iop = container_of(ops, struct io_pgtable, ops);
+	iop = io_pgtable_ops_to_pgtable(ops);
 	io_pgtable_tlb_flush_all(iop);
 	io_pgtable_init_table[iop->fmt]->free(iop);
 }
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index c200bc066257..2eeaf2eec946 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -271,15 +271,13 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
 				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
 				FIELD_PREP(TTBRn_ASID, ctx->asid));
-		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
-				FIELD_PREP(TTBRn_ASID, ctx->asid));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1, 0);
 
 		/* TCR */
 		iommu_writel(ctx, ARM_SMMU_CB_TCR2,
-				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
-				FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM));
+				arm_smmu_lpae_tcr2(&pgtbl_cfg));
 		iommu_writel(ctx, ARM_SMMU_CB_TCR,
-				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+				arm_smmu_lpae_tcr(&pgtbl_cfg) | TCR_EAE);
 
 		/* MAIRs (stage-1 only) */
 		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 53bca5343f52..6ae104cedfd7 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -101,7 +101,14 @@ struct io_pgtable_cfg {
 	union {
 		struct {
 			u64	ttbr;
-			u64	tcr;
+			struct {
+				u32	ips:3;
+				u32	tg:2;
+				u32	sh:2;
+				u32	orgn:2;
+				u32	irgn:2;
+				u32	tsz:6;
+			}	tcr;
 			u64	mair;
 		} arm_lpae_s1_cfg;
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 5/8] iommu/io-pgtable-arm: Rationalise TCR handling
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

From: Robin Murphy <robin.murphy@arm.com>

Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-v3.c        | 41 +++----------
 drivers/iommu/arm-smmu.c           |  7 ++-
 drivers/iommu/arm-smmu.h           | 28 +++++++++
 drivers/iommu/io-pgtable-arm-v7s.c |  6 +-
 drivers/iommu/io-pgtable-arm.c     | 98 ++++++++++++------------------
 drivers/iommu/io-pgtable.c         |  2 +-
 drivers/iommu/qcom_iommu.c         |  8 +--
 include/linux/io-pgtable.h         |  9 ++-
 8 files changed, 95 insertions(+), 104 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index cf2ae065a6c2..d127974afdb7 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -260,27 +260,18 @@
 /* Context descriptor (stage-1 only) */
 #define CTXDESC_CD_DWORDS		8
 #define CTXDESC_CD_0_TCR_T0SZ		GENMASK_ULL(5, 0)
-#define ARM64_TCR_T0SZ			GENMASK_ULL(5, 0)
 #define CTXDESC_CD_0_TCR_TG0		GENMASK_ULL(7, 6)
-#define ARM64_TCR_TG0			GENMASK_ULL(15, 14)
 #define CTXDESC_CD_0_TCR_IRGN0		GENMASK_ULL(9, 8)
-#define ARM64_TCR_IRGN0			GENMASK_ULL(9, 8)
 #define CTXDESC_CD_0_TCR_ORGN0		GENMASK_ULL(11, 10)
-#define ARM64_TCR_ORGN0			GENMASK_ULL(11, 10)
 #define CTXDESC_CD_0_TCR_SH0		GENMASK_ULL(13, 12)
-#define ARM64_TCR_SH0			GENMASK_ULL(13, 12)
 #define CTXDESC_CD_0_TCR_EPD0		(1ULL << 14)
-#define ARM64_TCR_EPD0			(1ULL << 7)
 #define CTXDESC_CD_0_TCR_EPD1		(1ULL << 30)
-#define ARM64_TCR_EPD1			(1ULL << 23)
 
 #define CTXDESC_CD_0_ENDI		(1UL << 15)
 #define CTXDESC_CD_0_V			(1UL << 31)
 
 #define CTXDESC_CD_0_TCR_IPS		GENMASK_ULL(34, 32)
-#define ARM64_TCR_IPS			GENMASK_ULL(34, 32)
 #define CTXDESC_CD_0_TCR_TBI0		(1ULL << 38)
-#define ARM64_TCR_TBI0			(1ULL << 37)
 
 #define CTXDESC_CD_0_AA64		(1UL << 41)
 #define CTXDESC_CD_0_S			(1UL << 44)
@@ -291,10 +282,6 @@
 
 #define CTXDESC_CD_1_TTB0_MASK		GENMASK_ULL(51, 4)
 
-/* Convert between AArch64 (CPU) TCR format and SMMU CD format */
-#define ARM_SMMU_TCR2CD(tcr, fld)	FIELD_PREP(CTXDESC_CD_0_TCR_##fld, \
-					FIELD_GET(ARM64_TCR_##fld, tcr))
-
 /* Command queue */
 #define CMDQ_ENT_SZ_SHIFT		4
 #define CMDQ_ENT_DWORDS			((1 << CMDQ_ENT_SZ_SHIFT) >> 3)
@@ -1439,23 +1426,6 @@ static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
 }
 
 /* Context descriptor manipulation functions */
-static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
-{
-	u64 val = 0;
-
-	/* Repack the TCR. Just care about TTBR0 for now */
-	val |= ARM_SMMU_TCR2CD(tcr, T0SZ);
-	val |= ARM_SMMU_TCR2CD(tcr, TG0);
-	val |= ARM_SMMU_TCR2CD(tcr, IRGN0);
-	val |= ARM_SMMU_TCR2CD(tcr, ORGN0);
-	val |= ARM_SMMU_TCR2CD(tcr, SH0);
-	val |= ARM_SMMU_TCR2CD(tcr, EPD0);
-	val |= ARM_SMMU_TCR2CD(tcr, EPD1);
-	val |= ARM_SMMU_TCR2CD(tcr, IPS);
-
-	return val;
-}
-
 static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
 				    struct arm_smmu_s1_cfg *cfg)
 {
@@ -1465,7 +1435,7 @@ static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu,
 	 * We don't need to issue any invalidation here, as we'll invalidate
 	 * the STE when installing the new entry anyway.
 	 */
-	val = arm_smmu_cpu_tcr_to_cd(cfg->cd.tcr) |
+	val = cfg->cd.tcr |
 #ifdef __BIG_ENDIAN
 	      CTXDESC_CD_0_ENDI |
 #endif
@@ -2151,6 +2121,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	int asid;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
+	typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 
 	asid = arm_smmu_bitmap_alloc(smmu->asid_map, smmu->asid_bits);
 	if (asid < 0)
@@ -2167,7 +2138,13 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 
 	cfg->cd.asid	= (u16)asid;
 	cfg->cd.ttbr	= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-	cfg->cd.tcr	= pgtbl_cfg->arm_lpae_s1_cfg.tcr;
+	cfg->cd.tcr	= FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) |
+			  FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) |
+			  CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64;
 	cfg->cd.mair	= pgtbl_cfg->arm_lpae_s1_cfg.mair;
 	return 0;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 72640e045268..e65eb60812ec 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -540,11 +540,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
 			cb->tcr[0] = pgtbl_cfg->arm_v7s_cfg.tcr;
 		} else {
-			cb->tcr[0] = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
-			cb->tcr[1] = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
-			cb->tcr[1] |= FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM);
+			cb->tcr[0] = arm_smmu_lpae_tcr(pgtbl_cfg);
+			cb->tcr[1] = arm_smmu_lpae_tcr2(pgtbl_cfg);
 			if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
 				cb->tcr[1] |= TCR2_AS;
+			else
+				cb->tcr[0] |= TCR_EAE;
 		}
 	} else {
 		cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 62b9f0cec49b..aade2b0ae175 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -11,6 +11,7 @@
 #define _ARM_SMMU_H
 
 #include <linux/atomic.h>
+#include <linux/bitfield.h>
 #include <linux/bits.h>
 #include <linux/clk.h>
 #include <linux/device.h>
@@ -158,12 +159,24 @@ enum arm_smmu_cbar_type {
 #define TCR2_SEP			GENMASK(17, 15)
 #define TCR2_SEP_UPSTREAM		0x7
 #define TCR2_AS				BIT(4)
+#define TCR2_PASIZE			GENMASK(3, 0)
 
 #define ARM_SMMU_CB_TTBR0		0x20
 #define ARM_SMMU_CB_TTBR1		0x28
 #define TTBRn_ASID			GENMASK_ULL(63, 48)
 
+/* arm64 headers leak this somehow :( */
+#undef TCR_T0SZ
+
 #define ARM_SMMU_CB_TCR			0x30
+#define TCR_EAE				BIT(31)
+#define TCR_EPD1			BIT(23)
+#define TCR_TG0				GENMASK(15, 14)
+#define TCR_SH0				GENMASK(13, 12)
+#define TCR_ORGN0			GENMASK(11, 10)
+#define TCR_IRGN0			GENMASK(9, 8)
+#define TCR_T0SZ			GENMASK(5, 0)
+
 #define ARM_SMMU_CB_CONTEXTIDR		0x34
 #define ARM_SMMU_CB_S1_MAIR0		0x38
 #define ARM_SMMU_CB_S1_MAIR1		0x3c
@@ -318,6 +331,21 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 };
 
+static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
+{
+	return TCR_EPD1 |
+	       FIELD_PREP(TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+	       FIELD_PREP(TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+	       FIELD_PREP(TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+	       FIELD_PREP(TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+	       FIELD_PREP(TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+}
+
+static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
+{
+	return FIELD_PREP(TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
+	       FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM);
+}
 
 /* Implementation details, yay! */
 struct arm_smmu_impl {
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 714cb406b0a8..ccedf50e5998 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -149,8 +149,6 @@
 #define ARM_V7S_TTBR_IRGN_ATTR(attr)					\
 	((((attr) & 0x1) << 6) | (((attr) & 0x2) >> 1))
 
-#define ARM_V7S_TCR_PD1			BIT(5)
-
 #ifdef CONFIG_ZONE_DMA32
 #define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
 #define ARM_V7S_TABLE_SLAB_FLAGS SLAB_CACHE_DMA32
@@ -798,8 +796,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	 */
 	cfg->pgsize_bitmap &= SZ_4K | SZ_64K | SZ_1M | SZ_16M;
 
-	/* TCR: T0SZ=0, disable TTBR1 */
-	cfg->arm_v7s_cfg.tcr = ARM_V7S_TCR_PD1;
+	/* TCR: T0SZ=0, EAE=0 (if applicable) */
+	cfg->arm_v7s_cfg.tcr = 0;
 
 	/*
 	 * TEX remap: the indices used map to the closest equivalent types
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 89216bf37282..78ec0ee5c59c 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -100,40 +100,32 @@
 #define ARM_LPAE_PTE_MEMATTR_DEV	(((arm_lpae_iopte)0x1) << 2)
 
 /* Register bits */
-#define ARM_32_LPAE_TCR_EAE		(1 << 31)
-#define ARM_64_LPAE_S2_TCR_RES1		(1U << 31)
+#define ARM_64_LPAE_VTCR_RES1		(1U << 31)
 
-#define ARM_LPAE_TCR_EPD1		(1 << 23)
-
-#define ARM_LPAE_TCR_TG0_4K		(0 << 14)
-#define ARM_LPAE_TCR_TG0_64K		(1 << 14)
-#define ARM_LPAE_TCR_TG0_16K		(2 << 14)
+#define ARM_LPAE_VTCR_TG0_SHIFT		14
+#define ARM_LPAE_TCR_TG0_4K		0
+#define ARM_LPAE_TCR_TG0_64K		1
+#define ARM_LPAE_TCR_TG0_16K		2
 
 #define ARM_LPAE_TCR_SH0_SHIFT		12
-#define ARM_LPAE_TCR_SH0_MASK		0x3
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
 
 #define ARM_LPAE_TCR_ORGN0_SHIFT	10
 #define ARM_LPAE_TCR_IRGN0_SHIFT	8
-#define ARM_LPAE_TCR_RGN_MASK		0x3
 #define ARM_LPAE_TCR_RGN_NC		0
 #define ARM_LPAE_TCR_RGN_WBWA		1
 #define ARM_LPAE_TCR_RGN_WT		2
 #define ARM_LPAE_TCR_RGN_WB		3
 
-#define ARM_LPAE_TCR_SL0_SHIFT		6
-#define ARM_LPAE_TCR_SL0_MASK		0x3
+#define ARM_LPAE_VTCR_SL0_SHIFT		6
+#define ARM_LPAE_VTCR_SL0_MASK		0x3
 
 #define ARM_LPAE_TCR_T0SZ_SHIFT		0
-#define ARM_LPAE_TCR_SZ_MASK		0xf
-
-#define ARM_LPAE_TCR_PS_SHIFT		16
-#define ARM_LPAE_TCR_PS_MASK		0x7
 
-#define ARM_LPAE_TCR_IPS_SHIFT		32
-#define ARM_LPAE_TCR_IPS_MASK		0x7
+#define ARM_LPAE_VTCR_PS_SHIFT		16
+#define ARM_LPAE_VTCR_PS_MASK		0x7
 
 #define ARM_LPAE_TCR_PS_32_BIT		0x0ULL
 #define ARM_LPAE_TCR_PS_36_BIT		0x1ULL
@@ -791,6 +783,7 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 {
 	u64 reg;
 	struct arm_lpae_io_pgtable *data;
+	typeof(&cfg->arm_lpae_s1_cfg.tcr) tcr = &cfg->arm_lpae_s1_cfg.tcr;
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
 			    IO_PGTABLE_QUIRK_NON_STRICT))
@@ -802,58 +795,54 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 
 	/* TCR */
 	if (cfg->coherent_walk) {
-		reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+		tcr->sh = ARM_LPAE_TCR_SH_IS;
+		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
+		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
 	} else {
-		reg = (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		      (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+		tcr->sh = ARM_LPAE_TCR_SH_OS;
+		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
+		tcr->orgn = ARM_LPAE_TCR_RGN_NC;
 	}
 
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		reg |= ARM_LPAE_TCR_TG0_4K;
+		tcr->tg = ARM_LPAE_TCR_TG0_4K;
 		break;
 	case SZ_16K:
-		reg |= ARM_LPAE_TCR_TG0_16K;
+		tcr->tg = ARM_LPAE_TCR_TG0_16K;
 		break;
 	case SZ_64K:
-		reg |= ARM_LPAE_TCR_TG0_64K;
+		tcr->tg = ARM_LPAE_TCR_TG0_64K;
 		break;
 	}
 
 	switch (cfg->oas) {
 	case 32:
-		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_32_BIT;
 		break;
 	case 36:
-		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_36_BIT;
 		break;
 	case 40:
-		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_40_BIT;
 		break;
 	case 42:
-		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_42_BIT;
 		break;
 	case 44:
-		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_44_BIT;
 		break;
 	case 48:
-		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_48_BIT;
 		break;
 	case 52:
-		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_TCR_IPS_SHIFT);
+		tcr->ips = ARM_LPAE_TCR_PS_52_BIT;
 		break;
 	default:
 		goto out_free_data;
 	}
 
-	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
-
-	/* Disable speculative walks through TTBR1 */
-	reg |= ARM_LPAE_TCR_EPD1;
-	cfg->arm_lpae_s1_cfg.tcr = reg;
+	tcr->tsz = 64ULL - cfg->ias;
 
 	/* MAIRs */
 	reg = (ARM_LPAE_MAIR_ATTR_NC
@@ -914,7 +903,7 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_S2_TCR_RES1;
+	reg = ARM_64_LPAE_VTCR_RES1;
 	if (cfg->coherent_walk) {
 		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
 		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
@@ -929,45 +918,45 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		reg |= ARM_LPAE_TCR_TG0_4K;
+		reg |= (ARM_LPAE_TCR_TG0_4K << ARM_LPAE_VTCR_TG0_SHIFT);
 		sl++; /* SL0 format is different for 4K granule size */
 		break;
 	case SZ_16K:
-		reg |= ARM_LPAE_TCR_TG0_16K;
+		reg |= (ARM_LPAE_TCR_TG0_16K << ARM_LPAE_VTCR_TG0_SHIFT);
 		break;
 	case SZ_64K:
-		reg |= ARM_LPAE_TCR_TG0_64K;
+		reg |= (ARM_LPAE_TCR_TG0_64K << ARM_LPAE_VTCR_TG0_SHIFT);
 		break;
 	}
 
 	switch (cfg->oas) {
 	case 32:
-		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 36:
-		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 40:
-		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 42:
-		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 44:
-		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 48:
-		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	case 52:
-		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_TCR_PS_SHIFT);
+		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_VTCR_PS_SHIFT);
 		break;
 	default:
 		goto out_free_data;
 	}
 
 	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
-	reg |= (~sl & ARM_LPAE_TCR_SL0_MASK) << ARM_LPAE_TCR_SL0_SHIFT;
+	reg |= (~sl & ARM_LPAE_VTCR_SL0_MASK) << ARM_LPAE_VTCR_SL0_SHIFT;
 	cfg->arm_lpae_s2_cfg.vtcr = reg;
 
 	/* Allocate pgd pages */
@@ -991,19 +980,12 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 static struct io_pgtable *
 arm_32_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 {
-	struct io_pgtable *iop;
-
 	if (cfg->ias > 32 || cfg->oas > 40)
 		return NULL;
 
 	cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
-	iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
-	if (iop) {
-		cfg->arm_lpae_s1_cfg.tcr |= ARM_32_LPAE_TCR_EAE;
-		cfg->arm_lpae_s1_cfg.tcr &= 0xffffffff;
-	}
 
-	return iop;
+	return arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
 }
 
 static struct io_pgtable *
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index ced53e5b72b5..94394c81468f 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -63,7 +63,7 @@ void free_io_pgtable_ops(struct io_pgtable_ops *ops)
 	if (!ops)
 		return;
 
-	iop = container_of(ops, struct io_pgtable, ops);
+	iop = io_pgtable_ops_to_pgtable(ops);
 	io_pgtable_tlb_flush_all(iop);
 	io_pgtable_init_table[iop->fmt]->free(iop);
 }
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index c200bc066257..2eeaf2eec946 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -271,15 +271,13 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
 				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
 				FIELD_PREP(TTBRn_ASID, ctx->asid));
-		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
-				FIELD_PREP(TTBRn_ASID, ctx->asid));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1, 0);
 
 		/* TCR */
 		iommu_writel(ctx, ARM_SMMU_CB_TCR2,
-				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
-				FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM));
+				arm_smmu_lpae_tcr2(&pgtbl_cfg));
 		iommu_writel(ctx, ARM_SMMU_CB_TCR,
-				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+				arm_smmu_lpae_tcr(&pgtbl_cfg) | TCR_EAE);
 
 		/* MAIRs (stage-1 only) */
 		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 53bca5343f52..6ae104cedfd7 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -101,7 +101,14 @@ struct io_pgtable_cfg {
 	union {
 		struct {
 			u64	ttbr;
-			u64	tcr;
+			struct {
+				u32	ips:3;
+				u32	tg:2;
+				u32	sh:2;
+				u32	orgn:2;
+				u32	irgn:2;
+				u32	tsz:6;
+			}	tcr;
 			u64	mair;
 		} arm_lpae_s1_cfg;
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

Now that we have arm-smmu.h defining various SMMU constants, ensure that
they are namespaced with the ARM_SMMU_ prefix in order to avoid conflicts
with the CPU, such as the one we're currently bodging around with the
TCR.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-impl.c |   2 +-
 drivers/iommu/arm-smmu.c      | 157 ++++++++++++------------
 drivers/iommu/arm-smmu.h      | 217 +++++++++++++++++-----------------
 drivers/iommu/qcom_iommu.c    |  16 +--
 4 files changed, 204 insertions(+), 188 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index b2fe72a8f019..74d97a886e93 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -119,7 +119,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu)
 	 * Secure has also cleared SACR.CACHE_LOCK for this to take effect...
 	 */
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID7);
-	major = FIELD_GET(ID7_MAJOR, reg);
+	major = FIELD_GET(ARM_SMMU_ID7_MAJOR, reg);
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sACR);
 	if (major >= 2)
 		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e65eb60812ec..214be09f6ded 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -271,7 +271,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
 	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
 		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
 			reg = arm_smmu_readl(smmu, page, status);
-			if (!(reg & sTLBGSTATUS_GSACTIVE))
+			if (!(reg & ARM_SMMU_sTLBGSTATUS_GSACTIVE))
 				return;
 			cpu_relax();
 		}
@@ -478,7 +478,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	int idx = smmu_domain->cfg.cbndx;
 
 	fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
-	if (!(fsr & FSR_FAULT))
+	if (!(fsr & ARM_SMMU_FSR_FAULT))
 		return IRQ_NONE;
 
 	fsynr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSYNR0);
@@ -510,7 +510,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 
 	if (__ratelimit(&rs)) {
 		if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
-		    (gfsr & sGFSR_USF))
+		    (gfsr & ARM_SMMU_sGFSR_USF))
 			dev_err(smmu->dev,
 				"Blocked unknown Stream ID 0x%hx; boot with \"arm-smmu.disable_bypass=0\" to allow, but this may have security implications\n",
 				(u16)gfsynr1);
@@ -543,9 +543,9 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 			cb->tcr[0] = arm_smmu_lpae_tcr(pgtbl_cfg);
 			cb->tcr[1] = arm_smmu_lpae_tcr2(pgtbl_cfg);
 			if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
-				cb->tcr[1] |= TCR2_AS;
+				cb->tcr[1] |= ARM_SMMU_TCR2_AS;
 			else
-				cb->tcr[0] |= TCR_EAE;
+				cb->tcr[0] |= ARM_SMMU_TCR_EAE;
 		}
 	} else {
 		cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
@@ -558,8 +558,10 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 			cb->ttbr[1] = 0;
 		} else {
 			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-			cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
-			cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
+			cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+						  cfg->asid);
+			cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+						 cfg->asid);
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -595,31 +597,33 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	/* CBA2R */
 	if (smmu->version > ARM_SMMU_V1) {
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
-			reg = CBA2R_VA64;
+			reg = ARM_SMMU_CBA2R_VA64;
 		else
 			reg = 0;
 		/* 16-bit VMIDs live in CBA2R */
 		if (smmu->features & ARM_SMMU_FEAT_VMID16)
-			reg |= FIELD_PREP(CBA2R_VMID16, cfg->vmid);
+			reg |= FIELD_PREP(ARM_SMMU_CBA2R_VMID16, cfg->vmid);
 
 		arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBA2R(idx), reg);
 	}
 
 	/* CBAR */
-	reg = FIELD_PREP(CBAR_TYPE, cfg->cbar);
+	reg = FIELD_PREP(ARM_SMMU_CBAR_TYPE, cfg->cbar);
 	if (smmu->version < ARM_SMMU_V2)
-		reg |= FIELD_PREP(CBAR_IRPTNDX, cfg->irptndx);
+		reg |= FIELD_PREP(ARM_SMMU_CBAR_IRPTNDX, cfg->irptndx);
 
 	/*
 	 * Use the weakest shareability/memory types, so they are
 	 * overridden by the ttbcr/pte.
 	 */
 	if (stage1) {
-		reg |= FIELD_PREP(CBAR_S1_BPSHCFG, CBAR_S1_BPSHCFG_NSH) |
-			FIELD_PREP(CBAR_S1_MEMATTR, CBAR_S1_MEMATTR_WB);
+		reg |= FIELD_PREP(ARM_SMMU_CBAR_S1_BPSHCFG,
+				  ARM_SMMU_CBAR_S1_BPSHCFG_NSH) |
+		       FIELD_PREP(ARM_SMMU_CBAR_S1_MEMATTR,
+				  ARM_SMMU_CBAR_S1_MEMATTR_WB);
 	} else if (!(smmu->features & ARM_SMMU_FEAT_VMID16)) {
 		/* 8-bit VMIDs live in CBAR */
-		reg |= FIELD_PREP(CBAR_VMID, cfg->vmid);
+		reg |= FIELD_PREP(ARM_SMMU_CBAR_VMID, cfg->vmid);
 	}
 	arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBAR(idx), reg);
 
@@ -651,11 +655,12 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	}
 
 	/* SCTLR */
-	reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
+	reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE | ARM_SMMU_SCTLR_AFE |
+	      ARM_SMMU_SCTLR_TRE | ARM_SMMU_SCTLR_M;
 	if (stage1)
-		reg |= SCTLR_S1_ASIDPNE;
+		reg |= ARM_SMMU_SCTLR_S1_ASIDPNE;
 	if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
-		reg |= SCTLR_E;
+		reg |= ARM_SMMU_SCTLR_E;
 
 	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
@@ -837,7 +842,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (ret < 0) {
 		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
 			cfg->irptndx, irq);
-		cfg->irptndx = INVALID_IRPTNDX;
+		cfg->irptndx = ARM_SMMU_INVALID_IRPTNDX;
 	}
 
 	mutex_unlock(&smmu_domain->init_mutex);
@@ -875,7 +880,7 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	smmu->cbs[cfg->cbndx].cfg = NULL;
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
-	if (cfg->irptndx != INVALID_IRPTNDX) {
+	if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
 		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
 		devm_free_irq(smmu->dev, irq, domain);
 	}
@@ -931,23 +936,24 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
 {
 	struct arm_smmu_smr *smr = smmu->smrs + idx;
-	u32 reg = FIELD_PREP(SMR_ID, smr->id) | FIELD_PREP(SMR_MASK, smr->mask);
+	u32 reg = FIELD_PREP(ARM_SMMU_SMR_ID, smr->id) |
+		  FIELD_PREP(ARM_SMMU_SMR_MASK, smr->mask);
 
 	if (!(smmu->features & ARM_SMMU_FEAT_EXIDS) && smr->valid)
-		reg |= SMR_VALID;
+		reg |= ARM_SMMU_SMR_VALID;
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(idx), reg);
 }
 
 static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx)
 {
 	struct arm_smmu_s2cr *s2cr = smmu->s2crs + idx;
-	u32 reg = FIELD_PREP(S2CR_TYPE, s2cr->type) |
-		  FIELD_PREP(S2CR_CBNDX, s2cr->cbndx) |
-		  FIELD_PREP(S2CR_PRIVCFG, s2cr->privcfg);
+	u32 reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, s2cr->type) |
+		  FIELD_PREP(ARM_SMMU_S2CR_CBNDX, s2cr->cbndx) |
+		  FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, s2cr->privcfg);
 
 	if (smmu->features & ARM_SMMU_FEAT_EXIDS && smmu->smrs &&
 	    smmu->smrs[idx].valid)
-		reg |= S2CR_EXIDVALID;
+		reg |= ARM_SMMU_S2CR_EXIDVALID;
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_S2CR(idx), reg);
 }
 
@@ -974,15 +980,15 @@ static void arm_smmu_test_smr_masks(struct arm_smmu_device *smmu)
 	 * bits are set, so check each one separately. We can reject
 	 * masters later if they try to claim IDs outside these masks.
 	 */
-	smr = FIELD_PREP(SMR_ID, smmu->streamid_mask);
+	smr = FIELD_PREP(ARM_SMMU_SMR_ID, smmu->streamid_mask);
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(0), smr);
 	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(0));
-	smmu->streamid_mask = FIELD_GET(SMR_ID, smr);
+	smmu->streamid_mask = FIELD_GET(ARM_SMMU_SMR_ID, smr);
 
-	smr = FIELD_PREP(SMR_MASK, smmu->streamid_mask);
+	smr = FIELD_PREP(ARM_SMMU_SMR_MASK, smmu->streamid_mask);
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(0), smr);
 	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(0));
-	smmu->smr_mask_mask = FIELD_GET(SMR_MASK, smr);
+	smmu->smr_mask_mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
 }
 
 static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mask)
@@ -1051,8 +1057,8 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
 	mutex_lock(&smmu->stream_map_mutex);
 	/* Figure out a viable stream map entry allocation */
 	for_each_cfg_sme(fwspec, i, idx) {
-		u16 sid = FIELD_GET(SMR_ID, fwspec->ids[i]);
-		u16 mask = FIELD_GET(SMR_MASK, fwspec->ids[i]);
+		u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+		u16 mask = FIELD_GET(ARM_SMMU_SMR_MASK, fwspec->ids[i]);
 
 		if (idx != INVALID_SMENDX) {
 			ret = -EEXIST;
@@ -1296,7 +1302,8 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 		arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_ATS1PR, va);
 
 	reg = arm_smmu_page(smmu, ARM_SMMU_CB(smmu, idx)) + ARM_SMMU_CB_ATSR;
-	if (readl_poll_timeout_atomic(reg, tmp, !(tmp & ATSR_ACTIVE), 5, 50)) {
+	if (readl_poll_timeout_atomic(reg, tmp, !(tmp & ARM_SMMU_ATSR_ACTIVE),
+				      5, 50)) {
 		spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
 		dev_err(dev,
 			"iova to phys timed out on %pad. Falling back to software table walk.\n",
@@ -1306,7 +1313,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 
 	phys = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_PAR);
 	spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
-	if (phys & CB_PAR_F) {
+	if (phys & ARM_SMMU_CB_PAR_F) {
 		dev_err(dev, "translation fault!\n");
 		dev_err(dev, "PAR = 0x%llx\n", phys);
 		return 0;
@@ -1387,8 +1394,8 @@ static int arm_smmu_add_device(struct device *dev)
 
 	ret = -EINVAL;
 	for (i = 0; i < fwspec->num_ids; i++) {
-		u16 sid = FIELD_GET(SMR_ID, fwspec->ids[i]);
-		u16 mask = FIELD_GET(SMR_MASK, fwspec->ids[i]);
+		u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+		u16 mask = FIELD_GET(ARM_SMMU_SMR_MASK, fwspec->ids[i]);
 
 		if (sid & ~smmu->streamid_mask) {
 			dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n",
@@ -1569,12 +1576,12 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	u32 mask, fwid = 0;
 
 	if (args->args_count > 0)
-		fwid |= FIELD_PREP(SMR_ID, args->args[0]);
+		fwid |= FIELD_PREP(ARM_SMMU_SMR_ID, args->args[0]);
 
 	if (args->args_count > 1)
-		fwid |= FIELD_PREP(SMR_MASK, args->args[1]);
+		fwid |= FIELD_PREP(ARM_SMMU_SMR_MASK, args->args[1]);
 	else if (!of_property_read_u32(args->np, "stream-match-mask", &mask))
-		fwid |= FIELD_PREP(SMR_MASK, mask);
+		fwid |= FIELD_PREP(ARM_SMMU_SMR_MASK, mask);
 
 	return iommu_fwspec_add_ids(dev, &fwid, 1);
 }
@@ -1644,7 +1651,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	/* Make sure all context banks are disabled and clear CB_FSR  */
 	for (i = 0; i < smmu->num_context_banks; ++i) {
 		arm_smmu_write_context_bank(smmu, i);
-		arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, FSR_FAULT);
+		arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_FSR_FAULT);
 	}
 
 	/* Invalidate the TLB, just in case */
@@ -1654,29 +1661,30 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
 
 	/* Enable fault reporting */
-	reg |= (sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE);
+	reg |= (ARM_SMMU_sCR0_GFRE | ARM_SMMU_sCR0_GFIE |
+		ARM_SMMU_sCR0_GCFGFRE | ARM_SMMU_sCR0_GCFGFIE);
 
 	/* Disable TLB broadcasting. */
-	reg |= (sCR0_VMIDPNE | sCR0_PTM);
+	reg |= (ARM_SMMU_sCR0_VMIDPNE | ARM_SMMU_sCR0_PTM);
 
 	/* Enable client access, handling unmatched streams as appropriate */
-	reg &= ~sCR0_CLIENTPD;
+	reg &= ~ARM_SMMU_sCR0_CLIENTPD;
 	if (disable_bypass)
-		reg |= sCR0_USFCFG;
+		reg |= ARM_SMMU_sCR0_USFCFG;
 	else
-		reg &= ~sCR0_USFCFG;
+		reg &= ~ARM_SMMU_sCR0_USFCFG;
 
 	/* Disable forced broadcasting */
-	reg &= ~sCR0_FB;
+	reg &= ~ARM_SMMU_sCR0_FB;
 
 	/* Don't upgrade barriers */
-	reg &= ~(sCR0_BSU);
+	reg &= ~(ARM_SMMU_sCR0_BSU);
 
 	if (smmu->features & ARM_SMMU_FEAT_VMID16)
-		reg |= sCR0_VMID16EN;
+		reg |= ARM_SMMU_sCR0_VMID16EN;
 
 	if (smmu->features & ARM_SMMU_FEAT_EXIDS)
-		reg |= sCR0_EXIDENABLE;
+		reg |= ARM_SMMU_sCR0_EXIDENABLE;
 
 	if (smmu->impl && smmu->impl->reset)
 		smmu->impl->reset(smmu);
@@ -1721,21 +1729,21 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 
 	/* Restrict available stages based on module parameter */
 	if (force_stage == 1)
-		id &= ~(ID0_S2TS | ID0_NTS);
+		id &= ~(ARM_SMMU_ID0_S2TS | ARM_SMMU_ID0_NTS);
 	else if (force_stage == 2)
-		id &= ~(ID0_S1TS | ID0_NTS);
+		id &= ~(ARM_SMMU_ID0_S1TS | ARM_SMMU_ID0_NTS);
 
-	if (id & ID0_S1TS) {
+	if (id & ARM_SMMU_ID0_S1TS) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
 		dev_notice(smmu->dev, "\tstage 1 translation\n");
 	}
 
-	if (id & ID0_S2TS) {
+	if (id & ARM_SMMU_ID0_S2TS) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_S2;
 		dev_notice(smmu->dev, "\tstage 2 translation\n");
 	}
 
-	if (id & ID0_NTS) {
+	if (id & ARM_SMMU_ID0_NTS) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_NESTED;
 		dev_notice(smmu->dev, "\tnested translation\n");
 	}
@@ -1746,8 +1754,8 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		return -ENODEV;
 	}
 
-	if ((id & ID0_S1TS) &&
-		((smmu->version < ARM_SMMU_V2) || !(id & ID0_ATOSNS))) {
+	if ((id & ARM_SMMU_ID0_S1TS) &&
+	    ((smmu->version < ARM_SMMU_V2) || !(id & ARM_SMMU_ID0_ATOSNS))) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_OPS;
 		dev_notice(smmu->dev, "\taddress translation ops\n");
 	}
@@ -1758,7 +1766,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	 * Fortunately, this also opens up a workaround for systems where the
 	 * ID register value has ended up configured incorrectly.
 	 */
-	cttw_reg = !!(id & ID0_CTTW);
+	cttw_reg = !!(id & ARM_SMMU_ID0_CTTW);
 	if (cttw_fw || cttw_reg)
 		dev_notice(smmu->dev, "\t%scoherent table walk\n",
 			   cttw_fw ? "" : "non-");
@@ -1767,16 +1775,16 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 			   "\t(IDR0.CTTW overridden by FW configuration)\n");
 
 	/* Max. number of entries we have for stream matching/indexing */
-	if (smmu->version == ARM_SMMU_V2 && id & ID0_EXIDS) {
+	if (smmu->version == ARM_SMMU_V2 && id & ARM_SMMU_ID0_EXIDS) {
 		smmu->features |= ARM_SMMU_FEAT_EXIDS;
 		size = 1 << 16;
 	} else {
-		size = 1 << FIELD_GET(ID0_NUMSIDB, id);
+		size = 1 << FIELD_GET(ARM_SMMU_ID0_NUMSIDB, id);
 	}
 	smmu->streamid_mask = size - 1;
-	if (id & ID0_SMS) {
+	if (id & ARM_SMMU_ID0_SMS) {
 		smmu->features |= ARM_SMMU_FEAT_STREAM_MATCH;
-		size = FIELD_GET(ID0_NUMSMRG, id);
+		size = FIELD_GET(ARM_SMMU_ID0_NUMSMRG, id);
 		if (size == 0) {
 			dev_err(smmu->dev,
 				"stream-matching supported, but no SMRs present!\n");
@@ -1804,18 +1812,19 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	mutex_init(&smmu->stream_map_mutex);
 	spin_lock_init(&smmu->global_sync_lock);
 
-	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
+	if (smmu->version < ARM_SMMU_V2 ||
+	    !(id & ARM_SMMU_ID0_PTFS_NO_AARCH32)) {
 		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
-		if (!(id & ID0_PTFS_NO_AARCH32S))
+		if (!(id & ARM_SMMU_ID0_PTFS_NO_AARCH32S))
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_S;
 	}
 
 	/* ID1 */
 	id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID1);
-	smmu->pgshift = (id & ID1_PAGESIZE) ? 16 : 12;
+	smmu->pgshift = (id & ARM_SMMU_ID1_PAGESIZE) ? 16 : 12;
 
 	/* Check for size mismatch of SMMU address space from mapped region */
-	size = 1 << (FIELD_GET(ID1_NUMPAGENDXB, id) + 1);
+	size = 1 << (FIELD_GET(ARM_SMMU_ID1_NUMPAGENDXB, id) + 1);
 	if (smmu->numpage != 2 * size << smmu->pgshift)
 		dev_warn(smmu->dev,
 			"SMMU address space size (0x%x) differs from mapped region size (0x%x)!\n",
@@ -1823,8 +1832,8 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	/* Now properly encode NUMPAGE to subsequently derive SMMU_CB_BASE */
 	smmu->numpage = size;
 
-	smmu->num_s2_context_banks = FIELD_GET(ID1_NUMS2CB, id);
-	smmu->num_context_banks = FIELD_GET(ID1_NUMCB, id);
+	smmu->num_s2_context_banks = FIELD_GET(ARM_SMMU_ID1_NUMS2CB, id);
+	smmu->num_context_banks = FIELD_GET(ARM_SMMU_ID1_NUMCB, id);
 	if (smmu->num_s2_context_banks > smmu->num_context_banks) {
 		dev_err(smmu->dev, "impossible number of S2 context banks!\n");
 		return -ENODEV;
@@ -1838,14 +1847,14 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 
 	/* ID2 */
 	id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID2);
-	size = arm_smmu_id_size_to_bits(FIELD_GET(ID2_IAS, id));
+	size = arm_smmu_id_size_to_bits(FIELD_GET(ARM_SMMU_ID2_IAS, id));
 	smmu->ipa_size = size;
 
 	/* The output mask is also applied for bypass */
-	size = arm_smmu_id_size_to_bits(FIELD_GET(ID2_OAS, id));
+	size = arm_smmu_id_size_to_bits(FIELD_GET(ARM_SMMU_ID2_OAS, id));
 	smmu->pa_size = size;
 
-	if (id & ID2_VMID16)
+	if (id & ARM_SMMU_ID2_VMID16)
 		smmu->features |= ARM_SMMU_FEAT_VMID16;
 
 	/*
@@ -1862,13 +1871,13 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		if (smmu->version == ARM_SMMU_V1_64K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
 	} else {
-		size = FIELD_GET(ID2_UBS, id);
+		size = FIELD_GET(ARM_SMMU_ID2_UBS, id);
 		smmu->va_size = arm_smmu_id_size_to_bits(size);
-		if (id & ID2_PTFS_4K)
+		if (id & ARM_SMMU_ID2_PTFS_4K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_4K;
-		if (id & ID2_PTFS_16K)
+		if (id & ARM_SMMU_ID2_PTFS_16K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_16K;
-		if (id & ID2_PTFS_64K)
+		if (id & ARM_SMMU_ID2_PTFS_64K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
 	}
 
@@ -2245,7 +2254,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 
 	arm_smmu_rpm_get(smmu);
 	/* Turn the thing off */
-	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, sCR0_CLIENTPD);
+	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, ARM_SMMU_sCR0_CLIENTPD);
 	arm_smmu_rpm_put(smmu);
 
 	if (pm_runtime_enabled(smmu->dev))
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index aade2b0ae175..6501f38a5966 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -24,51 +24,51 @@
 
 /* Configuration registers */
 #define ARM_SMMU_GR0_sCR0		0x0
-#define sCR0_VMID16EN			BIT(31)
-#define sCR0_BSU			GENMASK(15, 14)
-#define sCR0_FB				BIT(13)
-#define sCR0_PTM			BIT(12)
-#define sCR0_VMIDPNE			BIT(11)
-#define sCR0_USFCFG			BIT(10)
-#define sCR0_GCFGFIE			BIT(5)
-#define sCR0_GCFGFRE			BIT(4)
-#define sCR0_EXIDENABLE			BIT(3)
-#define sCR0_GFIE			BIT(2)
-#define sCR0_GFRE			BIT(1)
-#define sCR0_CLIENTPD			BIT(0)
+#define ARM_SMMU_sCR0_VMID16EN		BIT(31)
+#define ARM_SMMU_sCR0_BSU		GENMASK(15, 14)
+#define ARM_SMMU_sCR0_FB		BIT(13)
+#define ARM_SMMU_sCR0_PTM		BIT(12)
+#define ARM_SMMU_sCR0_VMIDPNE		BIT(11)
+#define ARM_SMMU_sCR0_USFCFG		BIT(10)
+#define ARM_SMMU_sCR0_GCFGFIE		BIT(5)
+#define ARM_SMMU_sCR0_GCFGFRE		BIT(4)
+#define ARM_SMMU_sCR0_EXIDENABLE	BIT(3)
+#define ARM_SMMU_sCR0_GFIE		BIT(2)
+#define ARM_SMMU_sCR0_GFRE		BIT(1)
+#define ARM_SMMU_sCR0_CLIENTPD		BIT(0)
 
 /* Auxiliary Configuration register */
 #define ARM_SMMU_GR0_sACR		0x10
 
 /* Identification registers */
 #define ARM_SMMU_GR0_ID0		0x20
-#define ID0_S1TS			BIT(30)
-#define ID0_S2TS			BIT(29)
-#define ID0_NTS				BIT(28)
-#define ID0_SMS				BIT(27)
-#define ID0_ATOSNS			BIT(26)
-#define ID0_PTFS_NO_AARCH32		BIT(25)
-#define ID0_PTFS_NO_AARCH32S		BIT(24)
-#define ID0_NUMIRPT			GENMASK(23, 16)
-#define ID0_CTTW			BIT(14)
-#define ID0_NUMSIDB			GENMASK(12, 9)
-#define ID0_EXIDS			BIT(8)
-#define ID0_NUMSMRG			GENMASK(7, 0)
+#define ARM_SMMU_ID0_S1TS		BIT(30)
+#define ARM_SMMU_ID0_S2TS		BIT(29)
+#define ARM_SMMU_ID0_NTS		BIT(28)
+#define ARM_SMMU_ID0_SMS		BIT(27)
+#define ARM_SMMU_ID0_ATOSNS		BIT(26)
+#define ARM_SMMU_ID0_PTFS_NO_AARCH32	BIT(25)
+#define ARM_SMMU_ID0_PTFS_NO_AARCH32S	BIT(24)
+#define ARM_SMMU_ID0_NUMIRPT		GENMASK(23, 16)
+#define ARM_SMMU_ID0_CTTW		BIT(14)
+#define ARM_SMMU_ID0_NUMSIDB		GENMASK(12, 9)
+#define ARM_SMMU_ID0_EXIDS		BIT(8)
+#define ARM_SMMU_ID0_NUMSMRG		GENMASK(7, 0)
 
 #define ARM_SMMU_GR0_ID1		0x24
-#define ID1_PAGESIZE			BIT(31)
-#define ID1_NUMPAGENDXB			GENMASK(30, 28)
-#define ID1_NUMS2CB			GENMASK(23, 16)
-#define ID1_NUMCB			GENMASK(7, 0)
+#define ARM_SMMU_ID1_PAGESIZE		BIT(31)
+#define ARM_SMMU_ID1_NUMPAGENDXB	GENMASK(30, 28)
+#define ARM_SMMU_ID1_NUMS2CB		GENMASK(23, 16)
+#define ARM_SMMU_ID1_NUMCB		GENMASK(7, 0)
 
 #define ARM_SMMU_GR0_ID2		0x28
-#define ID2_VMID16			BIT(15)
-#define ID2_PTFS_64K			BIT(14)
-#define ID2_PTFS_16K			BIT(13)
-#define ID2_PTFS_4K			BIT(12)
-#define ID2_UBS				GENMASK(11, 8)
-#define ID2_OAS				GENMASK(7, 4)
-#define ID2_IAS				GENMASK(3, 0)
+#define ARM_SMMU_ID2_VMID16		BIT(15)
+#define ARM_SMMU_ID2_PTFS_64K		BIT(14)
+#define ARM_SMMU_ID2_PTFS_16K		BIT(13)
+#define ARM_SMMU_ID2_PTFS_4K		BIT(12)
+#define ARM_SMMU_ID2_UBS		GENMASK(11, 8)
+#define ARM_SMMU_ID2_OAS		GENMASK(7, 4)
+#define ARM_SMMU_ID2_IAS		GENMASK(3, 0)
 
 #define ARM_SMMU_GR0_ID3		0x2c
 #define ARM_SMMU_GR0_ID4		0x30
@@ -76,11 +76,11 @@
 #define ARM_SMMU_GR0_ID6		0x38
 
 #define ARM_SMMU_GR0_ID7		0x3c
-#define ID7_MAJOR			GENMASK(7, 4)
-#define ID7_MINOR			GENMASK(3, 0)
+#define ARM_SMMU_ID7_MAJOR		GENMASK(7, 4)
+#define ARM_SMMU_ID7_MINOR		GENMASK(3, 0)
 
 #define ARM_SMMU_GR0_sGFSR		0x48
-#define sGFSR_USF			BIT(1)
+#define ARM_SMMU_sGFSR_USF		BIT(1)
 
 #define ARM_SMMU_GR0_sGFSYNR0		0x50
 #define ARM_SMMU_GR0_sGFSYNR1		0x54
@@ -93,118 +93,123 @@
 #define ARM_SMMU_GR0_sTLBGSYNC		0x70
 
 #define ARM_SMMU_GR0_sTLBGSTATUS	0x74
-#define sTLBGSTATUS_GSACTIVE		BIT(0)
+#define ARM_SMMU_sTLBGSTATUS_GSACTIVE	BIT(0)
 
 /* Stream mapping registers */
 #define ARM_SMMU_GR0_SMR(n)		(0x800 + ((n) << 2))
-#define SMR_VALID			BIT(31)
-#define SMR_MASK			GENMASK(31, 16)
-#define SMR_ID				GENMASK(15, 0)
+#define ARM_SMMU_SMR_VALID		BIT(31)
+#define ARM_SMMU_SMR_MASK		GENMASK(31, 16)
+#define ARM_SMMU_SMR_ID			GENMASK(15, 0)
 
 #define ARM_SMMU_GR0_S2CR(n)		(0xc00 + ((n) << 2))
-#define S2CR_PRIVCFG			GENMASK(25, 24)
+#define ARM_SMMU_S2CR_PRIVCFG		GENMASK(25, 24)
 enum arm_smmu_s2cr_privcfg {
 	S2CR_PRIVCFG_DEFAULT,
 	S2CR_PRIVCFG_DIPAN,
 	S2CR_PRIVCFG_UNPRIV,
 	S2CR_PRIVCFG_PRIV,
 };
-#define S2CR_TYPE			GENMASK(17, 16)
+#define ARM_SMMU_S2CR_TYPE		GENMASK(17, 16)
 enum arm_smmu_s2cr_type {
 	S2CR_TYPE_TRANS,
 	S2CR_TYPE_BYPASS,
 	S2CR_TYPE_FAULT,
 };
-#define S2CR_EXIDVALID			BIT(10)
-#define S2CR_CBNDX			GENMASK(7, 0)
+#define ARM_SMMU_S2CR_EXIDVALID		BIT(10)
+#define ARM_SMMU_S2CR_CBNDX		GENMASK(7, 0)
 
 /* Context bank attribute registers */
 #define ARM_SMMU_GR1_CBAR(n)		(0x0 + ((n) << 2))
-#define CBAR_IRPTNDX			GENMASK(31, 24)
-#define CBAR_TYPE			GENMASK(17, 16)
+#define ARM_SMMU_CBAR_IRPTNDX		GENMASK(31, 24)
+#define ARM_SMMU_CBAR_TYPE		GENMASK(17, 16)
 enum arm_smmu_cbar_type {
 	CBAR_TYPE_S2_TRANS,
 	CBAR_TYPE_S1_TRANS_S2_BYPASS,
 	CBAR_TYPE_S1_TRANS_S2_FAULT,
 	CBAR_TYPE_S1_TRANS_S2_TRANS,
 };
-#define CBAR_S1_MEMATTR			GENMASK(15, 12)
-#define CBAR_S1_MEMATTR_WB		0xf
-#define CBAR_S1_BPSHCFG			GENMASK(9, 8)
-#define CBAR_S1_BPSHCFG_NSH		3
-#define CBAR_VMID			GENMASK(7, 0)
+#define ARM_SMMU_CBAR_S1_MEMATTR	GENMASK(15, 12)
+#define ARM_SMMU_CBAR_S1_MEMATTR_WB	0xf
+#define ARM_SMMU_CBAR_S1_BPSHCFG	GENMASK(9, 8)
+#define ARM_SMMU_CBAR_S1_BPSHCFG_NSH	3
+#define ARM_SMMU_CBAR_VMID		GENMASK(7, 0)
 
 #define ARM_SMMU_GR1_CBFRSYNRA(n)	(0x400 + ((n) << 2))
 
 #define ARM_SMMU_GR1_CBA2R(n)		(0x800 + ((n) << 2))
-#define CBA2R_VMID16			GENMASK(31, 16)
-#define CBA2R_VA64			BIT(0)
+#define ARM_SMMU_CBA2R_VMID16		GENMASK(31, 16)
+#define ARM_SMMU_CBA2R_VA64		BIT(0)
 
 #define ARM_SMMU_CB_SCTLR		0x0
-#define SCTLR_S1_ASIDPNE		BIT(12)
-#define SCTLR_CFCFG			BIT(7)
-#define SCTLR_CFIE			BIT(6)
-#define SCTLR_CFRE			BIT(5)
-#define SCTLR_E				BIT(4)
-#define SCTLR_AFE			BIT(2)
-#define SCTLR_TRE			BIT(1)
-#define SCTLR_M				BIT(0)
+#define ARM_SMMU_SCTLR_S1_ASIDPNE	BIT(12)
+#define ARM_SMMU_SCTLR_CFCFG		BIT(7)
+#define ARM_SMMU_SCTLR_CFIE		BIT(6)
+#define ARM_SMMU_SCTLR_CFRE		BIT(5)
+#define ARM_SMMU_SCTLR_E		BIT(4)
+#define ARM_SMMU_SCTLR_AFE		BIT(2)
+#define ARM_SMMU_SCTLR_TRE		BIT(1)
+#define ARM_SMMU_SCTLR_M		BIT(0)
 
 #define ARM_SMMU_CB_ACTLR		0x4
 
 #define ARM_SMMU_CB_RESUME		0x8
-#define RESUME_TERMINATE		BIT(0)
+#define ARM_SMMU_RESUME_TERMINATE	BIT(0)
 
 #define ARM_SMMU_CB_TCR2		0x10
-#define TCR2_SEP			GENMASK(17, 15)
-#define TCR2_SEP_UPSTREAM		0x7
-#define TCR2_AS				BIT(4)
-#define TCR2_PASIZE			GENMASK(3, 0)
+#define ARM_SMMU_TCR2_SEP		GENMASK(17, 15)
+#define ARM_SMMU_TCR2_SEP_UPSTREAM	0x7
+#define ARM_SMMU_TCR2_AS		BIT(4)
+#define ARM_SMMU_TCR2_PASIZE		GENMASK(3, 0)
 
 #define ARM_SMMU_CB_TTBR0		0x20
 #define ARM_SMMU_CB_TTBR1		0x28
-#define TTBRn_ASID			GENMASK_ULL(63, 48)
-
-/* arm64 headers leak this somehow :( */
-#undef TCR_T0SZ
+#define ARM_SMMU_TTBRn_ASID		GENMASK_ULL(63, 48)
 
 #define ARM_SMMU_CB_TCR			0x30
-#define TCR_EAE				BIT(31)
-#define TCR_EPD1			BIT(23)
-#define TCR_TG0				GENMASK(15, 14)
-#define TCR_SH0				GENMASK(13, 12)
-#define TCR_ORGN0			GENMASK(11, 10)
-#define TCR_IRGN0			GENMASK(9, 8)
-#define TCR_T0SZ			GENMASK(5, 0)
+#define ARM_SMMU_TCR_EAE		BIT(31)
+#define ARM_SMMU_TCR_EPD1		BIT(23)
+#define ARM_SMMU_TCR_TG0		GENMASK(15, 14)
+#define ARM_SMMU_TCR_SH0		GENMASK(13, 12)
+#define ARM_SMMU_TCR_ORGN0		GENMASK(11, 10)
+#define ARM_SMMU_TCR_IRGN0		GENMASK(9, 8)
+#define ARM_SMMU_TCR_T0SZ		GENMASK(5, 0)
 
 #define ARM_SMMU_CB_CONTEXTIDR		0x34
 #define ARM_SMMU_CB_S1_MAIR0		0x38
 #define ARM_SMMU_CB_S1_MAIR1		0x3c
 
 #define ARM_SMMU_CB_PAR			0x50
-#define CB_PAR_F			BIT(0)
+#define ARM_SMMU_CB_PAR_F		BIT(0)
 
 #define ARM_SMMU_CB_FSR			0x58
-#define FSR_MULTI			BIT(31)
-#define FSR_SS				BIT(30)
-#define FSR_UUT				BIT(8)
-#define FSR_ASF				BIT(7)
-#define FSR_TLBLKF			BIT(6)
-#define FSR_TLBMCF			BIT(5)
-#define FSR_EF				BIT(4)
-#define FSR_PF				BIT(3)
-#define FSR_AFF				BIT(2)
-#define FSR_TF				BIT(1)
-
-#define FSR_IGN				(FSR_AFF | FSR_ASF | \
-					 FSR_TLBMCF | FSR_TLBLKF)
-#define FSR_FAULT			(FSR_MULTI | FSR_SS | FSR_UUT | \
-					 FSR_EF | FSR_PF | FSR_TF | FSR_IGN)
+#define ARM_SMMU_FSR_MULTI		BIT(31)
+#define ARM_SMMU_FSR_SS			BIT(30)
+#define ARM_SMMU_FSR_UUT		BIT(8)
+#define ARM_SMMU_FSR_ASF		BIT(7)
+#define ARM_SMMU_FSR_TLBLKF		BIT(6)
+#define ARM_SMMU_FSR_TLBMCF		BIT(5)
+#define ARM_SMMU_FSR_EF			BIT(4)
+#define ARM_SMMU_FSR_PF			BIT(3)
+#define ARM_SMMU_FSR_AFF		BIT(2)
+#define ARM_SMMU_FSR_TF			BIT(1)
+
+#define ARM_SMMU_FSR_IGN		(ARM_SMMU_FSR_AFF |		\
+					 ARM_SMMU_FSR_ASF |		\
+					 ARM_SMMU_FSR_TLBMCF |		\
+					 ARM_SMMU_FSR_TLBLKF)
+
+#define ARM_SMMU_FSR_FAULT		(ARM_SMMU_FSR_MULTI |		\
+					 ARM_SMMU_FSR_SS |		\
+					 ARM_SMMU_FSR_UUT |		\
+					 ARM_SMMU_FSR_EF |		\
+					 ARM_SMMU_FSR_PF |		\
+					 ARM_SMMU_FSR_TF |		\
+					 ARM_SMMU_FSR_IGN)
 
 #define ARM_SMMU_CB_FAR			0x60
 
 #define ARM_SMMU_CB_FSYNR0		0x68
-#define FSYNR0_WNR			BIT(4)
+#define ARM_SMMU_FSYNR0_WNR		BIT(4)
 
 #define ARM_SMMU_CB_S1_TLBIVA		0x600
 #define ARM_SMMU_CB_S1_TLBIASID		0x610
@@ -216,7 +221,7 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_ATS1PR		0x800
 
 #define ARM_SMMU_CB_ATSR		0x8f0
-#define ATSR_ACTIVE			BIT(0)
+#define ARM_SMMU_ATSR_ACTIVE		BIT(0)
 
 
 /* Maximum number of context banks per SMMU */
@@ -310,7 +315,7 @@ struct arm_smmu_cfg {
 	enum arm_smmu_cbar_type		cbar;
 	enum arm_smmu_context_fmt	fmt;
 };
-#define INVALID_IRPTNDX			0xff
+#define ARM_SMMU_INVALID_IRPTNDX	0xff
 
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
@@ -333,18 +338,18 @@ struct arm_smmu_domain {
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
 {
-	return TCR_EPD1 |
-	       FIELD_PREP(TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
-	       FIELD_PREP(TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
-	       FIELD_PREP(TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
-	       FIELD_PREP(TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
-	       FIELD_PREP(TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+	return ARM_SMMU_TCR_EPD1 |
+	       FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+	       FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+	       FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+	       FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+	       FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
 }
 
 static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
 {
-	return FIELD_PREP(TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
-	       FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM);
+	return FIELD_PREP(ARM_SMMU_TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
+	       FIELD_PREP(ARM_SMMU_TCR2_SEP, ARM_SMMU_TCR2_SEP_UPSTREAM);
 }
 
 /* Implementation details, yay! */
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 2eeaf2eec946..39759db4f003 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -201,7 +201,7 @@ static irqreturn_t qcom_iommu_fault(int irq, void *dev)
 
 	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
 
-	if (!(fsr & FSR_FAULT))
+	if (!(fsr & ARM_SMMU_FSR_FAULT))
 		return IRQ_NONE;
 
 	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
@@ -215,7 +215,7 @@ static irqreturn_t qcom_iommu_fault(int irq, void *dev)
 	}
 
 	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
-	iommu_writel(ctx, ARM_SMMU_CB_RESUME, RESUME_TERMINATE);
+	iommu_writel(ctx, ARM_SMMU_CB_RESUME, ARM_SMMU_RESUME_TERMINATE);
 
 	return IRQ_HANDLED;
 }
@@ -270,14 +270,14 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 		/* TTBRs */
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
 				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
-				FIELD_PREP(TTBRn_ASID, ctx->asid));
+				FIELD_PREP(ARM_SMMU_TTBRn_ASID, ctx->asid));
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1, 0);
 
 		/* TCR */
 		iommu_writel(ctx, ARM_SMMU_CB_TCR2,
 				arm_smmu_lpae_tcr2(&pgtbl_cfg));
 		iommu_writel(ctx, ARM_SMMU_CB_TCR,
-				arm_smmu_lpae_tcr(&pgtbl_cfg) | TCR_EAE);
+			     arm_smmu_lpae_tcr(&pgtbl_cfg) | ARM_SMMU_TCR_EAE);
 
 		/* MAIRs (stage-1 only) */
 		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
@@ -286,11 +286,13 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 				pgtbl_cfg.arm_lpae_s1_cfg.mair >> 32);
 
 		/* SCTLR */
-		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
-			SCTLR_M | SCTLR_S1_ASIDPNE | SCTLR_CFCFG;
+		reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE |
+		      ARM_SMMU_SCTLR_AFE | ARM_SMMU_SCTLR_TRE |
+		      ARM_SMMU_SCTLR_M | ARM_SMMU_SCTLR_S1_ASIDPNE |
+		      ARM_SMMU_SCTLR_CFCFG;
 
 		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
-			reg |= SCTLR_E;
+			reg |= ARM_SMMU_SCTLR_E;
 
 		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/8] iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

Now that we have arm-smmu.h defining various SMMU constants, ensure that
they are namespaced with the ARM_SMMU_ prefix in order to avoid conflicts
with the CPU, such as the one we're currently bodging around with the
TCR.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-impl.c |   2 +-
 drivers/iommu/arm-smmu.c      | 157 ++++++++++++------------
 drivers/iommu/arm-smmu.h      | 217 +++++++++++++++++-----------------
 drivers/iommu/qcom_iommu.c    |  16 +--
 4 files changed, 204 insertions(+), 188 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index b2fe72a8f019..74d97a886e93 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -119,7 +119,7 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu)
 	 * Secure has also cleared SACR.CACHE_LOCK for this to take effect...
 	 */
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID7);
-	major = FIELD_GET(ID7_MAJOR, reg);
+	major = FIELD_GET(ARM_SMMU_ID7_MAJOR, reg);
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sACR);
 	if (major >= 2)
 		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e65eb60812ec..214be09f6ded 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -271,7 +271,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
 	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
 		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
 			reg = arm_smmu_readl(smmu, page, status);
-			if (!(reg & sTLBGSTATUS_GSACTIVE))
+			if (!(reg & ARM_SMMU_sTLBGSTATUS_GSACTIVE))
 				return;
 			cpu_relax();
 		}
@@ -478,7 +478,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 	int idx = smmu_domain->cfg.cbndx;
 
 	fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
-	if (!(fsr & FSR_FAULT))
+	if (!(fsr & ARM_SMMU_FSR_FAULT))
 		return IRQ_NONE;
 
 	fsynr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSYNR0);
@@ -510,7 +510,7 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 
 	if (__ratelimit(&rs)) {
 		if (IS_ENABLED(CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT) &&
-		    (gfsr & sGFSR_USF))
+		    (gfsr & ARM_SMMU_sGFSR_USF))
 			dev_err(smmu->dev,
 				"Blocked unknown Stream ID 0x%hx; boot with \"arm-smmu.disable_bypass=0\" to allow, but this may have security implications\n",
 				(u16)gfsynr1);
@@ -543,9 +543,9 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 			cb->tcr[0] = arm_smmu_lpae_tcr(pgtbl_cfg);
 			cb->tcr[1] = arm_smmu_lpae_tcr2(pgtbl_cfg);
 			if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
-				cb->tcr[1] |= TCR2_AS;
+				cb->tcr[1] |= ARM_SMMU_TCR2_AS;
 			else
-				cb->tcr[0] |= TCR_EAE;
+				cb->tcr[0] |= ARM_SMMU_TCR_EAE;
 		}
 	} else {
 		cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
@@ -558,8 +558,10 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 			cb->ttbr[1] = 0;
 		} else {
 			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-			cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
-			cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
+			cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+						  cfg->asid);
+			cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+						 cfg->asid);
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -595,31 +597,33 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	/* CBA2R */
 	if (smmu->version > ARM_SMMU_V1) {
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
-			reg = CBA2R_VA64;
+			reg = ARM_SMMU_CBA2R_VA64;
 		else
 			reg = 0;
 		/* 16-bit VMIDs live in CBA2R */
 		if (smmu->features & ARM_SMMU_FEAT_VMID16)
-			reg |= FIELD_PREP(CBA2R_VMID16, cfg->vmid);
+			reg |= FIELD_PREP(ARM_SMMU_CBA2R_VMID16, cfg->vmid);
 
 		arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBA2R(idx), reg);
 	}
 
 	/* CBAR */
-	reg = FIELD_PREP(CBAR_TYPE, cfg->cbar);
+	reg = FIELD_PREP(ARM_SMMU_CBAR_TYPE, cfg->cbar);
 	if (smmu->version < ARM_SMMU_V2)
-		reg |= FIELD_PREP(CBAR_IRPTNDX, cfg->irptndx);
+		reg |= FIELD_PREP(ARM_SMMU_CBAR_IRPTNDX, cfg->irptndx);
 
 	/*
 	 * Use the weakest shareability/memory types, so they are
 	 * overridden by the ttbcr/pte.
 	 */
 	if (stage1) {
-		reg |= FIELD_PREP(CBAR_S1_BPSHCFG, CBAR_S1_BPSHCFG_NSH) |
-			FIELD_PREP(CBAR_S1_MEMATTR, CBAR_S1_MEMATTR_WB);
+		reg |= FIELD_PREP(ARM_SMMU_CBAR_S1_BPSHCFG,
+				  ARM_SMMU_CBAR_S1_BPSHCFG_NSH) |
+		       FIELD_PREP(ARM_SMMU_CBAR_S1_MEMATTR,
+				  ARM_SMMU_CBAR_S1_MEMATTR_WB);
 	} else if (!(smmu->features & ARM_SMMU_FEAT_VMID16)) {
 		/* 8-bit VMIDs live in CBAR */
-		reg |= FIELD_PREP(CBAR_VMID, cfg->vmid);
+		reg |= FIELD_PREP(ARM_SMMU_CBAR_VMID, cfg->vmid);
 	}
 	arm_smmu_gr1_write(smmu, ARM_SMMU_GR1_CBAR(idx), reg);
 
@@ -651,11 +655,12 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	}
 
 	/* SCTLR */
-	reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
+	reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE | ARM_SMMU_SCTLR_AFE |
+	      ARM_SMMU_SCTLR_TRE | ARM_SMMU_SCTLR_M;
 	if (stage1)
-		reg |= SCTLR_S1_ASIDPNE;
+		reg |= ARM_SMMU_SCTLR_S1_ASIDPNE;
 	if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
-		reg |= SCTLR_E;
+		reg |= ARM_SMMU_SCTLR_E;
 
 	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
@@ -837,7 +842,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (ret < 0) {
 		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
 			cfg->irptndx, irq);
-		cfg->irptndx = INVALID_IRPTNDX;
+		cfg->irptndx = ARM_SMMU_INVALID_IRPTNDX;
 	}
 
 	mutex_unlock(&smmu_domain->init_mutex);
@@ -875,7 +880,7 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	smmu->cbs[cfg->cbndx].cfg = NULL;
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
-	if (cfg->irptndx != INVALID_IRPTNDX) {
+	if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
 		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
 		devm_free_irq(smmu->dev, irq, domain);
 	}
@@ -931,23 +936,24 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
 {
 	struct arm_smmu_smr *smr = smmu->smrs + idx;
-	u32 reg = FIELD_PREP(SMR_ID, smr->id) | FIELD_PREP(SMR_MASK, smr->mask);
+	u32 reg = FIELD_PREP(ARM_SMMU_SMR_ID, smr->id) |
+		  FIELD_PREP(ARM_SMMU_SMR_MASK, smr->mask);
 
 	if (!(smmu->features & ARM_SMMU_FEAT_EXIDS) && smr->valid)
-		reg |= SMR_VALID;
+		reg |= ARM_SMMU_SMR_VALID;
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(idx), reg);
 }
 
 static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx)
 {
 	struct arm_smmu_s2cr *s2cr = smmu->s2crs + idx;
-	u32 reg = FIELD_PREP(S2CR_TYPE, s2cr->type) |
-		  FIELD_PREP(S2CR_CBNDX, s2cr->cbndx) |
-		  FIELD_PREP(S2CR_PRIVCFG, s2cr->privcfg);
+	u32 reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, s2cr->type) |
+		  FIELD_PREP(ARM_SMMU_S2CR_CBNDX, s2cr->cbndx) |
+		  FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, s2cr->privcfg);
 
 	if (smmu->features & ARM_SMMU_FEAT_EXIDS && smmu->smrs &&
 	    smmu->smrs[idx].valid)
-		reg |= S2CR_EXIDVALID;
+		reg |= ARM_SMMU_S2CR_EXIDVALID;
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_S2CR(idx), reg);
 }
 
@@ -974,15 +980,15 @@ static void arm_smmu_test_smr_masks(struct arm_smmu_device *smmu)
 	 * bits are set, so check each one separately. We can reject
 	 * masters later if they try to claim IDs outside these masks.
 	 */
-	smr = FIELD_PREP(SMR_ID, smmu->streamid_mask);
+	smr = FIELD_PREP(ARM_SMMU_SMR_ID, smmu->streamid_mask);
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(0), smr);
 	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(0));
-	smmu->streamid_mask = FIELD_GET(SMR_ID, smr);
+	smmu->streamid_mask = FIELD_GET(ARM_SMMU_SMR_ID, smr);
 
-	smr = FIELD_PREP(SMR_MASK, smmu->streamid_mask);
+	smr = FIELD_PREP(ARM_SMMU_SMR_MASK, smmu->streamid_mask);
 	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_SMR(0), smr);
 	smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(0));
-	smmu->smr_mask_mask = FIELD_GET(SMR_MASK, smr);
+	smmu->smr_mask_mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
 }
 
 static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mask)
@@ -1051,8 +1057,8 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
 	mutex_lock(&smmu->stream_map_mutex);
 	/* Figure out a viable stream map entry allocation */
 	for_each_cfg_sme(fwspec, i, idx) {
-		u16 sid = FIELD_GET(SMR_ID, fwspec->ids[i]);
-		u16 mask = FIELD_GET(SMR_MASK, fwspec->ids[i]);
+		u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+		u16 mask = FIELD_GET(ARM_SMMU_SMR_MASK, fwspec->ids[i]);
 
 		if (idx != INVALID_SMENDX) {
 			ret = -EEXIST;
@@ -1296,7 +1302,8 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 		arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_ATS1PR, va);
 
 	reg = arm_smmu_page(smmu, ARM_SMMU_CB(smmu, idx)) + ARM_SMMU_CB_ATSR;
-	if (readl_poll_timeout_atomic(reg, tmp, !(tmp & ATSR_ACTIVE), 5, 50)) {
+	if (readl_poll_timeout_atomic(reg, tmp, !(tmp & ARM_SMMU_ATSR_ACTIVE),
+				      5, 50)) {
 		spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
 		dev_err(dev,
 			"iova to phys timed out on %pad. Falling back to software table walk.\n",
@@ -1306,7 +1313,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 
 	phys = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_PAR);
 	spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
-	if (phys & CB_PAR_F) {
+	if (phys & ARM_SMMU_CB_PAR_F) {
 		dev_err(dev, "translation fault!\n");
 		dev_err(dev, "PAR = 0x%llx\n", phys);
 		return 0;
@@ -1387,8 +1394,8 @@ static int arm_smmu_add_device(struct device *dev)
 
 	ret = -EINVAL;
 	for (i = 0; i < fwspec->num_ids; i++) {
-		u16 sid = FIELD_GET(SMR_ID, fwspec->ids[i]);
-		u16 mask = FIELD_GET(SMR_MASK, fwspec->ids[i]);
+		u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+		u16 mask = FIELD_GET(ARM_SMMU_SMR_MASK, fwspec->ids[i]);
 
 		if (sid & ~smmu->streamid_mask) {
 			dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n",
@@ -1569,12 +1576,12 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	u32 mask, fwid = 0;
 
 	if (args->args_count > 0)
-		fwid |= FIELD_PREP(SMR_ID, args->args[0]);
+		fwid |= FIELD_PREP(ARM_SMMU_SMR_ID, args->args[0]);
 
 	if (args->args_count > 1)
-		fwid |= FIELD_PREP(SMR_MASK, args->args[1]);
+		fwid |= FIELD_PREP(ARM_SMMU_SMR_MASK, args->args[1]);
 	else if (!of_property_read_u32(args->np, "stream-match-mask", &mask))
-		fwid |= FIELD_PREP(SMR_MASK, mask);
+		fwid |= FIELD_PREP(ARM_SMMU_SMR_MASK, mask);
 
 	return iommu_fwspec_add_ids(dev, &fwid, 1);
 }
@@ -1644,7 +1651,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	/* Make sure all context banks are disabled and clear CB_FSR  */
 	for (i = 0; i < smmu->num_context_banks; ++i) {
 		arm_smmu_write_context_bank(smmu, i);
-		arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, FSR_FAULT);
+		arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_FSR_FAULT);
 	}
 
 	/* Invalidate the TLB, just in case */
@@ -1654,29 +1661,30 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
 
 	/* Enable fault reporting */
-	reg |= (sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE);
+	reg |= (ARM_SMMU_sCR0_GFRE | ARM_SMMU_sCR0_GFIE |
+		ARM_SMMU_sCR0_GCFGFRE | ARM_SMMU_sCR0_GCFGFIE);
 
 	/* Disable TLB broadcasting. */
-	reg |= (sCR0_VMIDPNE | sCR0_PTM);
+	reg |= (ARM_SMMU_sCR0_VMIDPNE | ARM_SMMU_sCR0_PTM);
 
 	/* Enable client access, handling unmatched streams as appropriate */
-	reg &= ~sCR0_CLIENTPD;
+	reg &= ~ARM_SMMU_sCR0_CLIENTPD;
 	if (disable_bypass)
-		reg |= sCR0_USFCFG;
+		reg |= ARM_SMMU_sCR0_USFCFG;
 	else
-		reg &= ~sCR0_USFCFG;
+		reg &= ~ARM_SMMU_sCR0_USFCFG;
 
 	/* Disable forced broadcasting */
-	reg &= ~sCR0_FB;
+	reg &= ~ARM_SMMU_sCR0_FB;
 
 	/* Don't upgrade barriers */
-	reg &= ~(sCR0_BSU);
+	reg &= ~(ARM_SMMU_sCR0_BSU);
 
 	if (smmu->features & ARM_SMMU_FEAT_VMID16)
-		reg |= sCR0_VMID16EN;
+		reg |= ARM_SMMU_sCR0_VMID16EN;
 
 	if (smmu->features & ARM_SMMU_FEAT_EXIDS)
-		reg |= sCR0_EXIDENABLE;
+		reg |= ARM_SMMU_sCR0_EXIDENABLE;
 
 	if (smmu->impl && smmu->impl->reset)
 		smmu->impl->reset(smmu);
@@ -1721,21 +1729,21 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 
 	/* Restrict available stages based on module parameter */
 	if (force_stage == 1)
-		id &= ~(ID0_S2TS | ID0_NTS);
+		id &= ~(ARM_SMMU_ID0_S2TS | ARM_SMMU_ID0_NTS);
 	else if (force_stage == 2)
-		id &= ~(ID0_S1TS | ID0_NTS);
+		id &= ~(ARM_SMMU_ID0_S1TS | ARM_SMMU_ID0_NTS);
 
-	if (id & ID0_S1TS) {
+	if (id & ARM_SMMU_ID0_S1TS) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
 		dev_notice(smmu->dev, "\tstage 1 translation\n");
 	}
 
-	if (id & ID0_S2TS) {
+	if (id & ARM_SMMU_ID0_S2TS) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_S2;
 		dev_notice(smmu->dev, "\tstage 2 translation\n");
 	}
 
-	if (id & ID0_NTS) {
+	if (id & ARM_SMMU_ID0_NTS) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_NESTED;
 		dev_notice(smmu->dev, "\tnested translation\n");
 	}
@@ -1746,8 +1754,8 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		return -ENODEV;
 	}
 
-	if ((id & ID0_S1TS) &&
-		((smmu->version < ARM_SMMU_V2) || !(id & ID0_ATOSNS))) {
+	if ((id & ARM_SMMU_ID0_S1TS) &&
+	    ((smmu->version < ARM_SMMU_V2) || !(id & ARM_SMMU_ID0_ATOSNS))) {
 		smmu->features |= ARM_SMMU_FEAT_TRANS_OPS;
 		dev_notice(smmu->dev, "\taddress translation ops\n");
 	}
@@ -1758,7 +1766,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	 * Fortunately, this also opens up a workaround for systems where the
 	 * ID register value has ended up configured incorrectly.
 	 */
-	cttw_reg = !!(id & ID0_CTTW);
+	cttw_reg = !!(id & ARM_SMMU_ID0_CTTW);
 	if (cttw_fw || cttw_reg)
 		dev_notice(smmu->dev, "\t%scoherent table walk\n",
 			   cttw_fw ? "" : "non-");
@@ -1767,16 +1775,16 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 			   "\t(IDR0.CTTW overridden by FW configuration)\n");
 
 	/* Max. number of entries we have for stream matching/indexing */
-	if (smmu->version == ARM_SMMU_V2 && id & ID0_EXIDS) {
+	if (smmu->version == ARM_SMMU_V2 && id & ARM_SMMU_ID0_EXIDS) {
 		smmu->features |= ARM_SMMU_FEAT_EXIDS;
 		size = 1 << 16;
 	} else {
-		size = 1 << FIELD_GET(ID0_NUMSIDB, id);
+		size = 1 << FIELD_GET(ARM_SMMU_ID0_NUMSIDB, id);
 	}
 	smmu->streamid_mask = size - 1;
-	if (id & ID0_SMS) {
+	if (id & ARM_SMMU_ID0_SMS) {
 		smmu->features |= ARM_SMMU_FEAT_STREAM_MATCH;
-		size = FIELD_GET(ID0_NUMSMRG, id);
+		size = FIELD_GET(ARM_SMMU_ID0_NUMSMRG, id);
 		if (size == 0) {
 			dev_err(smmu->dev,
 				"stream-matching supported, but no SMRs present!\n");
@@ -1804,18 +1812,19 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	mutex_init(&smmu->stream_map_mutex);
 	spin_lock_init(&smmu->global_sync_lock);
 
-	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
+	if (smmu->version < ARM_SMMU_V2 ||
+	    !(id & ARM_SMMU_ID0_PTFS_NO_AARCH32)) {
 		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
-		if (!(id & ID0_PTFS_NO_AARCH32S))
+		if (!(id & ARM_SMMU_ID0_PTFS_NO_AARCH32S))
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_S;
 	}
 
 	/* ID1 */
 	id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID1);
-	smmu->pgshift = (id & ID1_PAGESIZE) ? 16 : 12;
+	smmu->pgshift = (id & ARM_SMMU_ID1_PAGESIZE) ? 16 : 12;
 
 	/* Check for size mismatch of SMMU address space from mapped region */
-	size = 1 << (FIELD_GET(ID1_NUMPAGENDXB, id) + 1);
+	size = 1 << (FIELD_GET(ARM_SMMU_ID1_NUMPAGENDXB, id) + 1);
 	if (smmu->numpage != 2 * size << smmu->pgshift)
 		dev_warn(smmu->dev,
 			"SMMU address space size (0x%x) differs from mapped region size (0x%x)!\n",
@@ -1823,8 +1832,8 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	/* Now properly encode NUMPAGE to subsequently derive SMMU_CB_BASE */
 	smmu->numpage = size;
 
-	smmu->num_s2_context_banks = FIELD_GET(ID1_NUMS2CB, id);
-	smmu->num_context_banks = FIELD_GET(ID1_NUMCB, id);
+	smmu->num_s2_context_banks = FIELD_GET(ARM_SMMU_ID1_NUMS2CB, id);
+	smmu->num_context_banks = FIELD_GET(ARM_SMMU_ID1_NUMCB, id);
 	if (smmu->num_s2_context_banks > smmu->num_context_banks) {
 		dev_err(smmu->dev, "impossible number of S2 context banks!\n");
 		return -ENODEV;
@@ -1838,14 +1847,14 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 
 	/* ID2 */
 	id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID2);
-	size = arm_smmu_id_size_to_bits(FIELD_GET(ID2_IAS, id));
+	size = arm_smmu_id_size_to_bits(FIELD_GET(ARM_SMMU_ID2_IAS, id));
 	smmu->ipa_size = size;
 
 	/* The output mask is also applied for bypass */
-	size = arm_smmu_id_size_to_bits(FIELD_GET(ID2_OAS, id));
+	size = arm_smmu_id_size_to_bits(FIELD_GET(ARM_SMMU_ID2_OAS, id));
 	smmu->pa_size = size;
 
-	if (id & ID2_VMID16)
+	if (id & ARM_SMMU_ID2_VMID16)
 		smmu->features |= ARM_SMMU_FEAT_VMID16;
 
 	/*
@@ -1862,13 +1871,13 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		if (smmu->version == ARM_SMMU_V1_64K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
 	} else {
-		size = FIELD_GET(ID2_UBS, id);
+		size = FIELD_GET(ARM_SMMU_ID2_UBS, id);
 		smmu->va_size = arm_smmu_id_size_to_bits(size);
-		if (id & ID2_PTFS_4K)
+		if (id & ARM_SMMU_ID2_PTFS_4K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_4K;
-		if (id & ID2_PTFS_16K)
+		if (id & ARM_SMMU_ID2_PTFS_16K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_16K;
-		if (id & ID2_PTFS_64K)
+		if (id & ARM_SMMU_ID2_PTFS_64K)
 			smmu->features |= ARM_SMMU_FEAT_FMT_AARCH64_64K;
 	}
 
@@ -2245,7 +2254,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 
 	arm_smmu_rpm_get(smmu);
 	/* Turn the thing off */
-	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, sCR0_CLIENTPD);
+	arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, ARM_SMMU_sCR0_CLIENTPD);
 	arm_smmu_rpm_put(smmu);
 
 	if (pm_runtime_enabled(smmu->dev))
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index aade2b0ae175..6501f38a5966 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -24,51 +24,51 @@
 
 /* Configuration registers */
 #define ARM_SMMU_GR0_sCR0		0x0
-#define sCR0_VMID16EN			BIT(31)
-#define sCR0_BSU			GENMASK(15, 14)
-#define sCR0_FB				BIT(13)
-#define sCR0_PTM			BIT(12)
-#define sCR0_VMIDPNE			BIT(11)
-#define sCR0_USFCFG			BIT(10)
-#define sCR0_GCFGFIE			BIT(5)
-#define sCR0_GCFGFRE			BIT(4)
-#define sCR0_EXIDENABLE			BIT(3)
-#define sCR0_GFIE			BIT(2)
-#define sCR0_GFRE			BIT(1)
-#define sCR0_CLIENTPD			BIT(0)
+#define ARM_SMMU_sCR0_VMID16EN		BIT(31)
+#define ARM_SMMU_sCR0_BSU		GENMASK(15, 14)
+#define ARM_SMMU_sCR0_FB		BIT(13)
+#define ARM_SMMU_sCR0_PTM		BIT(12)
+#define ARM_SMMU_sCR0_VMIDPNE		BIT(11)
+#define ARM_SMMU_sCR0_USFCFG		BIT(10)
+#define ARM_SMMU_sCR0_GCFGFIE		BIT(5)
+#define ARM_SMMU_sCR0_GCFGFRE		BIT(4)
+#define ARM_SMMU_sCR0_EXIDENABLE	BIT(3)
+#define ARM_SMMU_sCR0_GFIE		BIT(2)
+#define ARM_SMMU_sCR0_GFRE		BIT(1)
+#define ARM_SMMU_sCR0_CLIENTPD		BIT(0)
 
 /* Auxiliary Configuration register */
 #define ARM_SMMU_GR0_sACR		0x10
 
 /* Identification registers */
 #define ARM_SMMU_GR0_ID0		0x20
-#define ID0_S1TS			BIT(30)
-#define ID0_S2TS			BIT(29)
-#define ID0_NTS				BIT(28)
-#define ID0_SMS				BIT(27)
-#define ID0_ATOSNS			BIT(26)
-#define ID0_PTFS_NO_AARCH32		BIT(25)
-#define ID0_PTFS_NO_AARCH32S		BIT(24)
-#define ID0_NUMIRPT			GENMASK(23, 16)
-#define ID0_CTTW			BIT(14)
-#define ID0_NUMSIDB			GENMASK(12, 9)
-#define ID0_EXIDS			BIT(8)
-#define ID0_NUMSMRG			GENMASK(7, 0)
+#define ARM_SMMU_ID0_S1TS		BIT(30)
+#define ARM_SMMU_ID0_S2TS		BIT(29)
+#define ARM_SMMU_ID0_NTS		BIT(28)
+#define ARM_SMMU_ID0_SMS		BIT(27)
+#define ARM_SMMU_ID0_ATOSNS		BIT(26)
+#define ARM_SMMU_ID0_PTFS_NO_AARCH32	BIT(25)
+#define ARM_SMMU_ID0_PTFS_NO_AARCH32S	BIT(24)
+#define ARM_SMMU_ID0_NUMIRPT		GENMASK(23, 16)
+#define ARM_SMMU_ID0_CTTW		BIT(14)
+#define ARM_SMMU_ID0_NUMSIDB		GENMASK(12, 9)
+#define ARM_SMMU_ID0_EXIDS		BIT(8)
+#define ARM_SMMU_ID0_NUMSMRG		GENMASK(7, 0)
 
 #define ARM_SMMU_GR0_ID1		0x24
-#define ID1_PAGESIZE			BIT(31)
-#define ID1_NUMPAGENDXB			GENMASK(30, 28)
-#define ID1_NUMS2CB			GENMASK(23, 16)
-#define ID1_NUMCB			GENMASK(7, 0)
+#define ARM_SMMU_ID1_PAGESIZE		BIT(31)
+#define ARM_SMMU_ID1_NUMPAGENDXB	GENMASK(30, 28)
+#define ARM_SMMU_ID1_NUMS2CB		GENMASK(23, 16)
+#define ARM_SMMU_ID1_NUMCB		GENMASK(7, 0)
 
 #define ARM_SMMU_GR0_ID2		0x28
-#define ID2_VMID16			BIT(15)
-#define ID2_PTFS_64K			BIT(14)
-#define ID2_PTFS_16K			BIT(13)
-#define ID2_PTFS_4K			BIT(12)
-#define ID2_UBS				GENMASK(11, 8)
-#define ID2_OAS				GENMASK(7, 4)
-#define ID2_IAS				GENMASK(3, 0)
+#define ARM_SMMU_ID2_VMID16		BIT(15)
+#define ARM_SMMU_ID2_PTFS_64K		BIT(14)
+#define ARM_SMMU_ID2_PTFS_16K		BIT(13)
+#define ARM_SMMU_ID2_PTFS_4K		BIT(12)
+#define ARM_SMMU_ID2_UBS		GENMASK(11, 8)
+#define ARM_SMMU_ID2_OAS		GENMASK(7, 4)
+#define ARM_SMMU_ID2_IAS		GENMASK(3, 0)
 
 #define ARM_SMMU_GR0_ID3		0x2c
 #define ARM_SMMU_GR0_ID4		0x30
@@ -76,11 +76,11 @@
 #define ARM_SMMU_GR0_ID6		0x38
 
 #define ARM_SMMU_GR0_ID7		0x3c
-#define ID7_MAJOR			GENMASK(7, 4)
-#define ID7_MINOR			GENMASK(3, 0)
+#define ARM_SMMU_ID7_MAJOR		GENMASK(7, 4)
+#define ARM_SMMU_ID7_MINOR		GENMASK(3, 0)
 
 #define ARM_SMMU_GR0_sGFSR		0x48
-#define sGFSR_USF			BIT(1)
+#define ARM_SMMU_sGFSR_USF		BIT(1)
 
 #define ARM_SMMU_GR0_sGFSYNR0		0x50
 #define ARM_SMMU_GR0_sGFSYNR1		0x54
@@ -93,118 +93,123 @@
 #define ARM_SMMU_GR0_sTLBGSYNC		0x70
 
 #define ARM_SMMU_GR0_sTLBGSTATUS	0x74
-#define sTLBGSTATUS_GSACTIVE		BIT(0)
+#define ARM_SMMU_sTLBGSTATUS_GSACTIVE	BIT(0)
 
 /* Stream mapping registers */
 #define ARM_SMMU_GR0_SMR(n)		(0x800 + ((n) << 2))
-#define SMR_VALID			BIT(31)
-#define SMR_MASK			GENMASK(31, 16)
-#define SMR_ID				GENMASK(15, 0)
+#define ARM_SMMU_SMR_VALID		BIT(31)
+#define ARM_SMMU_SMR_MASK		GENMASK(31, 16)
+#define ARM_SMMU_SMR_ID			GENMASK(15, 0)
 
 #define ARM_SMMU_GR0_S2CR(n)		(0xc00 + ((n) << 2))
-#define S2CR_PRIVCFG			GENMASK(25, 24)
+#define ARM_SMMU_S2CR_PRIVCFG		GENMASK(25, 24)
 enum arm_smmu_s2cr_privcfg {
 	S2CR_PRIVCFG_DEFAULT,
 	S2CR_PRIVCFG_DIPAN,
 	S2CR_PRIVCFG_UNPRIV,
 	S2CR_PRIVCFG_PRIV,
 };
-#define S2CR_TYPE			GENMASK(17, 16)
+#define ARM_SMMU_S2CR_TYPE		GENMASK(17, 16)
 enum arm_smmu_s2cr_type {
 	S2CR_TYPE_TRANS,
 	S2CR_TYPE_BYPASS,
 	S2CR_TYPE_FAULT,
 };
-#define S2CR_EXIDVALID			BIT(10)
-#define S2CR_CBNDX			GENMASK(7, 0)
+#define ARM_SMMU_S2CR_EXIDVALID		BIT(10)
+#define ARM_SMMU_S2CR_CBNDX		GENMASK(7, 0)
 
 /* Context bank attribute registers */
 #define ARM_SMMU_GR1_CBAR(n)		(0x0 + ((n) << 2))
-#define CBAR_IRPTNDX			GENMASK(31, 24)
-#define CBAR_TYPE			GENMASK(17, 16)
+#define ARM_SMMU_CBAR_IRPTNDX		GENMASK(31, 24)
+#define ARM_SMMU_CBAR_TYPE		GENMASK(17, 16)
 enum arm_smmu_cbar_type {
 	CBAR_TYPE_S2_TRANS,
 	CBAR_TYPE_S1_TRANS_S2_BYPASS,
 	CBAR_TYPE_S1_TRANS_S2_FAULT,
 	CBAR_TYPE_S1_TRANS_S2_TRANS,
 };
-#define CBAR_S1_MEMATTR			GENMASK(15, 12)
-#define CBAR_S1_MEMATTR_WB		0xf
-#define CBAR_S1_BPSHCFG			GENMASK(9, 8)
-#define CBAR_S1_BPSHCFG_NSH		3
-#define CBAR_VMID			GENMASK(7, 0)
+#define ARM_SMMU_CBAR_S1_MEMATTR	GENMASK(15, 12)
+#define ARM_SMMU_CBAR_S1_MEMATTR_WB	0xf
+#define ARM_SMMU_CBAR_S1_BPSHCFG	GENMASK(9, 8)
+#define ARM_SMMU_CBAR_S1_BPSHCFG_NSH	3
+#define ARM_SMMU_CBAR_VMID		GENMASK(7, 0)
 
 #define ARM_SMMU_GR1_CBFRSYNRA(n)	(0x400 + ((n) << 2))
 
 #define ARM_SMMU_GR1_CBA2R(n)		(0x800 + ((n) << 2))
-#define CBA2R_VMID16			GENMASK(31, 16)
-#define CBA2R_VA64			BIT(0)
+#define ARM_SMMU_CBA2R_VMID16		GENMASK(31, 16)
+#define ARM_SMMU_CBA2R_VA64		BIT(0)
 
 #define ARM_SMMU_CB_SCTLR		0x0
-#define SCTLR_S1_ASIDPNE		BIT(12)
-#define SCTLR_CFCFG			BIT(7)
-#define SCTLR_CFIE			BIT(6)
-#define SCTLR_CFRE			BIT(5)
-#define SCTLR_E				BIT(4)
-#define SCTLR_AFE			BIT(2)
-#define SCTLR_TRE			BIT(1)
-#define SCTLR_M				BIT(0)
+#define ARM_SMMU_SCTLR_S1_ASIDPNE	BIT(12)
+#define ARM_SMMU_SCTLR_CFCFG		BIT(7)
+#define ARM_SMMU_SCTLR_CFIE		BIT(6)
+#define ARM_SMMU_SCTLR_CFRE		BIT(5)
+#define ARM_SMMU_SCTLR_E		BIT(4)
+#define ARM_SMMU_SCTLR_AFE		BIT(2)
+#define ARM_SMMU_SCTLR_TRE		BIT(1)
+#define ARM_SMMU_SCTLR_M		BIT(0)
 
 #define ARM_SMMU_CB_ACTLR		0x4
 
 #define ARM_SMMU_CB_RESUME		0x8
-#define RESUME_TERMINATE		BIT(0)
+#define ARM_SMMU_RESUME_TERMINATE	BIT(0)
 
 #define ARM_SMMU_CB_TCR2		0x10
-#define TCR2_SEP			GENMASK(17, 15)
-#define TCR2_SEP_UPSTREAM		0x7
-#define TCR2_AS				BIT(4)
-#define TCR2_PASIZE			GENMASK(3, 0)
+#define ARM_SMMU_TCR2_SEP		GENMASK(17, 15)
+#define ARM_SMMU_TCR2_SEP_UPSTREAM	0x7
+#define ARM_SMMU_TCR2_AS		BIT(4)
+#define ARM_SMMU_TCR2_PASIZE		GENMASK(3, 0)
 
 #define ARM_SMMU_CB_TTBR0		0x20
 #define ARM_SMMU_CB_TTBR1		0x28
-#define TTBRn_ASID			GENMASK_ULL(63, 48)
-
-/* arm64 headers leak this somehow :( */
-#undef TCR_T0SZ
+#define ARM_SMMU_TTBRn_ASID		GENMASK_ULL(63, 48)
 
 #define ARM_SMMU_CB_TCR			0x30
-#define TCR_EAE				BIT(31)
-#define TCR_EPD1			BIT(23)
-#define TCR_TG0				GENMASK(15, 14)
-#define TCR_SH0				GENMASK(13, 12)
-#define TCR_ORGN0			GENMASK(11, 10)
-#define TCR_IRGN0			GENMASK(9, 8)
-#define TCR_T0SZ			GENMASK(5, 0)
+#define ARM_SMMU_TCR_EAE		BIT(31)
+#define ARM_SMMU_TCR_EPD1		BIT(23)
+#define ARM_SMMU_TCR_TG0		GENMASK(15, 14)
+#define ARM_SMMU_TCR_SH0		GENMASK(13, 12)
+#define ARM_SMMU_TCR_ORGN0		GENMASK(11, 10)
+#define ARM_SMMU_TCR_IRGN0		GENMASK(9, 8)
+#define ARM_SMMU_TCR_T0SZ		GENMASK(5, 0)
 
 #define ARM_SMMU_CB_CONTEXTIDR		0x34
 #define ARM_SMMU_CB_S1_MAIR0		0x38
 #define ARM_SMMU_CB_S1_MAIR1		0x3c
 
 #define ARM_SMMU_CB_PAR			0x50
-#define CB_PAR_F			BIT(0)
+#define ARM_SMMU_CB_PAR_F		BIT(0)
 
 #define ARM_SMMU_CB_FSR			0x58
-#define FSR_MULTI			BIT(31)
-#define FSR_SS				BIT(30)
-#define FSR_UUT				BIT(8)
-#define FSR_ASF				BIT(7)
-#define FSR_TLBLKF			BIT(6)
-#define FSR_TLBMCF			BIT(5)
-#define FSR_EF				BIT(4)
-#define FSR_PF				BIT(3)
-#define FSR_AFF				BIT(2)
-#define FSR_TF				BIT(1)
-
-#define FSR_IGN				(FSR_AFF | FSR_ASF | \
-					 FSR_TLBMCF | FSR_TLBLKF)
-#define FSR_FAULT			(FSR_MULTI | FSR_SS | FSR_UUT | \
-					 FSR_EF | FSR_PF | FSR_TF | FSR_IGN)
+#define ARM_SMMU_FSR_MULTI		BIT(31)
+#define ARM_SMMU_FSR_SS			BIT(30)
+#define ARM_SMMU_FSR_UUT		BIT(8)
+#define ARM_SMMU_FSR_ASF		BIT(7)
+#define ARM_SMMU_FSR_TLBLKF		BIT(6)
+#define ARM_SMMU_FSR_TLBMCF		BIT(5)
+#define ARM_SMMU_FSR_EF			BIT(4)
+#define ARM_SMMU_FSR_PF			BIT(3)
+#define ARM_SMMU_FSR_AFF		BIT(2)
+#define ARM_SMMU_FSR_TF			BIT(1)
+
+#define ARM_SMMU_FSR_IGN		(ARM_SMMU_FSR_AFF |		\
+					 ARM_SMMU_FSR_ASF |		\
+					 ARM_SMMU_FSR_TLBMCF |		\
+					 ARM_SMMU_FSR_TLBLKF)
+
+#define ARM_SMMU_FSR_FAULT		(ARM_SMMU_FSR_MULTI |		\
+					 ARM_SMMU_FSR_SS |		\
+					 ARM_SMMU_FSR_UUT |		\
+					 ARM_SMMU_FSR_EF |		\
+					 ARM_SMMU_FSR_PF |		\
+					 ARM_SMMU_FSR_TF |		\
+					 ARM_SMMU_FSR_IGN)
 
 #define ARM_SMMU_CB_FAR			0x60
 
 #define ARM_SMMU_CB_FSYNR0		0x68
-#define FSYNR0_WNR			BIT(4)
+#define ARM_SMMU_FSYNR0_WNR		BIT(4)
 
 #define ARM_SMMU_CB_S1_TLBIVA		0x600
 #define ARM_SMMU_CB_S1_TLBIASID		0x610
@@ -216,7 +221,7 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_ATS1PR		0x800
 
 #define ARM_SMMU_CB_ATSR		0x8f0
-#define ATSR_ACTIVE			BIT(0)
+#define ARM_SMMU_ATSR_ACTIVE		BIT(0)
 
 
 /* Maximum number of context banks per SMMU */
@@ -310,7 +315,7 @@ struct arm_smmu_cfg {
 	enum arm_smmu_cbar_type		cbar;
 	enum arm_smmu_context_fmt	fmt;
 };
-#define INVALID_IRPTNDX			0xff
+#define ARM_SMMU_INVALID_IRPTNDX	0xff
 
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
@@ -333,18 +338,18 @@ struct arm_smmu_domain {
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
 {
-	return TCR_EPD1 |
-	       FIELD_PREP(TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
-	       FIELD_PREP(TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
-	       FIELD_PREP(TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
-	       FIELD_PREP(TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
-	       FIELD_PREP(TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+	return ARM_SMMU_TCR_EPD1 |
+	       FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+	       FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+	       FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+	       FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+	       FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
 }
 
 static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
 {
-	return FIELD_PREP(TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
-	       FIELD_PREP(TCR2_SEP, TCR2_SEP_UPSTREAM);
+	return FIELD_PREP(ARM_SMMU_TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
+	       FIELD_PREP(ARM_SMMU_TCR2_SEP, ARM_SMMU_TCR2_SEP_UPSTREAM);
 }
 
 /* Implementation details, yay! */
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 2eeaf2eec946..39759db4f003 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -201,7 +201,7 @@ static irqreturn_t qcom_iommu_fault(int irq, void *dev)
 
 	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
 
-	if (!(fsr & FSR_FAULT))
+	if (!(fsr & ARM_SMMU_FSR_FAULT))
 		return IRQ_NONE;
 
 	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
@@ -215,7 +215,7 @@ static irqreturn_t qcom_iommu_fault(int irq, void *dev)
 	}
 
 	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
-	iommu_writel(ctx, ARM_SMMU_CB_RESUME, RESUME_TERMINATE);
+	iommu_writel(ctx, ARM_SMMU_CB_RESUME, ARM_SMMU_RESUME_TERMINATE);
 
 	return IRQ_HANDLED;
 }
@@ -270,14 +270,14 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 		/* TTBRs */
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
 				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
-				FIELD_PREP(TTBRn_ASID, ctx->asid));
+				FIELD_PREP(ARM_SMMU_TTBRn_ASID, ctx->asid));
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1, 0);
 
 		/* TCR */
 		iommu_writel(ctx, ARM_SMMU_CB_TCR2,
 				arm_smmu_lpae_tcr2(&pgtbl_cfg));
 		iommu_writel(ctx, ARM_SMMU_CB_TCR,
-				arm_smmu_lpae_tcr(&pgtbl_cfg) | TCR_EAE);
+			     arm_smmu_lpae_tcr(&pgtbl_cfg) | ARM_SMMU_TCR_EAE);
 
 		/* MAIRs (stage-1 only) */
 		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
@@ -286,11 +286,13 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 				pgtbl_cfg.arm_lpae_s1_cfg.mair >> 32);
 
 		/* SCTLR */
-		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
-			SCTLR_M | SCTLR_S1_ASIDPNE | SCTLR_CFCFG;
+		reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE |
+		      ARM_SMMU_SCTLR_AFE | ARM_SMMU_SCTLR_TRE |
+		      ARM_SMMU_SCTLR_M | ARM_SMMU_SCTLR_S1_ASIDPNE |
+		      ARM_SMMU_SCTLR_CFCFG;
 
 		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
-			reg |= SCTLR_E;
+			reg |= ARM_SMMU_SCTLR_E;
 
 		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
 
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 7/8] iommu/io-pgtable-arm: Rationalise VTCR handling
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling")
reworked the way in which the TCR register value is returned from the
io-pgtable code when targetting the Arm long-descriptor format, in
preparation for allowing page-tables to target TTBR1.

As it turns out, the new interface is a lot nicer to use, so do the same
conversion for the VTCR register even though there is only a single base
register for stage-2 translation.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-v3.c    | 17 +++++++++-
 drivers/iommu/arm-smmu.c       |  2 +-
 drivers/iommu/arm-smmu.h       | 21 +++++++++++++
 drivers/iommu/io-pgtable-arm.c | 57 +++++++++++++---------------------
 include/linux/io-pgtable.h     | 10 +++++-
 5 files changed, 68 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d127974afdb7..4443e1890077 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -250,6 +250,13 @@
 
 #define STRTAB_STE_2_S2VMID		GENMASK_ULL(15, 0)
 #define STRTAB_STE_2_VTCR		GENMASK_ULL(50, 32)
+#define STRTAB_STE_2_VTCR_S2T0SZ	GENMASK_ULL(5, 0)
+#define STRTAB_STE_2_VTCR_S2SL0		GENMASK_ULL(7, 6)
+#define STRTAB_STE_2_VTCR_S2IR0		GENMASK_ULL(9, 8)
+#define STRTAB_STE_2_VTCR_S2OR0		GENMASK_ULL(11, 10)
+#define STRTAB_STE_2_VTCR_S2SH0		GENMASK_ULL(13, 12)
+#define STRTAB_STE_2_VTCR_S2TG		GENMASK_ULL(15, 14)
+#define STRTAB_STE_2_VTCR_S2PS		GENMASK_ULL(18, 16)
 #define STRTAB_STE_2_S2AA64		(1UL << 51)
 #define STRTAB_STE_2_S2ENDI		(1UL << 52)
 #define STRTAB_STE_2_S2PTW		(1UL << 54)
@@ -2159,14 +2166,22 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	int vmid;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
+	typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr;
 
 	vmid = arm_smmu_bitmap_alloc(smmu->vmid_map, smmu->vmid_bits);
 	if (vmid < 0)
 		return vmid;
 
+	vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
 	cfg->vmid	= (u16)vmid;
 	cfg->vttbr	= pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
-	cfg->vtcr	= pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+	cfg->vtcr	= FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
 	return 0;
 }
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 214be09f6ded..f067783ebd59 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -548,7 +548,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 				cb->tcr[0] |= ARM_SMMU_TCR_EAE;
 		}
 	} else {
-		cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+		cb->tcr[0] = arm_smmu_lpae_vtcr(pgtbl_cfg);
 	}
 
 	/* TTBRs */
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 6501f38a5966..8d1cd54d82a6 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -174,6 +174,15 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_TCR_IRGN0		GENMASK(9, 8)
 #define ARM_SMMU_TCR_T0SZ		GENMASK(5, 0)
 
+#define ARM_SMMU_VTCR_RES1		BIT(31)
+#define ARM_SMMU_VTCR_PS		GENMASK(18, 16)
+#define ARM_SMMU_VTCR_TG0		ARM_SMMU_TCR_TG0
+#define ARM_SMMU_VTCR_SH0		ARM_SMMU_TCR_SH0
+#define ARM_SMMU_VTCR_ORGN0		ARM_SMMU_TCR_ORGN0
+#define ARM_SMMU_VTCR_IRGN0		ARM_SMMU_TCR_IRGN0
+#define ARM_SMMU_VTCR_SL0		GENMASK(7, 6)
+#define ARM_SMMU_VTCR_T0SZ		ARM_SMMU_TCR_T0SZ
+
 #define ARM_SMMU_CB_CONTEXTIDR		0x34
 #define ARM_SMMU_CB_S1_MAIR0		0x38
 #define ARM_SMMU_CB_S1_MAIR1		0x3c
@@ -352,6 +361,18 @@ static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
 	       FIELD_PREP(ARM_SMMU_TCR2_SEP, ARM_SMMU_TCR2_SEP_UPSTREAM);
 }
 
+static inline u32 arm_smmu_lpae_vtcr(struct io_pgtable_cfg *cfg)
+{
+	return ARM_SMMU_VTCR_RES1 |
+	       FIELD_PREP(ARM_SMMU_VTCR_PS, cfg->arm_lpae_s2_cfg.vtcr.ps) |
+	       FIELD_PREP(ARM_SMMU_VTCR_TG0, cfg->arm_lpae_s2_cfg.vtcr.tg) |
+	       FIELD_PREP(ARM_SMMU_VTCR_SH0, cfg->arm_lpae_s2_cfg.vtcr.sh) |
+	       FIELD_PREP(ARM_SMMU_VTCR_ORGN0, cfg->arm_lpae_s2_cfg.vtcr.orgn) |
+	       FIELD_PREP(ARM_SMMU_VTCR_IRGN0, cfg->arm_lpae_s2_cfg.vtcr.irgn) |
+	       FIELD_PREP(ARM_SMMU_VTCR_SL0, cfg->arm_lpae_s2_cfg.vtcr.sl) |
+	       FIELD_PREP(ARM_SMMU_VTCR_T0SZ, cfg->arm_lpae_s2_cfg.vtcr.tsz);
+}
+
 /* Implementation details, yay! */
 struct arm_smmu_impl {
 	u32 (*read_reg)(struct arm_smmu_device *smmu, int page, int offset);
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 78ec0ee5c59c..846963c87e0f 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -100,26 +100,19 @@
 #define ARM_LPAE_PTE_MEMATTR_DEV	(((arm_lpae_iopte)0x1) << 2)
 
 /* Register bits */
-#define ARM_64_LPAE_VTCR_RES1		(1U << 31)
-
-#define ARM_LPAE_VTCR_TG0_SHIFT		14
 #define ARM_LPAE_TCR_TG0_4K		0
 #define ARM_LPAE_TCR_TG0_64K		1
 #define ARM_LPAE_TCR_TG0_16K		2
 
-#define ARM_LPAE_TCR_SH0_SHIFT		12
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
 
-#define ARM_LPAE_TCR_ORGN0_SHIFT	10
-#define ARM_LPAE_TCR_IRGN0_SHIFT	8
 #define ARM_LPAE_TCR_RGN_NC		0
 #define ARM_LPAE_TCR_RGN_WBWA		1
 #define ARM_LPAE_TCR_RGN_WT		2
 #define ARM_LPAE_TCR_RGN_WB		3
 
-#define ARM_LPAE_VTCR_SL0_SHIFT		6
 #define ARM_LPAE_VTCR_SL0_MASK		0x3
 
 #define ARM_LPAE_TCR_T0SZ_SHIFT		0
@@ -877,8 +870,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 static struct io_pgtable *
 arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 {
-	u64 reg, sl;
+	u64 sl;
 	struct arm_lpae_io_pgtable *data;
+	typeof(&cfg->arm_lpae_s2_cfg.vtcr) vtcr = &cfg->arm_lpae_s2_cfg.vtcr;
 
 	/* The NS quirk doesn't apply at stage 2 */
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NON_STRICT))
@@ -903,61 +897,59 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_VTCR_RES1;
 	if (cfg->coherent_walk) {
-		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+		vtcr->sh = ARM_LPAE_TCR_SH_IS;
+		vtcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
+		vtcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
 	} else {
-		reg |= (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+		vtcr->sh = ARM_LPAE_TCR_SH_OS;
+		vtcr->irgn = ARM_LPAE_TCR_RGN_NC;
+		vtcr->orgn = ARM_LPAE_TCR_RGN_NC;
 	}
 
 	sl = data->start_level;
 
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		reg |= (ARM_LPAE_TCR_TG0_4K << ARM_LPAE_VTCR_TG0_SHIFT);
+		vtcr->tg = ARM_LPAE_TCR_TG0_4K;
 		sl++; /* SL0 format is different for 4K granule size */
 		break;
 	case SZ_16K:
-		reg |= (ARM_LPAE_TCR_TG0_16K << ARM_LPAE_VTCR_TG0_SHIFT);
+		vtcr->tg = ARM_LPAE_TCR_TG0_16K;
 		break;
 	case SZ_64K:
-		reg |= (ARM_LPAE_TCR_TG0_64K << ARM_LPAE_VTCR_TG0_SHIFT);
+		vtcr->tg = ARM_LPAE_TCR_TG0_64K;
 		break;
 	}
 
 	switch (cfg->oas) {
 	case 32:
-		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_32_BIT;
 		break;
 	case 36:
-		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_36_BIT;
 		break;
 	case 40:
-		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_40_BIT;
 		break;
 	case 42:
-		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_42_BIT;
 		break;
 	case 44:
-		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_44_BIT;
 		break;
 	case 48:
-		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_48_BIT;
 		break;
 	case 52:
-		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_52_BIT;
 		break;
 	default:
 		goto out_free_data;
 	}
 
-	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
-	reg |= (~sl & ARM_LPAE_VTCR_SL0_MASK) << ARM_LPAE_VTCR_SL0_SHIFT;
-	cfg->arm_lpae_s2_cfg.vtcr = reg;
+	vtcr->tsz = 64ULL - cfg->ias;
+	vtcr->sl = ~sl & ARM_LPAE_VTCR_SL0_MASK;
 
 	/* Allocate pgd pages */
 	data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data),
@@ -984,24 +976,17 @@ arm_32_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 		return NULL;
 
 	cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
-
 	return arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
 }
 
 static struct io_pgtable *
 arm_32_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 {
-	struct io_pgtable *iop;
-
 	if (cfg->ias > 40 || cfg->oas > 40)
 		return NULL;
 
 	cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
-	iop = arm_64_lpae_alloc_pgtable_s2(cfg, cookie);
-	if (iop)
-		cfg->arm_lpae_s2_cfg.vtcr &= 0xffffffff;
-
-	return iop;
+	return arm_64_lpae_alloc_pgtable_s2(cfg, cookie);
 }
 
 static struct io_pgtable *
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 6ae104cedfd7..40c1b7745fb6 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -114,7 +114,15 @@ struct io_pgtable_cfg {
 
 		struct {
 			u64	vttbr;
-			u64	vtcr;
+			struct {
+				u32	ps:3;
+				u32	tg:2;
+				u32	sh:2;
+				u32	orgn:2;
+				u32	irgn:2;
+				u32	sl:2;
+				u32	tsz:6;
+			}	vtcr;
 		} arm_lpae_s2_cfg;
 
 		struct {
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 7/8] iommu/io-pgtable-arm: Rationalise VTCR handling
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling")
reworked the way in which the TCR register value is returned from the
io-pgtable code when targetting the Arm long-descriptor format, in
preparation for allowing page-tables to target TTBR1.

As it turns out, the new interface is a lot nicer to use, so do the same
conversion for the VTCR register even though there is only a single base
register for stage-2 translation.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/arm-smmu-v3.c    | 17 +++++++++-
 drivers/iommu/arm-smmu.c       |  2 +-
 drivers/iommu/arm-smmu.h       | 21 +++++++++++++
 drivers/iommu/io-pgtable-arm.c | 57 +++++++++++++---------------------
 include/linux/io-pgtable.h     | 10 +++++-
 5 files changed, 68 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d127974afdb7..4443e1890077 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -250,6 +250,13 @@
 
 #define STRTAB_STE_2_S2VMID		GENMASK_ULL(15, 0)
 #define STRTAB_STE_2_VTCR		GENMASK_ULL(50, 32)
+#define STRTAB_STE_2_VTCR_S2T0SZ	GENMASK_ULL(5, 0)
+#define STRTAB_STE_2_VTCR_S2SL0		GENMASK_ULL(7, 6)
+#define STRTAB_STE_2_VTCR_S2IR0		GENMASK_ULL(9, 8)
+#define STRTAB_STE_2_VTCR_S2OR0		GENMASK_ULL(11, 10)
+#define STRTAB_STE_2_VTCR_S2SH0		GENMASK_ULL(13, 12)
+#define STRTAB_STE_2_VTCR_S2TG		GENMASK_ULL(15, 14)
+#define STRTAB_STE_2_VTCR_S2PS		GENMASK_ULL(18, 16)
 #define STRTAB_STE_2_S2AA64		(1UL << 51)
 #define STRTAB_STE_2_S2ENDI		(1UL << 52)
 #define STRTAB_STE_2_S2PTW		(1UL << 54)
@@ -2159,14 +2166,22 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	int vmid;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
+	typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr;
 
 	vmid = arm_smmu_bitmap_alloc(smmu->vmid_map, smmu->vmid_bits);
 	if (vmid < 0)
 		return vmid;
 
+	vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
 	cfg->vmid	= (u16)vmid;
 	cfg->vttbr	= pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
-	cfg->vtcr	= pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+	cfg->vtcr	= FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
+			  FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
 	return 0;
 }
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 214be09f6ded..f067783ebd59 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -548,7 +548,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 				cb->tcr[0] |= ARM_SMMU_TCR_EAE;
 		}
 	} else {
-		cb->tcr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+		cb->tcr[0] = arm_smmu_lpae_vtcr(pgtbl_cfg);
 	}
 
 	/* TTBRs */
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 6501f38a5966..8d1cd54d82a6 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -174,6 +174,15 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_TCR_IRGN0		GENMASK(9, 8)
 #define ARM_SMMU_TCR_T0SZ		GENMASK(5, 0)
 
+#define ARM_SMMU_VTCR_RES1		BIT(31)
+#define ARM_SMMU_VTCR_PS		GENMASK(18, 16)
+#define ARM_SMMU_VTCR_TG0		ARM_SMMU_TCR_TG0
+#define ARM_SMMU_VTCR_SH0		ARM_SMMU_TCR_SH0
+#define ARM_SMMU_VTCR_ORGN0		ARM_SMMU_TCR_ORGN0
+#define ARM_SMMU_VTCR_IRGN0		ARM_SMMU_TCR_IRGN0
+#define ARM_SMMU_VTCR_SL0		GENMASK(7, 6)
+#define ARM_SMMU_VTCR_T0SZ		ARM_SMMU_TCR_T0SZ
+
 #define ARM_SMMU_CB_CONTEXTIDR		0x34
 #define ARM_SMMU_CB_S1_MAIR0		0x38
 #define ARM_SMMU_CB_S1_MAIR1		0x3c
@@ -352,6 +361,18 @@ static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
 	       FIELD_PREP(ARM_SMMU_TCR2_SEP, ARM_SMMU_TCR2_SEP_UPSTREAM);
 }
 
+static inline u32 arm_smmu_lpae_vtcr(struct io_pgtable_cfg *cfg)
+{
+	return ARM_SMMU_VTCR_RES1 |
+	       FIELD_PREP(ARM_SMMU_VTCR_PS, cfg->arm_lpae_s2_cfg.vtcr.ps) |
+	       FIELD_PREP(ARM_SMMU_VTCR_TG0, cfg->arm_lpae_s2_cfg.vtcr.tg) |
+	       FIELD_PREP(ARM_SMMU_VTCR_SH0, cfg->arm_lpae_s2_cfg.vtcr.sh) |
+	       FIELD_PREP(ARM_SMMU_VTCR_ORGN0, cfg->arm_lpae_s2_cfg.vtcr.orgn) |
+	       FIELD_PREP(ARM_SMMU_VTCR_IRGN0, cfg->arm_lpae_s2_cfg.vtcr.irgn) |
+	       FIELD_PREP(ARM_SMMU_VTCR_SL0, cfg->arm_lpae_s2_cfg.vtcr.sl) |
+	       FIELD_PREP(ARM_SMMU_VTCR_T0SZ, cfg->arm_lpae_s2_cfg.vtcr.tsz);
+}
+
 /* Implementation details, yay! */
 struct arm_smmu_impl {
 	u32 (*read_reg)(struct arm_smmu_device *smmu, int page, int offset);
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 78ec0ee5c59c..846963c87e0f 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -100,26 +100,19 @@
 #define ARM_LPAE_PTE_MEMATTR_DEV	(((arm_lpae_iopte)0x1) << 2)
 
 /* Register bits */
-#define ARM_64_LPAE_VTCR_RES1		(1U << 31)
-
-#define ARM_LPAE_VTCR_TG0_SHIFT		14
 #define ARM_LPAE_TCR_TG0_4K		0
 #define ARM_LPAE_TCR_TG0_64K		1
 #define ARM_LPAE_TCR_TG0_16K		2
 
-#define ARM_LPAE_TCR_SH0_SHIFT		12
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
 
-#define ARM_LPAE_TCR_ORGN0_SHIFT	10
-#define ARM_LPAE_TCR_IRGN0_SHIFT	8
 #define ARM_LPAE_TCR_RGN_NC		0
 #define ARM_LPAE_TCR_RGN_WBWA		1
 #define ARM_LPAE_TCR_RGN_WT		2
 #define ARM_LPAE_TCR_RGN_WB		3
 
-#define ARM_LPAE_VTCR_SL0_SHIFT		6
 #define ARM_LPAE_VTCR_SL0_MASK		0x3
 
 #define ARM_LPAE_TCR_T0SZ_SHIFT		0
@@ -877,8 +870,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 static struct io_pgtable *
 arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 {
-	u64 reg, sl;
+	u64 sl;
 	struct arm_lpae_io_pgtable *data;
+	typeof(&cfg->arm_lpae_s2_cfg.vtcr) vtcr = &cfg->arm_lpae_s2_cfg.vtcr;
 
 	/* The NS quirk doesn't apply at stage 2 */
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NON_STRICT))
@@ -903,61 +897,59 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 	}
 
 	/* VTCR */
-	reg = ARM_64_LPAE_VTCR_RES1;
 	if (cfg->coherent_walk) {
-		reg |= (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
+		vtcr->sh = ARM_LPAE_TCR_SH_IS;
+		vtcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
+		vtcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
 	} else {
-		reg |= (ARM_LPAE_TCR_SH_OS << ARM_LPAE_TCR_SH0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
-		       (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
+		vtcr->sh = ARM_LPAE_TCR_SH_OS;
+		vtcr->irgn = ARM_LPAE_TCR_RGN_NC;
+		vtcr->orgn = ARM_LPAE_TCR_RGN_NC;
 	}
 
 	sl = data->start_level;
 
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		reg |= (ARM_LPAE_TCR_TG0_4K << ARM_LPAE_VTCR_TG0_SHIFT);
+		vtcr->tg = ARM_LPAE_TCR_TG0_4K;
 		sl++; /* SL0 format is different for 4K granule size */
 		break;
 	case SZ_16K:
-		reg |= (ARM_LPAE_TCR_TG0_16K << ARM_LPAE_VTCR_TG0_SHIFT);
+		vtcr->tg = ARM_LPAE_TCR_TG0_16K;
 		break;
 	case SZ_64K:
-		reg |= (ARM_LPAE_TCR_TG0_64K << ARM_LPAE_VTCR_TG0_SHIFT);
+		vtcr->tg = ARM_LPAE_TCR_TG0_64K;
 		break;
 	}
 
 	switch (cfg->oas) {
 	case 32:
-		reg |= (ARM_LPAE_TCR_PS_32_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_32_BIT;
 		break;
 	case 36:
-		reg |= (ARM_LPAE_TCR_PS_36_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_36_BIT;
 		break;
 	case 40:
-		reg |= (ARM_LPAE_TCR_PS_40_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_40_BIT;
 		break;
 	case 42:
-		reg |= (ARM_LPAE_TCR_PS_42_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_42_BIT;
 		break;
 	case 44:
-		reg |= (ARM_LPAE_TCR_PS_44_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_44_BIT;
 		break;
 	case 48:
-		reg |= (ARM_LPAE_TCR_PS_48_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_48_BIT;
 		break;
 	case 52:
-		reg |= (ARM_LPAE_TCR_PS_52_BIT << ARM_LPAE_VTCR_PS_SHIFT);
+		vtcr->ps = ARM_LPAE_TCR_PS_52_BIT;
 		break;
 	default:
 		goto out_free_data;
 	}
 
-	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
-	reg |= (~sl & ARM_LPAE_VTCR_SL0_MASK) << ARM_LPAE_VTCR_SL0_SHIFT;
-	cfg->arm_lpae_s2_cfg.vtcr = reg;
+	vtcr->tsz = 64ULL - cfg->ias;
+	vtcr->sl = ~sl & ARM_LPAE_VTCR_SL0_MASK;
 
 	/* Allocate pgd pages */
 	data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data),
@@ -984,24 +976,17 @@ arm_32_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 		return NULL;
 
 	cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
-
 	return arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
 }
 
 static struct io_pgtable *
 arm_32_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
 {
-	struct io_pgtable *iop;
-
 	if (cfg->ias > 40 || cfg->oas > 40)
 		return NULL;
 
 	cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
-	iop = arm_64_lpae_alloc_pgtable_s2(cfg, cookie);
-	if (iop)
-		cfg->arm_lpae_s2_cfg.vtcr &= 0xffffffff;
-
-	return iop;
+	return arm_64_lpae_alloc_pgtable_s2(cfg, cookie);
 }
 
 static struct io_pgtable *
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 6ae104cedfd7..40c1b7745fb6 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -114,7 +114,15 @@ struct io_pgtable_cfg {
 
 		struct {
 			u64	vttbr;
-			u64	vtcr;
+			struct {
+				u32	ps:3;
+				u32	tg:2;
+				u32	sh:2;
+				u32	orgn:2;
+				u32	irgn:2;
+				u32	sl:2;
+				u32	tsz:6;
+			}	vtcr;
 		} arm_lpae_s2_cfg;
 
 		struct {
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
  2020-01-10 15:28 ` Will Deacon
@ 2020-01-10 15:28   ` Will Deacon
  -1 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu; +Cc: Will Deacon, kernel-team, Robin Murphy, linux-arm-kernel

From: Robin Murphy <robin.murphy@arm.com>

Now that we can correctly extract top-level indices without relying on
the remaining upper bits being zero, the only remaining impediments to
using a given table for TTBR1 are the address validation on map/unmap
and the awkward TCR translation granule format. Add a quirk so that we
can do the right thing at those points.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 25 +++++++++++++++++++------
 include/linux/io-pgtable.h     |  4 ++++
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 846963c87e0f..5e966ef0bacf 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -104,6 +104,10 @@
 #define ARM_LPAE_TCR_TG0_64K		1
 #define ARM_LPAE_TCR_TG0_16K		2
 
+#define ARM_LPAE_TCR_TG1_16K		1
+#define ARM_LPAE_TCR_TG1_4K		2
+#define ARM_LPAE_TCR_TG1_64K		3
+
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
@@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	arm_lpae_iopte *ptep = data->pgd;
 	int ret, lvl = data->start_level;
 	arm_lpae_iopte prot;
+	long iaext = (long)iova >> cfg->ias;
 
 	/* If no access, then nothing to do */
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
@@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
 		return -EINVAL;
 
-	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext || paddr >> cfg->oas))
 		return -ERANGE;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
@@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	arm_lpae_iopte *ptep = data->pgd;
+	long iaext = (long)iova >> cfg->ias;
 
 	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
 		return 0;
 
-	if (WARN_ON(iova >> data->iop.cfg.ias))
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext))
 		return 0;
 
 	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);
@@ -777,9 +787,11 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	u64 reg;
 	struct arm_lpae_io_pgtable *data;
 	typeof(&cfg->arm_lpae_s1_cfg.tcr) tcr = &cfg->arm_lpae_s1_cfg.tcr;
+	bool tg1;
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
-			    IO_PGTABLE_QUIRK_NON_STRICT))
+			    IO_PGTABLE_QUIRK_NON_STRICT |
+			    IO_PGTABLE_QUIRK_ARM_TTBR1))
 		return NULL;
 
 	data = arm_lpae_alloc_pgtable(cfg);
@@ -797,15 +809,16 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 		tcr->orgn = ARM_LPAE_TCR_RGN_NC;
 	}
 
+	tg1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1;
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		tcr->tg = ARM_LPAE_TCR_TG0_4K;
+		tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_4K : ARM_LPAE_TCR_TG0_4K;
 		break;
 	case SZ_16K:
-		tcr->tg = ARM_LPAE_TCR_TG0_16K;
+		tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_16K : ARM_LPAE_TCR_TG0_16K;
 		break;
 	case SZ_64K:
-		tcr->tg = ARM_LPAE_TCR_TG0_64K;
+		tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_64K : ARM_LPAE_TCR_TG0_64K;
 		break;
 	}
 
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 40c1b7745fb6..53d53c6c2be9 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -83,12 +83,16 @@ struct io_pgtable_cfg {
 	 * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
 	 *	on unmap, for DMA domains using the flush queue mechanism for
 	 *	delayed invalidation.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
+	 *	for use in the upper half of a split address space.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
 	#define IO_PGTABLE_QUIRK_TLBI_ON_MAP	BIT(2)
 	#define IO_PGTABLE_QUIRK_ARM_MTK_EXT	BIT(3)
 	#define IO_PGTABLE_QUIRK_NON_STRICT	BIT(4)
+	#define IO_PGTABLE_QUIRK_ARM_TTBR1	BIT(5)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
2.25.0.rc1.283.g88dfdc4193-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
@ 2020-01-10 15:28   ` Will Deacon
  0 siblings, 0 replies; 22+ messages in thread
From: Will Deacon @ 2020-01-10 15:28 UTC (permalink / raw)
  To: iommu
  Cc: Will Deacon, Jordan Crouse, kernel-team, Robin Murphy, linux-arm-kernel

From: Robin Murphy <robin.murphy@arm.com>

Now that we can correctly extract top-level indices without relying on
the remaining upper bits being zero, the only remaining impediments to
using a given table for TTBR1 are the address validation on map/unmap
and the awkward TCR translation granule format. Add a quirk so that we
can do the right thing at those points.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/iommu/io-pgtable-arm.c | 25 +++++++++++++++++++------
 include/linux/io-pgtable.h     |  4 ++++
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 846963c87e0f..5e966ef0bacf 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -104,6 +104,10 @@
 #define ARM_LPAE_TCR_TG0_64K		1
 #define ARM_LPAE_TCR_TG0_16K		2
 
+#define ARM_LPAE_TCR_TG1_16K		1
+#define ARM_LPAE_TCR_TG1_4K		2
+#define ARM_LPAE_TCR_TG1_64K		3
+
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
@@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	arm_lpae_iopte *ptep = data->pgd;
 	int ret, lvl = data->start_level;
 	arm_lpae_iopte prot;
+	long iaext = (long)iova >> cfg->ias;
 
 	/* If no access, then nothing to do */
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
@@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
 		return -EINVAL;
 
-	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext || paddr >> cfg->oas))
 		return -ERANGE;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
@@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	arm_lpae_iopte *ptep = data->pgd;
+	long iaext = (long)iova >> cfg->ias;
 
 	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
 		return 0;
 
-	if (WARN_ON(iova >> data->iop.cfg.ias))
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+		iaext = ~iaext;
+	if (WARN_ON(iaext))
 		return 0;
 
 	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);
@@ -777,9 +787,11 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	u64 reg;
 	struct arm_lpae_io_pgtable *data;
 	typeof(&cfg->arm_lpae_s1_cfg.tcr) tcr = &cfg->arm_lpae_s1_cfg.tcr;
+	bool tg1;
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
-			    IO_PGTABLE_QUIRK_NON_STRICT))
+			    IO_PGTABLE_QUIRK_NON_STRICT |
+			    IO_PGTABLE_QUIRK_ARM_TTBR1))
 		return NULL;
 
 	data = arm_lpae_alloc_pgtable(cfg);
@@ -797,15 +809,16 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 		tcr->orgn = ARM_LPAE_TCR_RGN_NC;
 	}
 
+	tg1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1;
 	switch (ARM_LPAE_GRANULE(data)) {
 	case SZ_4K:
-		tcr->tg = ARM_LPAE_TCR_TG0_4K;
+		tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_4K : ARM_LPAE_TCR_TG0_4K;
 		break;
 	case SZ_16K:
-		tcr->tg = ARM_LPAE_TCR_TG0_16K;
+		tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_16K : ARM_LPAE_TCR_TG0_16K;
 		break;
 	case SZ_64K:
-		tcr->tg = ARM_LPAE_TCR_TG0_64K;
+		tcr->tg = tg1 ? ARM_LPAE_TCR_TG1_64K : ARM_LPAE_TCR_TG0_64K;
 		break;
 	}
 
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 40c1b7745fb6..53d53c6c2be9 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -83,12 +83,16 @@ struct io_pgtable_cfg {
 	 * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
 	 *	on unmap, for DMA domains using the flush queue mechanism for
 	 *	delayed invalidation.
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
+	 *	for use in the upper half of a split address space.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
 	#define IO_PGTABLE_QUIRK_TLBI_ON_MAP	BIT(2)
 	#define IO_PGTABLE_QUIRK_ARM_MTK_EXT	BIT(3)
 	#define IO_PGTABLE_QUIRK_NON_STRICT	BIT(4)
+	#define IO_PGTABLE_QUIRK_ARM_TTBR1	BIT(5)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
2.25.0.rc1.283.g88dfdc4193-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
  2020-01-10 15:28   ` Will Deacon
@ 2020-02-19 12:27     ` Stephan Gerhold
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephan Gerhold @ 2020-02-19 12:27 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy; +Cc: iommu, kernel-team, linux-arm-kernel

Hi Will, Hi Robin,

On Fri, Jan 10, 2020 at 03:28:52PM +0000, Will Deacon wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> Now that we can correctly extract top-level indices without relying on
> the remaining upper bits being zero, the only remaining impediments to
> using a given table for TTBR1 are the address validation on map/unmap
> and the awkward TCR translation granule format. Add a quirk so that we
> can do the right thing at those points.
> 
> Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  drivers/iommu/io-pgtable-arm.c | 25 +++++++++++++++++++------
>  include/linux/io-pgtable.h     |  4 ++++
>  2 files changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 846963c87e0f..5e966ef0bacf 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -104,6 +104,10 @@
>  #define ARM_LPAE_TCR_TG0_64K		1
>  #define ARM_LPAE_TCR_TG0_16K		2
>  
> +#define ARM_LPAE_TCR_TG1_16K		1
> +#define ARM_LPAE_TCR_TG1_4K		2
> +#define ARM_LPAE_TCR_TG1_64K		3
> +
>  #define ARM_LPAE_TCR_SH_NS		0
>  #define ARM_LPAE_TCR_SH_OS		2
>  #define ARM_LPAE_TCR_SH_IS		3
> @@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>  	arm_lpae_iopte *ptep = data->pgd;
>  	int ret, lvl = data->start_level;
>  	arm_lpae_iopte prot;
> +	long iaext = (long)iova >> cfg->ias;
>  
>  	/* If no access, then nothing to do */
>  	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
> @@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>  	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>  		return -EINVAL;
>  
> -	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
> +		iaext = ~iaext;
> +	if (WARN_ON(iaext || paddr >> cfg->oas))
>  		return -ERANGE;
>  
>  	prot = arm_lpae_prot_to_pte(data, iommu_prot);
> @@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>  	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>  	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>  	arm_lpae_iopte *ptep = data->pgd;
> +	long iaext = (long)iova >> cfg->ias;
>  
>  	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>  		return 0;
>  
> -	if (WARN_ON(iova >> data->iop.cfg.ias))
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
> +		iaext = ~iaext;
> +	if (WARN_ON(iaext))
>  		return 0;
>  
>  	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);

This part of the patch seems to break io-pgtable-arm.c on ARM32 to some
extent. I have a device using qcom_iommu that could normally run ARM64,
but I'm unable to because of outdated (signed) firmware. :(
So it's running a lot of code that would normally run only on ARM64.

It used to work quite well but now qcom-venus fails to probe on:
    if (WARN_ON(iaext || paddr >> cfg->oas))
(I added some debug prints for clarity.)

    qcom-venus 1d00000.video-codec: Adding to iommu group 0
    arm-lpae io-pgtable: quirks: 0x0
    arm-lpae io-pgtable: arm_lpae_map: iova: 0xdd800000, paddr: 0xebe3f000, iaext: 0xffffffff, paddr >> oas: 0x0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 954 at drivers/iommu/io-pgtable-arm.c:487 arm_lpae_map+0xe4/0x510
    Hardware name: Generic DT based system
    ...
    [<c04bafb0>] (arm_lpae_map) from [<c04bcd6c>] (qcom_iommu_map+0x50/0x78)
    [<c04bcd6c>] (qcom_iommu_map) from [<c04b7290>] (__iommu_map+0xe8/0x1cc)
    [<c04b7290>] (__iommu_map) from [<c04b7bbc>] (__iommu_map_sg+0xa8/0x118)
    [<c04b7bbc>] (__iommu_map_sg) from [<c04b7c64>] (iommu_map_sg_atomic+0x18/0x20)
    [<c04b7c64>] (iommu_map_sg_atomic) from [<c04b94f8>] (iommu_dma_alloc+0x4dc/0x5a0)
    [<c04b94f8>] (iommu_dma_alloc) from [<c0196a50>] (dma_alloc_attrs+0x104/0x114)
    [<c0196a50>] (dma_alloc_attrs) from [<bf302c60>] (venus_hfi_create+0xa4/0x260 [venus_core])
    [<bf302c60>] (venus_hfi_create [venus_core]) from [<bf2fe6cc>] (venus_probe+0x1e4/0x328 [venus_core])
    [<bf2fe6cc>] (venus_probe [venus_core]) from [<c04c8eb4>] (platform_drv_probe+0x48/0x98)
    ...
    Exception stack(0xc2587fa8 to 0xc2587ff0)
    7fa0:                   b6f3dc70 b5051010 b5051010 0017388c b6e645b0 b6e645b0
    7fc0: b6f3dc70 b5051010 00020000 00000080 b6e30790 be84ea54 0046ac91 00000000
    7fe0: b6e75f1c be84e940 b6e5fb51 b6eec8a4
    ---[ end trace 2a0ed284f6d82f16 ]---
    qcom-venus: probe of 1d00000.video-codec failed with error -12

Note that iaext = 0xffffffff != 0.
It seems to be some int/long size problem that only happens with larger
iova addresses on 32-bit systems.

Without the (long) cast for iova everything is working correctly again:

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 983b08477e64..f7ecc763c706 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -468,7 +468,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	arm_lpae_iopte *ptep = data->pgd;
 	int ret, lvl = data->start_level;
 	arm_lpae_iopte prot;
-	long iaext = (long)iova >> cfg->ias;
+	long iaext = iova >> cfg->ias;
 
 	/* If no access, then nothing to do */
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
@@ -645,7 +645,7 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	arm_lpae_iopte *ptep = data->pgd;
-	long iaext = (long)iova >> cfg->ias;
+	long iaext = iova >> cfg->ias;
 
 	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
 		return 0;

But I'm not sure if this is really the correct "fix"...

This problem can be also easily reproduced by enabling
IOMMU_IO_PGTABLE_LPAE_SELFTEST on ARM32.
(Shouldn't there be some system that runs these automatically? ;))

Without this patch all of them are passing:

    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
    arm-lpae io-pgtable: selftest: completed with 18 PASS 0 FAIL

But with this patch 6 of them are failing (with a similar warning as
I posted above):

    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:482 arm_lpae_map+0xa4/0x4e4
    Hardware name: Generic DT based system
    [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
    [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
    [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
    [<c012d76c>] (__warn) from [<c012d7ec>] (warn_slowpath_fmt+0x64/0xbc)
    [<c012d7ec>] (warn_slowpath_fmt) from [<c04baf70>] (arm_lpae_map+0xa4/0x4e4)
    [<c04baf70>] (arm_lpae_map) from [<c0a1de3c>] (arm_lpae_do_selftests+0x234/0x688)
    [<c0a1de3c>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
    [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
    [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
    [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
    Exception stack(0xee867fb0 to 0xee867ff8)
    7fa0:                                     00000000 00000000 00000000 00000000
    7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    ---[ end trace 89e831c50a111e7c ]---
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:1182 arm_lpae_do_selftests+0x4b8/0x688
    selftest: test failed for fmt idx 0
    Hardware name: Generic DT based system
    [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
    [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
    [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
    [<c012d76c>] (__warn) from [<c012d820>] (warn_slowpath_fmt+0x98/0xbc)
    [<c012d820>] (warn_slowpath_fmt) from [<c0a1e0c0>] (arm_lpae_do_selftests+0x4b8/0x688)
    [<c0a1e0c0>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
    [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
    [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
    [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
    Exception stack(0xee867fb0 to 0xee867ff8)
    7fa0:                                     00000000 00000000 00000000 00000000
    7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    ---[ end trace 89e831c50a111e7d ]---
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 32-bit
    arm-lpae io-pgtable: data: 3 levels, 0x20 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 36-bit
    arm-lpae io-pgtable: data: 3 levels, 0x200 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 40-bit
    arm-lpae io-pgtable: data: 4 levels, 0x10 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 42-bit
    arm-lpae io-pgtable: data: 4 levels, 0x40 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 44-bit
    arm-lpae io-pgtable: data: 4 levels, 0x100 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 48-bit
    arm-lpae io-pgtable: data: 4 levels, 0x1000 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
    arm-lpae io-pgtable: selftest: completed with 12 PASS 6 FAIL

Any suggestions how to fix this properly?

Thanks,
Stephan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
@ 2020-02-19 12:27     ` Stephan Gerhold
  0 siblings, 0 replies; 22+ messages in thread
From: Stephan Gerhold @ 2020-02-19 12:27 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy; +Cc: iommu, kernel-team, linux-arm-kernel

Hi Will, Hi Robin,

On Fri, Jan 10, 2020 at 03:28:52PM +0000, Will Deacon wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> Now that we can correctly extract top-level indices without relying on
> the remaining upper bits being zero, the only remaining impediments to
> using a given table for TTBR1 are the address validation on map/unmap
> and the awkward TCR translation granule format. Add a quirk so that we
> can do the right thing at those points.
> 
> Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  drivers/iommu/io-pgtable-arm.c | 25 +++++++++++++++++++------
>  include/linux/io-pgtable.h     |  4 ++++
>  2 files changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 846963c87e0f..5e966ef0bacf 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -104,6 +104,10 @@
>  #define ARM_LPAE_TCR_TG0_64K		1
>  #define ARM_LPAE_TCR_TG0_16K		2
>  
> +#define ARM_LPAE_TCR_TG1_16K		1
> +#define ARM_LPAE_TCR_TG1_4K		2
> +#define ARM_LPAE_TCR_TG1_64K		3
> +
>  #define ARM_LPAE_TCR_SH_NS		0
>  #define ARM_LPAE_TCR_SH_OS		2
>  #define ARM_LPAE_TCR_SH_IS		3
> @@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>  	arm_lpae_iopte *ptep = data->pgd;
>  	int ret, lvl = data->start_level;
>  	arm_lpae_iopte prot;
> +	long iaext = (long)iova >> cfg->ias;
>  
>  	/* If no access, then nothing to do */
>  	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
> @@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>  	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>  		return -EINVAL;
>  
> -	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
> +		iaext = ~iaext;
> +	if (WARN_ON(iaext || paddr >> cfg->oas))
>  		return -ERANGE;
>  
>  	prot = arm_lpae_prot_to_pte(data, iommu_prot);
> @@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>  	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>  	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>  	arm_lpae_iopte *ptep = data->pgd;
> +	long iaext = (long)iova >> cfg->ias;
>  
>  	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>  		return 0;
>  
> -	if (WARN_ON(iova >> data->iop.cfg.ias))
> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
> +		iaext = ~iaext;
> +	if (WARN_ON(iaext))
>  		return 0;
>  
>  	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);

This part of the patch seems to break io-pgtable-arm.c on ARM32 to some
extent. I have a device using qcom_iommu that could normally run ARM64,
but I'm unable to because of outdated (signed) firmware. :(
So it's running a lot of code that would normally run only on ARM64.

It used to work quite well but now qcom-venus fails to probe on:
    if (WARN_ON(iaext || paddr >> cfg->oas))
(I added some debug prints for clarity.)

    qcom-venus 1d00000.video-codec: Adding to iommu group 0
    arm-lpae io-pgtable: quirks: 0x0
    arm-lpae io-pgtable: arm_lpae_map: iova: 0xdd800000, paddr: 0xebe3f000, iaext: 0xffffffff, paddr >> oas: 0x0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 954 at drivers/iommu/io-pgtable-arm.c:487 arm_lpae_map+0xe4/0x510
    Hardware name: Generic DT based system
    ...
    [<c04bafb0>] (arm_lpae_map) from [<c04bcd6c>] (qcom_iommu_map+0x50/0x78)
    [<c04bcd6c>] (qcom_iommu_map) from [<c04b7290>] (__iommu_map+0xe8/0x1cc)
    [<c04b7290>] (__iommu_map) from [<c04b7bbc>] (__iommu_map_sg+0xa8/0x118)
    [<c04b7bbc>] (__iommu_map_sg) from [<c04b7c64>] (iommu_map_sg_atomic+0x18/0x20)
    [<c04b7c64>] (iommu_map_sg_atomic) from [<c04b94f8>] (iommu_dma_alloc+0x4dc/0x5a0)
    [<c04b94f8>] (iommu_dma_alloc) from [<c0196a50>] (dma_alloc_attrs+0x104/0x114)
    [<c0196a50>] (dma_alloc_attrs) from [<bf302c60>] (venus_hfi_create+0xa4/0x260 [venus_core])
    [<bf302c60>] (venus_hfi_create [venus_core]) from [<bf2fe6cc>] (venus_probe+0x1e4/0x328 [venus_core])
    [<bf2fe6cc>] (venus_probe [venus_core]) from [<c04c8eb4>] (platform_drv_probe+0x48/0x98)
    ...
    Exception stack(0xc2587fa8 to 0xc2587ff0)
    7fa0:                   b6f3dc70 b5051010 b5051010 0017388c b6e645b0 b6e645b0
    7fc0: b6f3dc70 b5051010 00020000 00000080 b6e30790 be84ea54 0046ac91 00000000
    7fe0: b6e75f1c be84e940 b6e5fb51 b6eec8a4
    ---[ end trace 2a0ed284f6d82f16 ]---
    qcom-venus: probe of 1d00000.video-codec failed with error -12

Note that iaext = 0xffffffff != 0.
It seems to be some int/long size problem that only happens with larger
iova addresses on 32-bit systems.

Without the (long) cast for iova everything is working correctly again:

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 983b08477e64..f7ecc763c706 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -468,7 +468,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	arm_lpae_iopte *ptep = data->pgd;
 	int ret, lvl = data->start_level;
 	arm_lpae_iopte prot;
-	long iaext = (long)iova >> cfg->ias;
+	long iaext = iova >> cfg->ias;
 
 	/* If no access, then nothing to do */
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
@@ -645,7 +645,7 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
 	struct io_pgtable_cfg *cfg = &data->iop.cfg;
 	arm_lpae_iopte *ptep = data->pgd;
-	long iaext = (long)iova >> cfg->ias;
+	long iaext = iova >> cfg->ias;
 
 	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
 		return 0;

But I'm not sure if this is really the correct "fix"...

This problem can be also easily reproduced by enabling
IOMMU_IO_PGTABLE_LPAE_SELFTEST on ARM32.
(Shouldn't there be some system that runs these automatically? ;))

Without this patch all of them are passing:

    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
    arm-lpae io-pgtable: selftest: completed with 18 PASS 0 FAIL

But with this patch 6 of them are failing (with a similar warning as
I posted above):

    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:482 arm_lpae_map+0xa4/0x4e4
    Hardware name: Generic DT based system
    [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
    [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
    [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
    [<c012d76c>] (__warn) from [<c012d7ec>] (warn_slowpath_fmt+0x64/0xbc)
    [<c012d7ec>] (warn_slowpath_fmt) from [<c04baf70>] (arm_lpae_map+0xa4/0x4e4)
    [<c04baf70>] (arm_lpae_map) from [<c0a1de3c>] (arm_lpae_do_selftests+0x234/0x688)
    [<c0a1de3c>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
    [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
    [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
    [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
    Exception stack(0xee867fb0 to 0xee867ff8)
    7fa0:                                     00000000 00000000 00000000 00000000
    7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    ---[ end trace 89e831c50a111e7c ]---
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:1182 arm_lpae_do_selftests+0x4b8/0x688
    selftest: test failed for fmt idx 0
    Hardware name: Generic DT based system
    [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
    [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
    [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
    [<c012d76c>] (__warn) from [<c012d820>] (warn_slowpath_fmt+0x98/0xbc)
    [<c012d820>] (warn_slowpath_fmt) from [<c0a1e0c0>] (arm_lpae_do_selftests+0x4b8/0x688)
    [<c0a1e0c0>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
    [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
    [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
    [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
    Exception stack(0xee867fb0 to 0xee867ff8)
    7fa0:                                     00000000 00000000 00000000 00000000
    7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
    ---[ end trace 89e831c50a111e7d ]---
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 32-bit
    arm-lpae io-pgtable: data: 3 levels, 0x20 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 36-bit
    arm-lpae io-pgtable: data: 3 levels, 0x200 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 40-bit
    arm-lpae io-pgtable: data: 4 levels, 0x10 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 42-bit
    arm-lpae io-pgtable: data: 4 levels, 0x40 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 44-bit
    arm-lpae io-pgtable: data: 4 levels, 0x100 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
    [warning as above]
    arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 48-bit
    arm-lpae io-pgtable: data: 4 levels, 0x1000 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
    arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
    arm-lpae io-pgtable: selftest: completed with 12 PASS 6 FAIL

Any suggestions how to fix this properly?

Thanks,
Stephan

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
  2020-02-19 12:27     ` Stephan Gerhold
@ 2020-02-19 17:51       ` Robin Murphy
  -1 siblings, 0 replies; 22+ messages in thread
From: Robin Murphy @ 2020-02-19 17:51 UTC (permalink / raw)
  To: Stephan Gerhold, Will Deacon; +Cc: iommu, kernel-team, linux-arm-kernel

On 19/02/2020 12:27 pm, Stephan Gerhold wrote:
> Hi Will, Hi Robin,
> 
> On Fri, Jan 10, 2020 at 03:28:52PM +0000, Will Deacon wrote:
>> From: Robin Murphy <robin.murphy@arm.com>
>>
>> Now that we can correctly extract top-level indices without relying on
>> the remaining upper bits being zero, the only remaining impediments to
>> using a given table for TTBR1 are the address validation on map/unmap
>> and the awkward TCR translation granule format. Add a quirk so that we
>> can do the right thing at those points.
>>
>> Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> Signed-off-by: Will Deacon <will@kernel.org>
>> ---
>>   drivers/iommu/io-pgtable-arm.c | 25 +++++++++++++++++++------
>>   include/linux/io-pgtable.h     |  4 ++++
>>   2 files changed, 23 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 846963c87e0f..5e966ef0bacf 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -104,6 +104,10 @@
>>   #define ARM_LPAE_TCR_TG0_64K		1
>>   #define ARM_LPAE_TCR_TG0_16K		2
>>   
>> +#define ARM_LPAE_TCR_TG1_16K		1
>> +#define ARM_LPAE_TCR_TG1_4K		2
>> +#define ARM_LPAE_TCR_TG1_64K		3
>> +
>>   #define ARM_LPAE_TCR_SH_NS		0
>>   #define ARM_LPAE_TCR_SH_OS		2
>>   #define ARM_LPAE_TCR_SH_IS		3
>> @@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>>   	arm_lpae_iopte *ptep = data->pgd;
>>   	int ret, lvl = data->start_level;
>>   	arm_lpae_iopte prot;
>> +	long iaext = (long)iova >> cfg->ias;
>>   
>>   	/* If no access, then nothing to do */
>>   	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
>> @@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>>   	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>>   		return -EINVAL;
>>   
>> -	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
>> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
>> +		iaext = ~iaext;
>> +	if (WARN_ON(iaext || paddr >> cfg->oas))
>>   		return -ERANGE;
>>   
>>   	prot = arm_lpae_prot_to_pte(data, iommu_prot);
>> @@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>>   	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>>   	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>>   	arm_lpae_iopte *ptep = data->pgd;
>> +	long iaext = (long)iova >> cfg->ias;
>>   
>>   	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>>   		return 0;
>>   
>> -	if (WARN_ON(iova >> data->iop.cfg.ias))
>> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
>> +		iaext = ~iaext;
>> +	if (WARN_ON(iaext))
>>   		return 0;
>>   
>>   	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);
> 
> This part of the patch seems to break io-pgtable-arm.c on ARM32 to some
> extent. I have a device using qcom_iommu that could normally run ARM64,
> but I'm unable to because of outdated (signed) firmware. :(
> So it's running a lot of code that would normally run only on ARM64.
> 
> It used to work quite well but now qcom-venus fails to probe on:
>      if (WARN_ON(iaext || paddr >> cfg->oas))
> (I added some debug prints for clarity.)
> 
>      qcom-venus 1d00000.video-codec: Adding to iommu group 0
>      arm-lpae io-pgtable: quirks: 0x0
>      arm-lpae io-pgtable: arm_lpae_map: iova: 0xdd800000, paddr: 0xebe3f000, iaext: 0xffffffff, paddr >> oas: 0x0
>      ------------[ cut here ]------------
>      WARNING: CPU: 0 PID: 954 at drivers/iommu/io-pgtable-arm.c:487 arm_lpae_map+0xe4/0x510
>      Hardware name: Generic DT based system
>      ...
>      [<c04bafb0>] (arm_lpae_map) from [<c04bcd6c>] (qcom_iommu_map+0x50/0x78)
>      [<c04bcd6c>] (qcom_iommu_map) from [<c04b7290>] (__iommu_map+0xe8/0x1cc)
>      [<c04b7290>] (__iommu_map) from [<c04b7bbc>] (__iommu_map_sg+0xa8/0x118)
>      [<c04b7bbc>] (__iommu_map_sg) from [<c04b7c64>] (iommu_map_sg_atomic+0x18/0x20)
>      [<c04b7c64>] (iommu_map_sg_atomic) from [<c04b94f8>] (iommu_dma_alloc+0x4dc/0x5a0)
>      [<c04b94f8>] (iommu_dma_alloc) from [<c0196a50>] (dma_alloc_attrs+0x104/0x114)
>      [<c0196a50>] (dma_alloc_attrs) from [<bf302c60>] (venus_hfi_create+0xa4/0x260 [venus_core])
>      [<bf302c60>] (venus_hfi_create [venus_core]) from [<bf2fe6cc>] (venus_probe+0x1e4/0x328 [venus_core])
>      [<bf2fe6cc>] (venus_probe [venus_core]) from [<c04c8eb4>] (platform_drv_probe+0x48/0x98)
>      ...
>      Exception stack(0xc2587fa8 to 0xc2587ff0)
>      7fa0:                   b6f3dc70 b5051010 b5051010 0017388c b6e645b0 b6e645b0
>      7fc0: b6f3dc70 b5051010 00020000 00000080 b6e30790 be84ea54 0046ac91 00000000
>      7fe0: b6e75f1c be84e940 b6e5fb51 b6eec8a4
>      ---[ end trace 2a0ed284f6d82f16 ]---
>      qcom-venus: probe of 1d00000.video-codec failed with error -12
> 
> Note that iaext = 0xffffffff != 0.
> It seems to be some int/long size problem that only happens with larger
> iova addresses on 32-bit systems.
> 
> Without the (long) cast for iova everything is working correctly again:
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 983b08477e64..f7ecc763c706 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -468,7 +468,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>   	arm_lpae_iopte *ptep = data->pgd;
>   	int ret, lvl = data->start_level;
>   	arm_lpae_iopte prot;
> -	long iaext = (long)iova >> cfg->ias;
> +	long iaext = iova >> cfg->ias;
>   
>   	/* If no access, then nothing to do */
>   	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
> @@ -645,7 +645,7 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>   	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>   	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>   	arm_lpae_iopte *ptep = data->pgd;
> -	long iaext = (long)iova >> cfg->ias;
> +	long iaext = iova >> cfg->ias;
>   
>   	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>   		return 0;
> 
> But I'm not sure if this is really the correct "fix"...

No, that would end up breaking the TTBR1 case. The quirk isn't intended 
to be supported on 32-bit platforms, but somehow it's slipped through 
that this sign-extension still affects them (and goes wrong to boot), 
sorry about that.

I think quick fixes would be either to replace the "(long)" cast with 
"(s64)(u64)", or to just elide the iaext check altogether when 32-bit 
longs make it impossible to pass an out-of-range IOVA anyway. I'll have 
a think and see what works out nicest.

> This problem can be also easily reproduced by enabling
> IOMMU_IO_PGTABLE_LPAE_SELFTEST on ARM32.
> (Shouldn't there be some system that runs these automatically? ;))
> 
> Without this patch all of them are passing:
> 
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
>      arm-lpae io-pgtable: selftest: completed with 18 PASS 0 FAIL

Hmm, we probably shouldn't be running IAS > 32 tests on 32-bit, given 
that such configs are effectively impossible to use.

Robin.

> 
> But with this patch 6 of them are failing (with a similar warning as
> I posted above):
> 
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
>      ------------[ cut here ]------------
>      WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:482 arm_lpae_map+0xa4/0x4e4
>      Hardware name: Generic DT based system
>      [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
>      [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
>      [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
>      [<c012d76c>] (__warn) from [<c012d7ec>] (warn_slowpath_fmt+0x64/0xbc)
>      [<c012d7ec>] (warn_slowpath_fmt) from [<c04baf70>] (arm_lpae_map+0xa4/0x4e4)
>      [<c04baf70>] (arm_lpae_map) from [<c0a1de3c>] (arm_lpae_do_selftests+0x234/0x688)
>      [<c0a1de3c>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
>      [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
>      [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
>      [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>      Exception stack(0xee867fb0 to 0xee867ff8)
>      7fa0:                                     00000000 00000000 00000000 00000000
>      7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>      7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
>      ---[ end trace 89e831c50a111e7c ]---
>      ------------[ cut here ]------------
>      WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:1182 arm_lpae_do_selftests+0x4b8/0x688
>      selftest: test failed for fmt idx 0
>      Hardware name: Generic DT based system
>      [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
>      [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
>      [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
>      [<c012d76c>] (__warn) from [<c012d820>] (warn_slowpath_fmt+0x98/0xbc)
>      [<c012d820>] (warn_slowpath_fmt) from [<c0a1e0c0>] (arm_lpae_do_selftests+0x4b8/0x688)
>      [<c0a1e0c0>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
>      [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
>      [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
>      [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>      Exception stack(0xee867fb0 to 0xee867ff8)
>      7fa0:                                     00000000 00000000 00000000 00000000
>      7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>      7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
>      ---[ end trace 89e831c50a111e7d ]---
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 32-bit
>      arm-lpae io-pgtable: data: 3 levels, 0x20 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 36-bit
>      arm-lpae io-pgtable: data: 3 levels, 0x200 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 40-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x10 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 42-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x40 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 44-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x100 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 48-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x1000 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
>      arm-lpae io-pgtable: selftest: completed with 12 PASS 6 FAIL
> 
> Any suggestions how to fix this properly?
> 
> Thanks,
> Stephan
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage
@ 2020-02-19 17:51       ` Robin Murphy
  0 siblings, 0 replies; 22+ messages in thread
From: Robin Murphy @ 2020-02-19 17:51 UTC (permalink / raw)
  To: Stephan Gerhold, Will Deacon; +Cc: iommu, kernel-team, linux-arm-kernel

On 19/02/2020 12:27 pm, Stephan Gerhold wrote:
> Hi Will, Hi Robin,
> 
> On Fri, Jan 10, 2020 at 03:28:52PM +0000, Will Deacon wrote:
>> From: Robin Murphy <robin.murphy@arm.com>
>>
>> Now that we can correctly extract top-level indices without relying on
>> the remaining upper bits being zero, the only remaining impediments to
>> using a given table for TTBR1 are the address validation on map/unmap
>> and the awkward TCR translation granule format. Add a quirk so that we
>> can do the right thing at those points.
>>
>> Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> Signed-off-by: Will Deacon <will@kernel.org>
>> ---
>>   drivers/iommu/io-pgtable-arm.c | 25 +++++++++++++++++++------
>>   include/linux/io-pgtable.h     |  4 ++++
>>   2 files changed, 23 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 846963c87e0f..5e966ef0bacf 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -104,6 +104,10 @@
>>   #define ARM_LPAE_TCR_TG0_64K		1
>>   #define ARM_LPAE_TCR_TG0_16K		2
>>   
>> +#define ARM_LPAE_TCR_TG1_16K		1
>> +#define ARM_LPAE_TCR_TG1_4K		2
>> +#define ARM_LPAE_TCR_TG1_64K		3
>> +
>>   #define ARM_LPAE_TCR_SH_NS		0
>>   #define ARM_LPAE_TCR_SH_OS		2
>>   #define ARM_LPAE_TCR_SH_IS		3
>> @@ -463,6 +467,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>>   	arm_lpae_iopte *ptep = data->pgd;
>>   	int ret, lvl = data->start_level;
>>   	arm_lpae_iopte prot;
>> +	long iaext = (long)iova >> cfg->ias;
>>   
>>   	/* If no access, then nothing to do */
>>   	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
>> @@ -471,7 +476,9 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>>   	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>>   		return -EINVAL;
>>   
>> -	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
>> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
>> +		iaext = ~iaext;
>> +	if (WARN_ON(iaext || paddr >> cfg->oas))
>>   		return -ERANGE;
>>   
>>   	prot = arm_lpae_prot_to_pte(data, iommu_prot);
>> @@ -637,11 +644,14 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>>   	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>>   	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>>   	arm_lpae_iopte *ptep = data->pgd;
>> +	long iaext = (long)iova >> cfg->ias;
>>   
>>   	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>>   		return 0;
>>   
>> -	if (WARN_ON(iova >> data->iop.cfg.ias))
>> +	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
>> +		iaext = ~iaext;
>> +	if (WARN_ON(iaext))
>>   		return 0;
>>   
>>   	return __arm_lpae_unmap(data, gather, iova, size, data->start_level, ptep);
> 
> This part of the patch seems to break io-pgtable-arm.c on ARM32 to some
> extent. I have a device using qcom_iommu that could normally run ARM64,
> but I'm unable to because of outdated (signed) firmware. :(
> So it's running a lot of code that would normally run only on ARM64.
> 
> It used to work quite well but now qcom-venus fails to probe on:
>      if (WARN_ON(iaext || paddr >> cfg->oas))
> (I added some debug prints for clarity.)
> 
>      qcom-venus 1d00000.video-codec: Adding to iommu group 0
>      arm-lpae io-pgtable: quirks: 0x0
>      arm-lpae io-pgtable: arm_lpae_map: iova: 0xdd800000, paddr: 0xebe3f000, iaext: 0xffffffff, paddr >> oas: 0x0
>      ------------[ cut here ]------------
>      WARNING: CPU: 0 PID: 954 at drivers/iommu/io-pgtable-arm.c:487 arm_lpae_map+0xe4/0x510
>      Hardware name: Generic DT based system
>      ...
>      [<c04bafb0>] (arm_lpae_map) from [<c04bcd6c>] (qcom_iommu_map+0x50/0x78)
>      [<c04bcd6c>] (qcom_iommu_map) from [<c04b7290>] (__iommu_map+0xe8/0x1cc)
>      [<c04b7290>] (__iommu_map) from [<c04b7bbc>] (__iommu_map_sg+0xa8/0x118)
>      [<c04b7bbc>] (__iommu_map_sg) from [<c04b7c64>] (iommu_map_sg_atomic+0x18/0x20)
>      [<c04b7c64>] (iommu_map_sg_atomic) from [<c04b94f8>] (iommu_dma_alloc+0x4dc/0x5a0)
>      [<c04b94f8>] (iommu_dma_alloc) from [<c0196a50>] (dma_alloc_attrs+0x104/0x114)
>      [<c0196a50>] (dma_alloc_attrs) from [<bf302c60>] (venus_hfi_create+0xa4/0x260 [venus_core])
>      [<bf302c60>] (venus_hfi_create [venus_core]) from [<bf2fe6cc>] (venus_probe+0x1e4/0x328 [venus_core])
>      [<bf2fe6cc>] (venus_probe [venus_core]) from [<c04c8eb4>] (platform_drv_probe+0x48/0x98)
>      ...
>      Exception stack(0xc2587fa8 to 0xc2587ff0)
>      7fa0:                   b6f3dc70 b5051010 b5051010 0017388c b6e645b0 b6e645b0
>      7fc0: b6f3dc70 b5051010 00020000 00000080 b6e30790 be84ea54 0046ac91 00000000
>      7fe0: b6e75f1c be84e940 b6e5fb51 b6eec8a4
>      ---[ end trace 2a0ed284f6d82f16 ]---
>      qcom-venus: probe of 1d00000.video-codec failed with error -12
> 
> Note that iaext = 0xffffffff != 0.
> It seems to be some int/long size problem that only happens with larger
> iova addresses on 32-bit systems.
> 
> Without the (long) cast for iova everything is working correctly again:
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 983b08477e64..f7ecc763c706 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -468,7 +468,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>   	arm_lpae_iopte *ptep = data->pgd;
>   	int ret, lvl = data->start_level;
>   	arm_lpae_iopte prot;
> -	long iaext = (long)iova >> cfg->ias;
> +	long iaext = iova >> cfg->ias;
>   
>   	/* If no access, then nothing to do */
>   	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
> @@ -645,7 +645,7 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>   	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>   	struct io_pgtable_cfg *cfg = &data->iop.cfg;
>   	arm_lpae_iopte *ptep = data->pgd;
> -	long iaext = (long)iova >> cfg->ias;
> +	long iaext = iova >> cfg->ias;
>   
>   	if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
>   		return 0;
> 
> But I'm not sure if this is really the correct "fix"...

No, that would end up breaking the TTBR1 case. The quirk isn't intended 
to be supported on 32-bit platforms, but somehow it's slipped through 
that this sign-extension still affects them (and goes wrong to boot), 
sorry about that.

I think quick fixes would be either to replace the "(long)" cast with 
"(s64)(u64)", or to just elide the iaext check altogether when 32-bit 
longs make it impossible to pass an out-of-range IOVA anyway. I'll have 
a think and see what works out nicest.

> This problem can be also easily reproduced by enabling
> IOMMU_IO_PGTABLE_LPAE_SELFTEST on ARM32.
> (Shouldn't there be some system that runs these automatically? ;))
> 
> Without this patch all of them are passing:
> 
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
>      arm-lpae io-pgtable: selftest: completed with 18 PASS 0 FAIL

Hmm, we probably shouldn't be running IAS > 32 tests on 32-bit, given 
that such configs are effectively impossible to use.

Robin.

> 
> But with this patch 6 of them are failing (with a similar warning as
> I posted above):
> 
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
>      ------------[ cut here ]------------
>      WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:482 arm_lpae_map+0xa4/0x4e4
>      Hardware name: Generic DT based system
>      [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
>      [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
>      [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
>      [<c012d76c>] (__warn) from [<c012d7ec>] (warn_slowpath_fmt+0x64/0xbc)
>      [<c012d7ec>] (warn_slowpath_fmt) from [<c04baf70>] (arm_lpae_map+0xa4/0x4e4)
>      [<c04baf70>] (arm_lpae_map) from [<c0a1de3c>] (arm_lpae_do_selftests+0x234/0x688)
>      [<c0a1de3c>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
>      [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
>      [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
>      [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>      Exception stack(0xee867fb0 to 0xee867ff8)
>      7fa0:                                     00000000 00000000 00000000 00000000
>      7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>      7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
>      ---[ end trace 89e831c50a111e7c ]---
>      ------------[ cut here ]------------
>      WARNING: CPU: 3 PID: 1 at drivers/iommu/io-pgtable-arm.c:1182 arm_lpae_do_selftests+0x4b8/0x688
>      selftest: test failed for fmt idx 0
>      Hardware name: Generic DT based system
>      [<c01104ac>] (unwind_backtrace) from [<c010c0b4>] (show_stack+0x10/0x14)
>      [<c010c0b4>] (show_stack) from [<c06cfb34>] (dump_stack+0x90/0xa4)
>      [<c06cfb34>] (dump_stack) from [<c012d76c>] (__warn+0xb8/0xd4)
>      [<c012d76c>] (__warn) from [<c012d820>] (warn_slowpath_fmt+0x98/0xbc)
>      [<c012d820>] (warn_slowpath_fmt) from [<c0a1e0c0>] (arm_lpae_do_selftests+0x4b8/0x688)
>      [<c0a1e0c0>] (arm_lpae_do_selftests) from [<c0102788>] (do_one_initcall+0x50/0x1b8)
>      [<c0102788>] (do_one_initcall) from [<c0a0148c>] (kernel_init_freeable+0x1ec/0x25c)
>      [<c0a0148c>] (kernel_init_freeable) from [<c06e6dc4>] (kernel_init+0x8/0x110)
>      [<c06e6dc4>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>      Exception stack(0xee867fb0 to 0xee867ff8)
>      7fa0:                                     00000000 00000000 00000000 00000000
>      7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>      7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
>      ---[ end trace 89e831c50a111e7d ]---
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 32-bit
>      arm-lpae io-pgtable: data: 3 levels, 0x20 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 36-bit
>      arm-lpae io-pgtable: data: 3 levels, 0x200 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 40-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x10 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 42-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x40 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 44-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x100 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
>      [warning as above]
>      arm-lpae io-pgtable: cfg: pgsize_bitmap 0x40201000, ias 48-bit
>      arm-lpae io-pgtable: data: 4 levels, 0x1000 pgd_size, 12 pg_shift, 9 bits_per_level, pgd @ (ptrval)
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
>      arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
>      arm-lpae io-pgtable: selftest: completed with 12 PASS 6 FAIL
> 
> Any suggestions how to fix this properly?
> 
> Thanks,
> Stephan
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-02-19 17:51 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-10 15:28 [PATCH 0/8] Finish off the split page table prep work Will Deacon
2020-01-10 15:28 ` Will Deacon
2020-01-10 15:28 ` [PATCH 1/8] iommu/io-pgtable-arm: Rationalise TTBRn handling Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 2/8] iommu/io-pgtable-arm: Support non-coherent stage-2 page tables Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 3/8] iommu/io-pgtable-arm: Ensure non-cacheable mappings are Outer Shareable Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 4/8] iommu/io-pgtable-arm: Ensure ARM_64_LPAE_S2_TCR_RES1 is unsigned Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 5/8] iommu/io-pgtable-arm: Rationalise TCR handling Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 6/8] iommu/arm-smmu: Rename public #defines under ARM_SMMU_ namespace Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 7/8] iommu/io-pgtable-arm: Rationalise VTCR handling Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-01-10 15:28 ` [PATCH 8/8] iommu/io-pgtable-arm: Prepare for TTBR1 usage Will Deacon
2020-01-10 15:28   ` Will Deacon
2020-02-19 12:27   ` Stephan Gerhold
2020-02-19 12:27     ` Stephan Gerhold
2020-02-19 17:51     ` Robin Murphy
2020-02-19 17:51       ` Robin Murphy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.