* [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

The SMMUv3 driver was originally written in 2015, when the iommu
driver-facing API looked quite different. The API has evolved, especially
lately, and the driver has fallen behind.

This work aims to make the SMMUv3 driver the best IOMMU driver, with the
most comprehensive implementation of the API. Once all parts are applied
it addresses:

 - Global static BLOCKED and IDENTITY domains with 'never fail' attach
   semantics. BLOCKED is desired for efficient VFIO.

 - Support map before attach for PAGING iommu_domains.

 - attach_dev failure does not change the HW configuration.

 - Fully hitless transitions between IDENTITY -> DMA -> IDENTITY.
   The API has IOMMU_RESV_DIRECT which is expected to be
   continuously translating.

 - Safe transitions from PAGING -> BLOCKED that never temporarily pass
   through IDENTITY. This is required for iommufd security.

 - Full PASID API support including:
    - S1/SVA domains attached to PASIDs
    - IDENTITY/BLOCKED/S1 attached to RID
    - Change of the RID domain while PASIDs are attached

 - Streamlined SVA support using the core infrastructure

 - Hitless changes between two domains, whenever possible

 - iommufd IOMMU_GET_HW_INFO, IOMMU_HWPT_ALLOC_NEST_PARENT, and
   IOMMU_DOMAIN_NESTED support

Overall, these things are going to become more accessible through
iommufd, and exposed to VMs, so it is important for the driver to have a
robust implementation of the API.

The work is split into three parts, with this part largely focusing on the
STE and building up to the BLOCKED & IDENTITY global static domains.

The second part largely focuses on the CD and builds up to having a common
PASID infrastructure that SVA and S1 domains equally use.

The third part has some random cleanups and the iommufd-related parts.

Overall this takes the approach of turning the STE/CD programming upside
down: the CD/STE value is computed directly in the driver callback
function and then pushed down into the programming logic. The programming
logic hides the details of the required tear-less CD/STE update. This
makes the CD/STE functions independent of the arm_smmu_domain, which
makes it fairly straightforward to untangle all the different call
chains, and add new ones.
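
As a rough sketch of the resulting shape (illustrative only: the maker
function below is a hypothetical stand-in for the per-configuration
helpers, such as arm_smmu_make_s2_domain_ste(), that the series
introduces; arm_smmu_write_ste() and arm_smmu_get_step_for_sid() are real
functions from the patches that follow):

	/* Attach-time callback: compute the full target STE as plain data */
	struct arm_smmu_ste target = {};

	make_target_ste(&target, master, smmu_domain); /* hypothetical */

	/*
	 * Programming layer: work out a tear-free update sequence from
	 * whatever is currently installed to the target, syncing with the
	 * HW between steps as needed.
	 */
	arm_smmu_write_ste(smmu, sid, arm_smmu_get_step_for_sid(smmu, sid),
			   &target);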

Further, this frees the arm_smmu_domain related logic from tracking the
current STE/CD state in order to carefully sequence the correct update.
Many new update pairs are subtly introduced as the work progresses.

The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
now; patches throughout this work adjust and tighten it so that it is
clearer and doesn't get broken.

Once the lower STE layers no longer need to touch arm_smmu_domain we can
isolate struct arm_smmu_domain to be only used for PAGING domains, audit
all the to_smmu_domain() calls to be only in PAGING domain ops, and
introduce the normal global static BLOCKED/IDENTITY domains using the new
STE infrastructure. Part 2 will ultimately migrate SVA over to use
arm_smmu_domain as well.

All parts are on github:

 https://github.com/jgunthorpe/linux/commits/smmuv3_newapi

v2:
 - Rebased on v6.7-rc1
 - Improve the comment for arm_smmu_write_entry_step()
 - Fix the botched memcmp
 - Document the spec justification for the SHCFG exclusion in used
 - Include STRTAB_STE_1_SHCFG for STRTAB_STE_0_CFG_S2_TRANS in used
 - WARN_ON for unknown STEs in used
 - Fix error unwind in arm_smmu_attach_dev()
 - Whitespace, spelling, and checkpatch related items
v1: https://lore.kernel.org/r/0-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com

Jason Gunthorpe (19):
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: Master cannot be NULL in
    arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Make STE programming independent of the callers
  iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
  iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
  iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into
    functions
  iommu/arm-smmu-v3: Build the whole STE in
    arm_smmu_make_s2_domain_ste()
  iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
  iommu/arm-smmu-v3: Compute the STE only once for each master
  iommu/arm-smmu-v3: Do not change the STE twice during
    arm_smmu_attach_dev()
  iommu/arm-smmu-v3: Put writing the context descriptor in the right
    order
  iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
  iommu/arm-smmu-v3: Remove arm_smmu_master->domain
  iommu/arm-smmu-v3: Add a global static IDENTITY domain
  iommu/arm-smmu-v3: Add a global static BLOCKED domain
  iommu/arm-smmu-v3: Use the identity/blocked domain during release
  iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to
    finalize
  iommu/arm-smmu-v3: Convert to domain_alloc_paging()

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 720 +++++++++++++-------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  12 +-
 2 files changed, 472 insertions(+), 260 deletions(-)


base-commit: ca7fcaff577c92d85f0e05cc7be79759155fe328
-- 
2.42.0


* [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Instead of passing a naked __le64 * around to represent a STE, wrap it in
a "struct arm_smmu_ste" with an array of the correct size. This makes it
much clearer which functions will comprise the "STE API".
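
For instance (a small usage sketch, not part of the patch, using the
functions from the diff below), the compiler can now tell a whole STE
apart from a loose dword pointer:

	struct arm_smmu_ste *ste = arm_smmu_get_step_for_sid(smmu, sid);
	u64 val = le64_to_cpu(ste->data[0]); /* dwords are still reachable */

	arm_smmu_write_strtab_ent(master, sid, ste); /* takes the struct */

Passing a bare __le64 * where a struct arm_smmu_ste * is expected now
draws an incompatible-pointer diagnostic instead of compiling silently.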

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 54 ++++++++++-----------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  7 ++-
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7445454c2af244..519749d15fbda0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1249,7 +1249,7 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
 }
 
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
-				      __le64 *dst)
+				      struct arm_smmu_ste *dst)
 {
 	/*
 	 * This is hideously complicated, but we only really care about
@@ -1267,7 +1267,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	 * 2. Write everything apart from dword 0, sync, write dword 0, sync
 	 * 3. Update Config, sync
 	 */
-	u64 val = le64_to_cpu(dst[0]);
+	u64 val = le64_to_cpu(dst->data[0]);
 	bool ste_live = false;
 	struct arm_smmu_device *smmu = NULL;
 	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
@@ -1325,10 +1325,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		else
 			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
-		dst[0] = cpu_to_le64(val);
-		dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+		dst->data[0] = cpu_to_le64(val);
+		dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
 						STRTAB_STE_1_SHCFG_INCOMING));
-		dst[2] = 0; /* Nuke the VMID */
+		dst->data[2] = 0; /* Nuke the VMID */
 		/*
 		 * The SMMU can perform negative caching, so we must sync
 		 * the STE regardless of whether the old value was live.
@@ -1343,7 +1343,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
 
 		BUG_ON(ste_live);
-		dst[1] = cpu_to_le64(
+		dst->data[1] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1352,7 +1352,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
 		    !master->stall_enabled)
-			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
+			dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
 		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
 			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
@@ -1362,7 +1362,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 
 	if (s2_cfg) {
 		BUG_ON(ste_live);
-		dst[2] = cpu_to_le64(
+		dst->data[2] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
 			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
 #ifdef __BIG_ENDIAN
@@ -1371,18 +1371,18 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
 			 STRTAB_STE_2_S2R);
 
-		dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+		dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
 
 		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
 	}
 
 	if (master->ats_enabled)
-		dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+		dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
 						 STRTAB_STE_1_EATS_TRANS));
 
 	arm_smmu_sync_ste_for_sid(smmu, sid);
 	/* See comment in arm_smmu_write_ctx_desc() */
-	WRITE_ONCE(dst[0], cpu_to_le64(val));
+	WRITE_ONCE(dst->data[0], cpu_to_le64(val));
 	arm_smmu_sync_ste_for_sid(smmu, sid);
 
 	/* It's likely that we'll want to use the new STE soon */
@@ -1390,7 +1390,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
 }
 
-static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
+static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
+				      unsigned int nent, bool force)
 {
 	unsigned int i;
 	u64 val = STRTAB_STE_0_V;
@@ -1401,11 +1402,11 @@ static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool fo
 		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
 	for (i = 0; i < nent; ++i) {
-		strtab[0] = cpu_to_le64(val);
-		strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
-						   STRTAB_STE_1_SHCFG_INCOMING));
-		strtab[2] = 0;
-		strtab += STRTAB_STE_DWORDS;
+		strtab->data[0] = cpu_to_le64(val);
+		strtab->data[1] = cpu_to_le64(FIELD_PREP(
+			STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+		strtab->data[2] = 0;
+		strtab++;
 	}
 }
 
@@ -2209,26 +2210,22 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	return 0;
 }
 
-static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
+static struct arm_smmu_ste *
+arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 {
-	__le64 *step;
 	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 
 	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
-		struct arm_smmu_strtab_l1_desc *l1_desc;
 		int idx;
 
 		/* Two-level walk */
 		idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
-		l1_desc = &cfg->l1_desc[idx];
-		idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
-		step = &l1_desc->l2ptr[idx];
+		return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
 	} else {
 		/* Simple linear lookup */
-		step = &cfg->strtab[sid * STRTAB_STE_DWORDS];
+		return (struct arm_smmu_ste *)&cfg
+			       ->strtab[sid * STRTAB_STE_DWORDS];
 	}
-
-	return step;
 }
 
 static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
@@ -2238,7 +2235,8 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 
 	for (i = 0; i < master->num_streams; ++i) {
 		u32 sid = master->streams[i].id;
-		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
+		struct arm_smmu_ste *step =
+			arm_smmu_get_step_for_sid(smmu, sid);
 
 		/* Bridged PCI devices may end up with duplicated IDs */
 		for (j = 0; j < i; j++)
@@ -3769,7 +3767,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
 	iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
 
 	list_for_each_entry(e, &rmr_list, list) {
-		__le64 *step;
+		struct arm_smmu_ste *step;
 		struct iommu_iort_rmr_data *rmr;
 		int ret, i;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 961205ba86d25d..03f9e526cbd92f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -206,6 +206,11 @@
 #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
 
 #define STRTAB_STE_DWORDS		8
+
+struct arm_smmu_ste {
+	__le64 data[STRTAB_STE_DWORDS];
+};
+
 #define STRTAB_STE_0_V			(1UL << 0)
 #define STRTAB_STE_0_CFG		GENMASK_ULL(3, 1)
 #define STRTAB_STE_0_CFG_ABORT		0
@@ -571,7 +576,7 @@ struct arm_smmu_priq {
 struct arm_smmu_strtab_l1_desc {
 	u8				span;
 
-	__le64				*l2ptr;
+	struct arm_smmu_ste		*l2ptr;
 	dma_addr_t			l2ptr_dma;
 };
 
-- 
2.42.0


* [PATCH v2 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

The only caller is arm_smmu_install_ste_for_dev(), which never has a NULL
master. Remove the confusing if.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 519749d15fbda0..9117e769a965e1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1269,10 +1269,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	 */
 	u64 val = le64_to_cpu(dst->data[0]);
 	bool ste_live = false;
-	struct arm_smmu_device *smmu = NULL;
+	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
 	struct arm_smmu_s2_cfg *s2_cfg = NULL;
-	struct arm_smmu_domain *smmu_domain = NULL;
+	struct arm_smmu_domain *smmu_domain = master->domain;
 	struct arm_smmu_cmdq_ent prefetch_cmd = {
 		.opcode		= CMDQ_OP_PREFETCH_CFG,
 		.prefetch	= {
@@ -1280,11 +1280,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		},
 	};
 
-	if (master) {
-		smmu_domain = master->domain;
-		smmu = master->smmu;
-	}
-
 	if (smmu_domain) {
 		switch (smmu_domain->stage) {
 		case ARM_SMMU_DOMAIN_S1:
-- 
2.42.0


* [PATCH v2 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove
it. The ongoing work to add nesting support through iommufd will do
something a little different.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9117e769a965e1..bf7218adbc2822 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1286,7 +1286,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			cd_table = &master->cd_table;
 			break;
 		case ARM_SMMU_DOMAIN_S2:
-		case ARM_SMMU_DOMAIN_NESTED:
 			s2_cfg = &smmu_domain->s2_cfg;
 			break;
 		default:
@@ -2167,7 +2166,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 		fmt = ARM_64_LPAE_S1;
 		finalise_stage_fn = arm_smmu_domain_finalise_s1;
 		break;
-	case ARM_SMMU_DOMAIN_NESTED:
 	case ARM_SMMU_DOMAIN_S2:
 		ias = smmu->ias;
 		oas = smmu->oas;
@@ -2735,7 +2733,7 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
 	if (smmu_domain->smmu)
 		ret = -EPERM;
 	else
-		smmu_domain->stage = ARM_SMMU_DOMAIN_NESTED;
+		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 03f9e526cbd92f..27ddf1acd12cea 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -715,7 +715,6 @@ struct arm_smmu_master {
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
 	ARM_SMMU_DOMAIN_S2,
-	ARM_SMMU_DOMAIN_NESTED,
 	ARM_SMMU_DOMAIN_BYPASS,
 };
 
-- 
2.42.0


* [PATCH v2 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

As the comment in arm_smmu_write_strtab_ent() explains, this routine has
been limited to only work correctly in certain scenarios that the caller
must ensure. Generally the caller must put the STE into ABORT or BYPASS
before attempting to program it to something else.

The next patches/series are going to start removing some of this logic
from the callers, and to add more complex state combinations than are
currently possible.

Thus, consolidate all the complexity here. Callers do not have to care
about what STE transition they are doing; this function will handle
everything optimally.

Revise arm_smmu_write_strtab_ent() so it algorithmically computes the
required programming sequence to avoid creating an incoherent 'torn' STE
in the HW caches. The update algorithm follows the same design that the
driver already uses: it is safe to change bits that HW doesn't currently
use and then do a single 64 bit update, with syncs in between.

The basic idea is to express in a bitmask what bits the HW is actually
using based on the V and CFG bits. Based on that mask we know what STE
changes are safe and which are disruptive. We can count how many 64 bit
QWORDS need a disruptive update and know if a step with V=0 is required.

This gives two basic flows through the algorithm.

If only a single 64 bit quantity needs disruptive replacement:
 - Write the target value into all currently unused bits
 - Write the single 64 bit quantity
 - Zero the remaining different bits

If multiple 64 bit quantities need disruptive replacement then do:
 - Write V=0 to QWORD 0
 - Write the entire STE except QWORD 0
 - Write QWORD 0

A HW flush is done at each step; it can be skipped if the STE didn't
change in that step.

At this point it generates the same sequence of updates as the current
code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
extra sync (this seems to be an existing bug).

Going forward this will use a V=0 transition instead of cycling through
ABORT when a non-hitless change is required. This seems more appropriate,
as ABORT will fail DMAs without any logging, while a DMA dropped due to a
transient V=0 almost certainly signals a bug, so the C_BAD_STE event it
raises is valuable.
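
To make the qword counting concrete, here is a compressed userspace model
of the classification done by arm_smmu_write_entry_step() (toy masks, not
real STE layouts; the kernel code below is the authoritative version):

	#include <stdint.h>
	#include <stdio.h>

	/*
	 * After setting every bit the HW doesn't currently use to its
	 * target value, count how many qwords still differ in bits the
	 * target uses: 0 means done, 1 means a single 64-bit store
	 * suffices, >1 means the V=0 flow is required.
	 */
	static unsigned int
	disruptive_qwords(const uint64_t *cur, const uint64_t *cur_used,
			  const uint64_t *tgt, const uint64_t *tgt_used,
			  unsigned int len)
	{
		unsigned int i, n = 0;

		for (i = 0; i != len; i++) {
			/* the "free" update: HW-used bits keep their value */
			uint64_t step = (cur[i] & cur_used[i]) |
					(tgt[i] & ~cur_used[i]);

			if ((step & tgt_used[i]) != (tgt[i] & tgt_used[i]))
				n++;
		}
		return n;
	}

	int main(void)
	{
		/* qword 0 is fully used by both entries; qword 1 is used
		 * only by the target, so it can be filled for free */
		uint64_t cur[2]      = { 0x1, 0x0 };
		uint64_t cur_used[2] = { ~0ULL, 0x0 };
		uint64_t tgt[2]      = { 0x5, 0x30 };
		uint64_t tgt_used[2] = { ~0ULL, ~0ULL };

		/* prints 1: the single 64-bit store flow applies */
		printf("%u disruptive qword(s)\n",
		       disruptive_qwords(cur, cur_used, tgt, tgt_used, 2));
		return 0;
	}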

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 272 +++++++++++++++-----
 1 file changed, 208 insertions(+), 64 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bf7218adbc2822..6430a8d89cb471 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -971,6 +971,101 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+/*
+ * This algorithm updates any STE/CD to any value without creating a situation
+ * where the HW can perceive a corrupted entry. HW is only required to have a 64
+ * bit atomicity with stores from the CPU, while entries are many 64 bit values
+ * big.
+ *
+ * The algorithm works by evolving the entry toward the target in a series of
+ * steps. Each step synchronizes with the HW so that the HW can not see an entry
+ * torn across two steps. Upon each call cur/cur_used reflect the current
+ * synchronized value seen by the HW.
+ *
+ * During each step the HW can observe a torn entry that has any combination of
+ * the step's old/new 64 bit words. The algorithm objective is for the HW
+ * behavior to always be one of current behavior, V=0, or new behavior, during
+ * each step, and across all steps.
+ *
+ * At each step one of three actions is chosen to evolve cur to target:
+ *  - Update all unused bits with their target values.
+ *    This relies on the IGNORED behavior described in the specification
+ *  - Update a single 64-bit value
+ *  - Update all unused bits and set V=0
+ *
+ * The last two actions will cause cur_used to change, which will then allow the
+ * first action on the next step.
+ *
+ * In the most general case we can make any update in three steps:
+ *  - Disrupting the entry (V=0)
+ *  - Fill now unused bits, all bits except V
+ *  - Make valid (V=1), single 64 bit store
+ *
+ * However this disrupts the HW while it is happening. There are several
+ * interesting cases where a STE/CD can be updated without disturbing the HW
+ * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
+ * because the used bits don't intersect. We can detect this by calculating how
+ * many 64 bit values need update after adjusting the unused bits and skip the
+ * V=0 process.
+ */
+static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
+				      const __le64 *target,
+				      const __le64 *target_used, __le64 *step,
+				      __le64 v_bit,
+				      unsigned int len)
+{
+	u8 step_used_diff = 0;
+	u8 step_change = 0;
+	unsigned int i;
+
+	/*
+	 * Compute a step that has all the bits currently unused by HW set to
+	 * their target values.
+	 */
+	for (i = 0; i != len; i++) {
+		step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
+		if (cur[i] != step[i])
+			step_change |= 1 << i;
+		/*
+		 * Each bit indicates if the step is incorrect compared to the
+		 * target, considering only the used bits in the target
+		 */
+		if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
+			step_used_diff |= 1 << i;
+	}
+
+	if (hweight8(step_used_diff) > 1) {
+		/*
+		 * More than 1 qword is mismatched, this cannot be done without
+		 * a break. Clear the V bit and go again.
+		 */
+		step[0] &= ~v_bit;
+	} else if (!step_change && step_used_diff) {
+		/*
+		 * Have exactly one critical qword, all the other qwords are set
+		 * correctly, so we can set this qword now.
+		 */
+		i = ffs(step_used_diff) - 1;
+		step[i] = target[i];
+	} else if (!step_change) {
+		/* cur == target, so all done */
+		if (memcmp(cur, target, len * sizeof(*cur)) == 0)
+			return true;
+
+		/*
+		 * All the used HW bits match, but unused bits are different.
+		 * Set them as well. Technically this isn't necessary but it
+		 * brings the entry to the full target state, so if there are
+		 * bugs in the mask calculation this will obscure them.
+		 */
+		memcpy(step, target, len * sizeof(*step));
+	}
+
+	for (i = 0; i != len; i++)
+		WRITE_ONCE(cur[i], step[i]);
+	return false;
+}
+
 static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 			     int ssid, bool leaf)
 {
@@ -1248,37 +1343,115 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+/*
+ * Based on the value of ent report which bits of the STE the HW will access. It
+ * would be nice if this was complete according to the spec, but minimally it
+ * has to capture the bits this driver uses.
+ */
+static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
+				  struct arm_smmu_ste *used_bits)
+{
+	memset(used_bits, 0, sizeof(*used_bits));
+
+	used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
+	if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
+		return;
+
+	/*
+	 * If S1 is enabled S1DSS is valid, see 13.5 Summary of
+	 * attribute/permission configuration fields for the SHCFG behavior.
+	 */
+	if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])) & 1 &&
+	    FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
+		    STRTAB_STE_1_S1DSS_BYPASS)
+		used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+
+	used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
+	switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
+	case STRTAB_STE_0_CFG_ABORT:
+		break;
+	case STRTAB_STE_0_CFG_BYPASS:
+		used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+		break;
+	case STRTAB_STE_0_CFG_S1_TRANS:
+		used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
+						  STRTAB_STE_0_S1CTXPTR_MASK |
+						  STRTAB_STE_0_S1CDMAX);
+		used_bits->data[1] |=
+			cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
+				    STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
+				    STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
+		used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
+		break;
+	case STRTAB_STE_0_CFG_S2_TRANS:
+		used_bits->data[1] |=
+			cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
+		used_bits->data[2] |=
+			cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
+				    STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
+				    STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
+		used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
+		break;
+
+	default:
+		memset(used_bits, 0xFF, sizeof(*used_bits));
+		WARN_ON(true);
+	}
+}
+
+static bool arm_smmu_write_ste_step(struct arm_smmu_ste *cur,
+				    const struct arm_smmu_ste *target,
+				    const struct arm_smmu_ste *target_used)
+{
+	struct arm_smmu_ste cur_used;
+	struct arm_smmu_ste step;
+
+	arm_smmu_get_ste_used(cur, &cur_used);
+	return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
+					 target_used->data, step.data,
+					 cpu_to_le64(STRTAB_STE_0_V),
+					 ARRAY_SIZE(cur->data));
+}
+
+static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
+			       struct arm_smmu_ste *ste,
+			       const struct arm_smmu_ste *target)
+{
+	struct arm_smmu_ste target_used;
+	int i;
+
+	arm_smmu_get_ste_used(target, &target_used);
+	/* Masks in arm_smmu_get_ste_used() are up to date */
+	for (i = 0; i != ARRAY_SIZE(target->data); i++)
+		WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
+
+	while (true) {
+		if (arm_smmu_write_ste_step(ste, target, &target_used))
+			break;
+		arm_smmu_sync_ste_for_sid(smmu, sid);
+	}
+
+	/* It's likely that we'll want to use the new STE soon */
+	if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) {
+		struct arm_smmu_cmdq_ent
+			prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG,
+					 .prefetch = {
+						 .sid = sid,
+					 } };
+
+		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+	}
+}
+
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				      struct arm_smmu_ste *dst)
 {
-	/*
-	 * This is hideously complicated, but we only really care about
-	 * three cases at the moment:
-	 *
-	 * 1. Invalid (all zero) -> bypass/fault (init)
-	 * 2. Bypass/fault -> translation/bypass (attach)
-	 * 3. Translation/bypass -> bypass/fault (detach)
-	 *
-	 * Given that we can't update the STE atomically and the SMMU
-	 * doesn't read the thing in a defined order, that leaves us
-	 * with the following maintenance requirements:
-	 *
-	 * 1. Update Config, return (init time STEs aren't live)
-	 * 2. Write everything apart from dword 0, sync, write dword 0, sync
-	 * 3. Update Config, sync
-	 */
-	u64 val = le64_to_cpu(dst->data[0]);
-	bool ste_live = false;
+	u64 val;
 	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
 	struct arm_smmu_s2_cfg *s2_cfg = NULL;
 	struct arm_smmu_domain *smmu_domain = master->domain;
-	struct arm_smmu_cmdq_ent prefetch_cmd = {
-		.opcode		= CMDQ_OP_PREFETCH_CFG,
-		.prefetch	= {
-			.sid	= sid,
-		},
-	};
+	struct arm_smmu_ste target = {};
 
 	if (smmu_domain) {
 		switch (smmu_domain->stage) {
@@ -1293,22 +1466,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		}
 	}
 
-	if (val & STRTAB_STE_0_V) {
-		switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
-		case STRTAB_STE_0_CFG_BYPASS:
-			break;
-		case STRTAB_STE_0_CFG_S1_TRANS:
-		case STRTAB_STE_0_CFG_S2_TRANS:
-			ste_live = true;
-			break;
-		case STRTAB_STE_0_CFG_ABORT:
-			BUG_ON(!disable_bypass);
-			break;
-		default:
-			BUG(); /* STE corruption */
-		}
-	}
-
 	/* Nuke the existing STE_0 value, as we're going to rewrite it */
 	val = STRTAB_STE_0_V;
 
@@ -1319,16 +1476,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		else
 			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
-		dst->data[0] = cpu_to_le64(val);
-		dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+		target.data[0] = cpu_to_le64(val);
+		target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
 						STRTAB_STE_1_SHCFG_INCOMING));
-		dst->data[2] = 0; /* Nuke the VMID */
-		/*
-		 * The SMMU can perform negative caching, so we must sync
-		 * the STE regardless of whether the old value was live.
-		 */
-		if (smmu)
-			arm_smmu_sync_ste_for_sid(smmu, sid);
+		target.data[2] = 0; /* Nuke the VMID */
+		arm_smmu_write_ste(smmu, sid, dst, &target);
 		return;
 	}
 
@@ -1336,8 +1488,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
 			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
 
-		BUG_ON(ste_live);
-		dst->data[1] = cpu_to_le64(
+		target.data[1] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1346,7 +1497,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
 		    !master->stall_enabled)
-			dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
+			target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
 		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
 			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
@@ -1355,8 +1506,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	}
 
 	if (s2_cfg) {
-		BUG_ON(ste_live);
-		dst->data[2] = cpu_to_le64(
+		target.data[2] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
 			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
 #ifdef __BIG_ENDIAN
@@ -1365,23 +1515,17 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
 			 STRTAB_STE_2_S2R);
 
-		dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+		target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
 
 		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
 	}
 
 	if (master->ats_enabled)
-		dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+		target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
 						 STRTAB_STE_1_EATS_TRANS));
 
-	arm_smmu_sync_ste_for_sid(smmu, sid);
-	/* See comment in arm_smmu_write_ctx_desc() */
-	WRITE_ONCE(dst->data[0], cpu_to_le64(val));
-	arm_smmu_sync_ste_for_sid(smmu, sid);
-
-	/* It's likely that we'll want to use the new STE soon */
-	if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
-		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+	target.data[0] = cpu_to_le64(val);
+	arm_smmu_write_ste(smmu, sid, dst, &target);
 }
 
 static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
@ 2023-11-13 17:53   ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

As the comment in arm_smmu_write_strtab_ent() explains, this routine has
been limited to only work correctly in certain scenarios that the caller
must ensure. Generally the caller must put the STE into ABORT or BYPASS
before attempting to program it to something else.

The next patches/series are going to start removing some of this logic
from the callers, and add more complex state combinations than currently.

Thus, consolidate all the complexity here. Callers do not have to care
about what STE transition they are doing, this function will handle
everything optimally.

Revise arm_smmu_write_strtab_ent() so it algorithmically computes the
required programming sequence to avoid creating an incoherent 'torn' STE
in the HW caches. The update algorithm follows the same design that the
driver already uses: it is safe to change bits that HW doesn't currently
use and then do a single 64 bit update, with sync's in between.

The basic idea is to express in a bitmask what bits the HW is actually
using based on the V and CFG bits. Based on that mask we know what STE
changes are safe and which are disruptive. We can count how many 64 bit
QWORDS need a disruptive update and know if a step with V=0 is required.

This gives two basic flows through the algorithm.

If only a single 64 bit quantity needs disruptive replacement:
 - Write the target value into all currently unused bits
 - Write the single 64 bit quantity
 - Zero the remaining different bits

If multiple 64 bit quantities need disruptive replacement then do:
 - Write V=0 to QWORD 0
 - Write the entire STE except QWORD 0
 - Write QWORD 0

With HW flushes at each step, that can be skipped if the STE didn't change
in that step.

At this point it generates the same sequence of updates as the current
code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
extra sync (this seems to be an existing bug).

Going forward this will use a V=0 transition instead of cycling through
ABORT if a hitfull change is required. This seems more appropriate as ABORT
will fail DMAs without any logging, but dropping a DMA due to transient
V=0 is probably signaling a bug, so the C_BAD_STE is valuable.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 272 +++++++++++++++-----
 1 file changed, 208 insertions(+), 64 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index bf7218adbc2822..6430a8d89cb471 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -971,6 +971,101 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+/*
+ * This algorithm updates any STE/CD to any value without creating a situation
+ * where the HW can percieve a corrupted entry. HW is only required to have a 64
+ * bit atomicity with stores from the CPU, while entries are many 64 bit values
+ * big.
+ *
+ * The algorithm works by evolving the entry toward the target in a series of
+ * steps. Each step synchronizes with the HW so that the HW can not see an entry
+ * torn across two steps. Upon each call cur/cur_used reflect the current
+ * synchronized value seen by the HW.
+ *
+ * During each step the HW can observe a torn entry that has any combination of
+ * the step's old/new 64 bit words. The algorithm objective is for the HW
+ * behavior to always be one of current behavior, V=0, or new behavior, during
+ * each step, and across all steps.
+ *
+ * At each step one of three actions is chosen to evolve cur to target:
+ *  - Update all unused bits with their target values.
+ *    This relies on the IGNORED behavior described in the specification
+ *  - Update a single 64-bit value
+ *  - Update all unused bits and set V=0
+ *
+ * The last two actions will cause cur_used to change, which will then allow the
+ * first action on the next step.
+ *
+ * In the most general case we can make any update in three steps:
+ *  - Disrupting the entry (V=0)
+ *  - Fill now unused bits, all bits except V
+ *  - Make valid (V=1), single 64 bit store
+ *
+ * However this disrupts the HW while it is happening. There are several
+ * interesting cases where a STE/CD can be updated without disturbing the HW
+ * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
+ * because the used bits don't intersect. We can detect this by calculating how
+ * many 64 bit values need update after adjusting the unused bits and skip the
+ * V=0 process.
+ */
+static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
+				      const __le64 *target,
+				      const __le64 *target_used, __le64 *step,
+				      __le64 v_bit,
+				      unsigned int len)
+{
+	u8 step_used_diff = 0;
+	u8 step_change = 0;
+	unsigned int i;
+
+	/*
+	 * Compute a step that has all the bits currently unused by HW set to
+	 * their target values.
+	 */
+	for (i = 0; i != len; i++) {
+		step[i] = (cur[i] & cur_used[i]) | (target[i] & ~cur_used[i]);
+		if (cur[i] != step[i])
+			step_change |= 1 << i;
+		/*
+		 * Each bit indicates if the step is incorrect compared to the
+		 * target, considering only the used bits in the target
+		 */
+		if ((step[i] & target_used[i]) != (target[i] & target_used[i]))
+			step_used_diff |= 1 << i;
+	}
+
+	if (hweight8(step_used_diff) > 1) {
+		/*
+		 * More than 1 qword is mismatched, this cannot be done without
+		 * a break. Clear the V bit and go again.
+		 */
+		step[0] &= ~v_bit;
+	} else if (!step_change && step_used_diff) {
+		/*
+		 * Have exactly one critical qword, all the other qwords are set
+		 * correctly, so we can set this qword now.
+		 */
+		i = ffs(step_used_diff) - 1;
+		step[i] = target[i];
+	} else if (!step_change) {
+		/* cur == target, so all done */
+		if (memcmp(cur, target, len * sizeof(*cur)) == 0)
+			return true;
+
+		/*
+		 * All the used HW bits match, but unused bits are different.
+		 * Set them as well. Technically this isn't necessary but it
+		 * brings the entry to the full target state, so if there are
+		 * bugs in the mask calculation this will obscure them.
+		 */
+		memcpy(step, target, len * sizeof(*step));
+	}
+
+	for (i = 0; i != len; i++)
+		WRITE_ONCE(cur[i], step[i]);
+	return false;
+}
+
 static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 			     int ssid, bool leaf)
 {
@@ -1248,37 +1343,115 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+/*
+ * Based on the value of ent, report which bits of the STE the HW will access. It
+ * would be nice if this was complete according to the spec, but minimally it
+ * has to capture the bits this driver uses.
+ */
+static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent,
+				  struct arm_smmu_ste *used_bits)
+{
+	memset(used_bits, 0, sizeof(*used_bits));
+
+	used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V);
+	if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V)))
+		return;
+
+	/*
+	 * If S1 is enabled S1DSS is valid, see 13.5 Summary of
+	 * attribute/permission configuration fields for the SHCFG behavior.
+	 */
+	if (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])) & 1 &&
+	    FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent->data[1])) ==
+		    STRTAB_STE_1_S1DSS_BYPASS)
+		used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+
+	used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG);
+	switch (FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0]))) {
+	case STRTAB_STE_0_CFG_ABORT:
+		break;
+	case STRTAB_STE_0_CFG_BYPASS:
+		used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
+		break;
+	case STRTAB_STE_0_CFG_S1_TRANS:
+		used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT |
+						  STRTAB_STE_0_S1CTXPTR_MASK |
+						  STRTAB_STE_0_S1CDMAX);
+		used_bits->data[1] |=
+			cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR |
+				    STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH |
+				    STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW);
+		used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_EATS);
+		break;
+	case STRTAB_STE_0_CFG_S2_TRANS:
+		used_bits->data[1] |=
+			cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG);
+		used_bits->data[2] |=
+			cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR |
+				    STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI |
+				    STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R);
+		used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK);
+		break;
+
+	default:
+		memset(used_bits, 0xFF, sizeof(*used_bits));
+		WARN_ON(true);
+	}
+}
+
+static bool arm_smmu_write_ste_step(struct arm_smmu_ste *cur,
+				    const struct arm_smmu_ste *target,
+				    const struct arm_smmu_ste *target_used)
+{
+	struct arm_smmu_ste cur_used;
+	struct arm_smmu_ste step;
+
+	arm_smmu_get_ste_used(cur, &cur_used);
+	return arm_smmu_write_entry_step(cur->data, cur_used.data, target->data,
+					 target_used->data, step.data,
+					 cpu_to_le64(STRTAB_STE_0_V),
+					 ARRAY_SIZE(cur->data));
+}
+
+static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
+			       struct arm_smmu_ste *ste,
+			       const struct arm_smmu_ste *target)
+{
+	struct arm_smmu_ste target_used;
+	int i;
+
+	arm_smmu_get_ste_used(target, &target_used);
+	/* Masks in arm_smmu_get_ste_used() are up to date */
+	for (i = 0; i != ARRAY_SIZE(target->data); i++)
+		WARN_ON_ONCE(target->data[i] & ~target_used.data[i]);
+
+	while (true) {
+		if (arm_smmu_write_ste_step(ste, target, &target_used))
+			break;
+		arm_smmu_sync_ste_for_sid(smmu, sid);
+	}
+
+	/* It's likely that we'll want to use the new STE soon */
+	if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) {
+		struct arm_smmu_cmdq_ent
+			prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG,
+					 .prefetch = {
+						 .sid = sid,
+					 } };
+
+		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+	}
+}
+
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				      struct arm_smmu_ste *dst)
 {
-	/*
-	 * This is hideously complicated, but we only really care about
-	 * three cases at the moment:
-	 *
-	 * 1. Invalid (all zero) -> bypass/fault (init)
-	 * 2. Bypass/fault -> translation/bypass (attach)
-	 * 3. Translation/bypass -> bypass/fault (detach)
-	 *
-	 * Given that we can't update the STE atomically and the SMMU
-	 * doesn't read the thing in a defined order, that leaves us
-	 * with the following maintenance requirements:
-	 *
-	 * 1. Update Config, return (init time STEs aren't live)
-	 * 2. Write everything apart from dword 0, sync, write dword 0, sync
-	 * 3. Update Config, sync
-	 */
-	u64 val = le64_to_cpu(dst->data[0]);
-	bool ste_live = false;
+	u64 val;
 	struct arm_smmu_device *smmu = master->smmu;
 	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
 	struct arm_smmu_s2_cfg *s2_cfg = NULL;
 	struct arm_smmu_domain *smmu_domain = master->domain;
-	struct arm_smmu_cmdq_ent prefetch_cmd = {
-		.opcode		= CMDQ_OP_PREFETCH_CFG,
-		.prefetch	= {
-			.sid	= sid,
-		},
-	};
+	struct arm_smmu_ste target = {};
 
 	if (smmu_domain) {
 		switch (smmu_domain->stage) {
@@ -1293,22 +1466,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		}
 	}
 
-	if (val & STRTAB_STE_0_V) {
-		switch (FIELD_GET(STRTAB_STE_0_CFG, val)) {
-		case STRTAB_STE_0_CFG_BYPASS:
-			break;
-		case STRTAB_STE_0_CFG_S1_TRANS:
-		case STRTAB_STE_0_CFG_S2_TRANS:
-			ste_live = true;
-			break;
-		case STRTAB_STE_0_CFG_ABORT:
-			BUG_ON(!disable_bypass);
-			break;
-		default:
-			BUG(); /* STE corruption */
-		}
-	}
-
 	/* Nuke the existing STE_0 value, as we're going to rewrite it */
 	val = STRTAB_STE_0_V;
 
@@ -1319,16 +1476,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		else
 			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
-		dst->data[0] = cpu_to_le64(val);
-		dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+		target.data[0] = cpu_to_le64(val);
+		target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
 						STRTAB_STE_1_SHCFG_INCOMING));
-		dst->data[2] = 0; /* Nuke the VMID */
-		/*
-		 * The SMMU can perform negative caching, so we must sync
-		 * the STE regardless of whether the old value was live.
-		 */
-		if (smmu)
-			arm_smmu_sync_ste_for_sid(smmu, sid);
+		target.data[2] = 0; /* Nuke the VMID */
+		arm_smmu_write_ste(smmu, sid, dst, &target);
 		return;
 	}
 
@@ -1336,8 +1488,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
 			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
 
-		BUG_ON(ste_live);
-		dst->data[1] = cpu_to_le64(
+		target.data[1] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
 			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
@@ -1346,7 +1497,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
 		    !master->stall_enabled)
-			dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
+			target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
 
 		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
 			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
@@ -1355,8 +1506,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	}
 
 	if (s2_cfg) {
-		BUG_ON(ste_live);
-		dst->data[2] = cpu_to_le64(
+		target.data[2] = cpu_to_le64(
 			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
 			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
 #ifdef __BIG_ENDIAN
@@ -1365,23 +1515,17 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
 			 STRTAB_STE_2_S2R);
 
-		dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+		target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
 
 		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
 	}
 
 	if (master->ats_enabled)
-		dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
+		target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
 						 STRTAB_STE_1_EATS_TRANS));
 
-	arm_smmu_sync_ste_for_sid(smmu, sid);
-	/* See comment in arm_smmu_write_ctx_desc() */
-	WRITE_ONCE(dst->data[0], cpu_to_le64(val));
-	arm_smmu_sync_ste_for_sid(smmu, sid);
-
-	/* It's likely that we'll want to use the new STE soon */
-	if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH))
-		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
+	target.data[0] = cpu_to_le64(val);
+	arm_smmu_write_ste(smmu, sid, dst, &target);
 }
 
 static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
-- 
2.42.0



^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

This allows writing the flow of arm_smmu_write_strtab_ent() around abort
and bypass domains more naturally.

Note that the core code no longer supplies NULL domains, though there is
still a flow in the driver that ends up in arm_smmu_write_strtab_ent() with a
NULL domain. A later patch will remove it.

Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes()
and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now
simply invoke arm_smmu_make_bypass_ste() directly.
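
As a sketch of the resulting call pattern (all names are from the diff
below), callers now build the desired STE with one of the helpers and hand it
to the programming logic:

	struct arm_smmu_ste target;

	if (disable_bypass)
		arm_smmu_make_abort_ste(&target);
	else
		arm_smmu_make_bypass_ste(&target);
	arm_smmu_write_ste(smmu, sid, dst, &target);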

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 +++++++++++----------
 1 file changed, 47 insertions(+), 42 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6430a8d89cb471..13cdb959ec8f58 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1443,6 +1443,24 @@ static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
 	}
 }
 
+static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target)
+{
+	memset(target, 0, sizeof(*target));
+	target->data[0] = cpu_to_le64(
+		STRTAB_STE_0_V |
+		FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT));
+}
+
+static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
+{
+	memset(target, 0, sizeof(*target));
+	target->data[0] = cpu_to_le64(
+		STRTAB_STE_0_V |
+		FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS));
+	target->data[1] = cpu_to_le64(
+		FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
+}
+
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				      struct arm_smmu_ste *dst)
 {
@@ -1453,37 +1471,31 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 	struct arm_smmu_domain *smmu_domain = master->domain;
 	struct arm_smmu_ste target = {};
 
-	if (smmu_domain) {
-		switch (smmu_domain->stage) {
-		case ARM_SMMU_DOMAIN_S1:
-			cd_table = &master->cd_table;
-			break;
-		case ARM_SMMU_DOMAIN_S2:
-			s2_cfg = &smmu_domain->s2_cfg;
-			break;
-		default:
-			break;
-		}
+	if (!smmu_domain) {
+		if (disable_bypass)
+			arm_smmu_make_abort_ste(&target);
+		else
+			arm_smmu_make_bypass_ste(&target);
+		arm_smmu_write_ste(smmu, sid, dst, &target);
+		return;
+	}
+
+	switch (smmu_domain->stage) {
+	case ARM_SMMU_DOMAIN_S1:
+		cd_table = &master->cd_table;
+		break;
+	case ARM_SMMU_DOMAIN_S2:
+		s2_cfg = &smmu_domain->s2_cfg;
+		break;
+	case ARM_SMMU_DOMAIN_BYPASS:
+		arm_smmu_make_bypass_ste(&target);
+		arm_smmu_write_ste(smmu, sid, dst, &target);
+		return;
 	}
 
 	/* Nuke the existing STE_0 value, as we're going to rewrite it */
 	val = STRTAB_STE_0_V;
 
-	/* Bypass/fault */
-	if (!smmu_domain || !(cd_table || s2_cfg)) {
-		if (!smmu_domain && disable_bypass)
-			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
-		else
-			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
-
-		target.data[0] = cpu_to_le64(val);
-		target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
-						STRTAB_STE_1_SHCFG_INCOMING));
-		target.data[2] = 0; /* Nuke the VMID */
-		arm_smmu_write_ste(smmu, sid, dst, &target);
-		return;
-	}
-
 	if (cd_table) {
 		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
 			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
@@ -1529,21 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 }
 
 static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
-				      unsigned int nent, bool force)
+				      unsigned int nent)
 {
 	unsigned int i;
-	u64 val = STRTAB_STE_0_V;
-
-	if (disable_bypass && !force)
-		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
-	else
-		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
 	for (i = 0; i < nent; ++i) {
-		strtab->data[0] = cpu_to_le64(val);
-		strtab->data[1] = cpu_to_le64(FIELD_PREP(
-			STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
-		strtab->data[2] = 0;
+		if (disable_bypass)
+			arm_smmu_make_abort_ste(strtab);
+		else
+			arm_smmu_make_bypass_ste(strtab);
 		strtab++;
 	}
 }
@@ -1571,7 +1577,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
 		return -ENOMEM;
 	}
 
-	arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);
+	arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
 	arm_smmu_write_strtab_l1_desc(strtab, desc);
 	return 0;
 }
@@ -3193,7 +3199,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 	reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
 	cfg->strtab_base_cfg = reg;
 
-	arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false);
+	arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
 	return 0;
 }
 
@@ -3904,7 +3910,6 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
 	iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
 
 	list_for_each_entry(e, &rmr_list, list) {
-		struct arm_smmu_ste *step;
 		struct iommu_iort_rmr_data *rmr;
 		int ret, i;
 
@@ -3917,8 +3922,8 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
 				continue;
 			}
 
-			step = arm_smmu_get_step_for_sid(smmu, rmr->sids[i]);
-			arm_smmu_init_bypass_stes(step, 1, true);
+			arm_smmu_make_bypass_ste(
+				arm_smmu_get_step_for_sid(smmu, rmr->sids[i]));
 		}
 	}
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Logically arm_smmu_init_strtab_linear() is the function that allocates and
populates the stream table with the initial value of the STEs. After this
function returns the stream table should be fully ready.

arm_smmu_rmr_install_bypass_ste() adjusts the initial stream table to force
any SIDs that the FW says have IOMMU_RESV_DIRECT to use bypass. This
ensures there is no disruption to the identity mapping during boot.

Move the arm_smmu_rmr_install_bypass_ste() call into
arm_smmu_init_strtab_linear(), since it already executes immediately after
that function.

No functional change intended.
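
The resulting initialization order, as a sketch of the call sites in the diff
below:

	arm_smmu_init_strtab_linear()
		arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
		arm_smmu_rmr_install_bypass_ste(smmu);	/* RMR SIDs -> bypass */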

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 13cdb959ec8f58..3fc8787db2dbc1 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -86,6 +86,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ 0, NULL},
 };
 
+static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
+
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
 	int i = 0;
@@ -3200,6 +3202,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
 	cfg->strtab_base_cfg = reg;
 
 	arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
+
+	/* Check for RMRs and install bypass STEs if any */
+	arm_smmu_rmr_install_bypass_ste(smmu);
 	return 0;
 }
 
@@ -4013,9 +4018,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	/* Record our private device structure */
 	platform_set_drvdata(pdev, smmu);
 
-	/* Check for RMRs and install bypass STEs if any */
-	arm_smmu_rmr_install_bypass_ste(smmu);
-
 	/* Reset the device */
 	ret = arm_smmu_device_reset(smmu, bypass);
 	if (ret)
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

This is preparation to move the STE calculation higher up into the call
chain and remove arm_smmu_write_strtab_ent(). These new functions will be
called directly from attach_dev.
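
With the helpers in place, the STE selection reduces to a simple switch (a
sketch of the pattern from the diff below):

	struct arm_smmu_ste target = {};

	switch (smmu_domain->stage) {
	case ARM_SMMU_DOMAIN_S1:
		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
		break;
	case ARM_SMMU_DOMAIN_S2:
		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
		break;
	case ARM_SMMU_DOMAIN_BYPASS:
		arm_smmu_make_bypass_ste(&target);
		break;
	}
	arm_smmu_write_ste(smmu, sid, dst, &target);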

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 115 +++++++++++---------
 1 file changed, 63 insertions(+), 52 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 3fc8787db2dbc1..1c63fdebbda9d4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1463,13 +1463,70 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
 		FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
 }
 
+static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
+				      struct arm_smmu_master *master,
+				      struct arm_smmu_ctx_desc_cfg *cd_table)
+{
+	struct arm_smmu_device *smmu = master->smmu;
+
+	memset(target, 0, sizeof(*target));
+	target->data[0] = cpu_to_le64(
+		STRTAB_STE_0_V |
+		FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
+		FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt) |
+		(cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
+		FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
+
+	target->data[1] = cpu_to_le64(
+		FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
+		FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+		FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
+		FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
+		((smmu->features & ARM_SMMU_FEAT_STALLS &&
+		  !master->stall_enabled) ?
+			 STRTAB_STE_1_S1STALLD :
+			 0) |
+		FIELD_PREP(STRTAB_STE_1_EATS,
+			   master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
+		FIELD_PREP(STRTAB_STE_1_STRW,
+			   (smmu->features & ARM_SMMU_FEAT_E2H) ?
+				   STRTAB_STE_1_STRW_EL2 :
+				   STRTAB_STE_1_STRW_NSEL1));
+}
+
+static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
+					struct arm_smmu_master *master,
+					struct arm_smmu_domain *smmu_domain)
+{
+	struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
+
+	memset(target, 0, sizeof(*target));
+
+	target->data[0] = cpu_to_le64(
+		STRTAB_STE_0_V |
+		FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS));
+
+	target->data[1] |= cpu_to_le64(
+		FIELD_PREP(STRTAB_STE_1_EATS,
+			   master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
+
+	target->data[2] = cpu_to_le64(
+		FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
+		FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
+		STRTAB_STE_2_S2AA64 |
+#ifdef __BIG_ENDIAN
+		STRTAB_STE_2_S2ENDI |
+#endif
+		STRTAB_STE_2_S2PTW |
+		STRTAB_STE_2_S2R);
+
+	target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+}
+
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 				      struct arm_smmu_ste *dst)
 {
-	u64 val;
 	struct arm_smmu_device *smmu = master->smmu;
-	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
-	struct arm_smmu_s2_cfg *s2_cfg = NULL;
 	struct arm_smmu_domain *smmu_domain = master->domain;
 	struct arm_smmu_ste target = {};
 
@@ -1484,61 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 
 	switch (smmu_domain->stage) {
 	case ARM_SMMU_DOMAIN_S1:
-		cd_table = &master->cd_table;
+		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
 		break;
 	case ARM_SMMU_DOMAIN_S2:
-		s2_cfg = &smmu_domain->s2_cfg;
+		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
 		break;
 	case ARM_SMMU_DOMAIN_BYPASS:
 		arm_smmu_make_bypass_ste(&target);
-		arm_smmu_write_ste(smmu, sid, dst, &target);
-		return;
+		break;
 	}
-
-	/* Nuke the existing STE_0 value, as we're going to rewrite it */
-	val = STRTAB_STE_0_V;
-
-	if (cd_table) {
-		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
-			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
-
-		target.data[1] = cpu_to_le64(
-			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
-			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
-			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
-			 FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
-			 FIELD_PREP(STRTAB_STE_1_STRW, strw));
-
-		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
-		    !master->stall_enabled)
-			target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
-
-		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
-			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
-			FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
-			FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
-	}
-
-	if (s2_cfg) {
-		target.data[2] = cpu_to_le64(
-			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
-			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
-#ifdef __BIG_ENDIAN
-			 STRTAB_STE_2_S2ENDI |
-#endif
-			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
-			 STRTAB_STE_2_S2R);
-
-		target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
-
-		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
-	}
-
-	if (master->ats_enabled)
-		target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
-						 STRTAB_STE_1_EATS_TRANS));
-
-	target.data[0] = cpu_to_le64(val);
 	arm_smmu_write_ste(smmu, sid, dst, &target);
 }
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Half of the code was living in arm_smmu_domain_finalise_s2(); move it
here and take the values directly from the pgtbl_ops instead of storing
copies.
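
As a sketch of "take the values directly" (drawn from the diff below), the
VTCR field of STE word 2 is now assembled at STE-build time from the live
io_pgtable configuration rather than from a cached copy:

	const struct io_pgtable_cfg *pgtbl_cfg =
		&io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
	typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr =
		&pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
	u64 vtcr_val = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
		       FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
		       FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
		       FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
		       FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
		       FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
		       FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);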

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 ++++++++++++---------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 1c63fdebbda9d4..e80373885d8b19 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1499,6 +1499,11 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
 					struct arm_smmu_domain *smmu_domain)
 {
 	struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
+	const struct io_pgtable_cfg *pgtbl_cfg =
+		&io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
+	typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr =
+		&pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
+	u64 vtcr_val;
 
 	memset(target, 0, sizeof(*target));
 
@@ -1510,9 +1515,16 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
 		FIELD_PREP(STRTAB_STE_1_EATS,
 			   master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
 
+	vtcr_val = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
+		   FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
+		   FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
+		   FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
+		   FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
+		   FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
+		   FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
 	target->data[2] = cpu_to_le64(
 		FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
-		FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
+		FIELD_PREP(STRTAB_STE_2_VTCR, vtcr_val) |
 		STRTAB_STE_2_S2AA64 |
 #ifdef __BIG_ENDIAN
 		STRTAB_STE_2_S2ENDI |
@@ -1520,7 +1532,8 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
 		STRTAB_STE_2_S2PTW |
 		STRTAB_STE_2_S2R);
 
-	target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
+	target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s2_cfg.vttbr &
+				      STRTAB_STE_3_S2TTB_MASK);
 }
 
 static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
@@ -2277,7 +2290,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	int vmid;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
-	typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr;
 
 	/* Reserve VMID 0 for stage-2 bypass STEs */
 	vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
@@ -2285,16 +2297,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	if (vmid < 0)
 		return vmid;
 
-	vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
 	cfg->vmid	= (u16)vmid;
-	cfg->vttbr	= pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
-	cfg->vtcr	= FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
-			  FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
-			  FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
-			  FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
-			  FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
-			  FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
-			  FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
 	return 0;
 }
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 27ddf1acd12cea..1be0c1151c50c3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -609,8 +609,6 @@ struct arm_smmu_ctx_desc_cfg {
 
 struct arm_smmu_s2_cfg {
 	u16				vmid;
-	u64				vttbr;
-	u64				vtcr;
 };
 
 struct arm_smmu_strtab_cfg {
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

The BTM support wants to be able to change the ASID of any smmu_domain.
When it goes to do this it holds the arm_smmu_asid_lock and iterates over
the target domain's devices list.

During attach of an S1 domain we must ensure that the devices list and
CD are in sync, otherwise we could miss CD updates or a parallel CD update
could push an out of date CD.

This is pretty complicated, and it works today only because
arm_smmu_detach_dev() removes the CD table from the STE before working on the
CD entries.

The next patch will allow the CD table to remain in the STE, so solve this
race by holding the lock for a longer period. The lock covers both of the
changes to the device list and the CD table entries.

Move arm_smmu_detach_dev() until after we have initialized the domain so
the lock can be held for less time.
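
The resulting locking shape in arm_smmu_attach_dev(), sketched from the diff
below:

	mutex_lock(&arm_smmu_asid_lock);	/* blocks arm_smmu_share_asid() */
	arm_smmu_detach_dev(master);
	/* ... update master->domain, the devices list and the CD ... */
	arm_smmu_install_ste_for_dev(master);
	arm_smmu_enable_ats(master);
	mutex_unlock(&arm_smmu_asid_lock);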

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e80373885d8b19..b11dc03ee16880 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2560,8 +2560,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -EBUSY;
 	}
 
-	arm_smmu_detach_dev(master);
-
 	mutex_lock(&smmu_domain->init_mutex);
 
 	if (!smmu_domain->smmu) {
@@ -2576,6 +2574,16 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	if (ret)
 		return ret;
 
+	/*
+	 * Prevent arm_smmu_share_asid() from trying to change the ASID
+	 * of either the old or new domain while we are working on it.
+	 * This allows the STE and the smmu_domain->devices list to
+	 * be inconsistent during this routine.
+	 */
+	mutex_lock(&arm_smmu_asid_lock);
+
+	arm_smmu_detach_dev(master);
+
 	master->domain = smmu_domain;
 
 	/*
@@ -2601,13 +2609,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			}
 		}
 
-		/*
-		 * Prevent SVA from concurrently modifying the CD or writing to
-		 * the CD entry
-		 */
-		mutex_lock(&arm_smmu_asid_lock);
 		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
-		mutex_unlock(&arm_smmu_asid_lock);
 		if (ret) {
 			master->domain = NULL;
 			goto out_list_del;
@@ -2617,13 +2619,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	arm_smmu_install_ste_for_dev(master);
 
 	arm_smmu_enable_ats(master);
-	return 0;
+	goto out_unlock;
 
 out_list_del:
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_del(&master->domain_head);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
+out_unlock:
+	mutex_unlock(&arm_smmu_asid_lock);
 	return ret;
 }
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Currently arm_smmu_install_ste_for_dev() iterates over every SID and
computes from scratch an identical STE. Every SID should have the same STE
contents. Turn this inside out so that the STE is supplied by the caller
and arm_smmu_install_ste_for_dev() simply installs it to every SID.

This is possible now that the STE generation no longer dictates the sequence
used to program it.

This allows splitting the STE calculation up according to the call site,
which following patches will make use of, and removes the confusing NULL
domain special case that only supported arm_smmu_detach_dev().
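
The inverted pattern, sketched from the diff below: the attach path computes
the STE once and the install helper only replicates it to each of the
master's SIDs:

	struct arm_smmu_ste target;

	arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
	arm_smmu_install_ste_for_dev(master, &target);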

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 58 ++++++++-------------
 1 file changed, 22 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b11dc03ee16880..4b157c2ddf9a80 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1536,36 +1536,6 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
 				      STRTAB_STE_3_S2TTB_MASK);
 }
 
-static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
-				      struct arm_smmu_ste *dst)
-{
-	struct arm_smmu_device *smmu = master->smmu;
-	struct arm_smmu_domain *smmu_domain = master->domain;
-	struct arm_smmu_ste target = {};
-
-	if (!smmu_domain) {
-		if (disable_bypass)
-			arm_smmu_make_abort_ste(&target);
-		else
-			arm_smmu_make_bypass_ste(&target);
-		arm_smmu_write_ste(smmu, sid, dst, &target);
-		return;
-	}
-
-	switch (smmu_domain->stage) {
-	case ARM_SMMU_DOMAIN_S1:
-		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
-		break;
-	case ARM_SMMU_DOMAIN_S2:
-		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
-		break;
-	case ARM_SMMU_DOMAIN_BYPASS:
-		arm_smmu_make_bypass_ste(&target);
-		break;
-	}
-	arm_smmu_write_ste(smmu, sid, dst, &target);
-}
-
 static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
 				      unsigned int nent)
 {
@@ -2387,7 +2357,8 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	}
 }
 
-static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
+static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
+					 const struct arm_smmu_ste *target)
 {
 	int i, j;
 	struct arm_smmu_device *smmu = master->smmu;
@@ -2404,7 +2375,7 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 		if (j < i)
 			continue;
 
-		arm_smmu_write_strtab_ent(master, sid, step);
+		arm_smmu_write_ste(smmu, sid, step, target);
 	}
 }
 
@@ -2511,6 +2482,7 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
 static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 {
 	unsigned long flags;
+	struct arm_smmu_ste target;
 	struct arm_smmu_domain *smmu_domain = master->domain;
 
 	if (!smmu_domain)
@@ -2524,7 +2496,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 
 	master->domain = NULL;
 	master->ats_enabled = false;
-	arm_smmu_install_ste_for_dev(master);
+	if (disable_bypass)
+		arm_smmu_make_abort_ste(&target);
+	else
+		arm_smmu_make_bypass_ste(&target);
+	arm_smmu_install_ste_for_dev(master, &target);
 	/*
 	 * Clearing the CD entry isn't strictly required to detach the domain
 	 * since the table is uninstalled anyway, but it helps avoid confusion
@@ -2539,6 +2515,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret = 0;
 	unsigned long flags;
+	struct arm_smmu_ste target;
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -2600,7 +2577,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	list_add(&master->domain_head, &smmu_domain->devices);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+	switch (smmu_domain->stage) {
+	case ARM_SMMU_DOMAIN_S1:
 		if (!master->cd_table.cdtab) {
 			ret = arm_smmu_alloc_cd_tables(master);
 			if (ret) {
@@ -2614,9 +2592,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			master->domain = NULL;
 			goto out_list_del;
 		}
-	}
 
-	arm_smmu_install_ste_for_dev(master);
+		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
+		break;
+	case ARM_SMMU_DOMAIN_S2:
+		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+		break;
+	case ARM_SMMU_DOMAIN_BYPASS:
+		arm_smmu_make_bypass_ste(&target);
+		break;
+	}
+	arm_smmu_install_ste_for_dev(master, &target);
 
 	arm_smmu_enable_ats(master);
 	goto out_unlock;
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Changing the STE twice was needed because the STE code required the STE to
be in ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the
STE code can automatically handle all transitions we can remove this step
from the attach_dev flow.

A few small bugs exist because of this:

1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
   then there will be a moment where the STE points at BYPASS. Since
   this can be done by VFIO/IOMMUFD it is a small security race.

2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
   regions will temporarily become BLOCKED. We'd like drivers to
   work in a way that allows IOMMU_RESV_DIRECT to be continuously
   functional during these transitions.

Make arm_smmu_release_device() put the STE back to the correct
ABORT/BYPASS setting. Fix a bug where an IOMMU_RESV_DIRECT region was
ignored on this path.

Notice this subtly depends on the prior arm_smmu_asid_lock change, as the
STE must be put to non-paging before removing the device from the linked
list to avoid races with arm_smmu_share_asid().
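
A hedged sketch of the resulting release path (condensed from the hunk
below):

	/* Put the STE back to what arm_smmu_init_strtab() sets */
	if (disable_bypass && !dev->iommu->require_direct)
		arm_smmu_make_abort_ste(&target);
	else
		arm_smmu_make_bypass_ste(&target);
	arm_smmu_install_ste_for_dev(master, &target);

	/* Only then may the master drop off the domain's device list. */
	arm_smmu_detach_dev(master);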

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4b157c2ddf9a80..f70862806211de 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2482,7 +2482,6 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
 static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 {
 	unsigned long flags;
-	struct arm_smmu_ste target;
 	struct arm_smmu_domain *smmu_domain = master->domain;
 
 	if (!smmu_domain)
@@ -2496,11 +2495,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 
 	master->domain = NULL;
 	master->ats_enabled = false;
-	if (disable_bypass)
-		arm_smmu_make_abort_ste(&target);
-	else
-		arm_smmu_make_bypass_ste(&target);
-	arm_smmu_install_ste_for_dev(master, &target);
 	/*
 	 * Clearing the CD entry isn't strictly required to detach the domain
 	 * since the table is uninstalled anyway, but it helps avoid confusion
@@ -2852,9 +2846,18 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 static void arm_smmu_release_device(struct device *dev)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	struct arm_smmu_ste target;
 
 	if (WARN_ON(arm_smmu_master_sva_enabled(master)))
 		iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
+
+	/* Put the STE back to what arm_smmu_init_strtab() sets */
+	if (disable_bypass && !dev->iommu->require_direct)
+		arm_smmu_make_abort_ste(&target);
+	else
+		arm_smmu_make_bypass_ste(&target);
+	arm_smmu_install_ste_for_dev(master, &target);
+
 	arm_smmu_detach_dev(master);
 	arm_smmu_disable_pasid(master);
 	arm_smmu_remove_master(master);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Get closer to the IOMMU API ideal that changes between domains can be
hitless. The ordering for the CD table entry is not entirely clean from
this perspective.

When switching away from a STE with a CD table programmed in it we should
write the new STE first, then clear any old data in the CD entry.

If we are programming a CD table for the first time to a STE then the CD
entry should be programmed before the STE is loaded.

If we are replacing a CD table entry when the STE already points at the CD
entry then we just need to do the make/break sequence.
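
As a hedged sketch, the first two orderings come out like this (condensed
from the hunks below; not a literal excerpt):

	/* Switching away from a CD table: new STE first, CD clear second. */
	arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
	arm_smmu_install_ste_for_dev(master, &target);
	if (master->cd_table.cdtab)
		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);

	/* First attach of a CD table: program the CD, then load the STE. */
	arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
	arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
	arm_smmu_install_ste_for_dev(master, &target);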

Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
properly. The only other caller is arm_smmu_release_device() and it is
going to free the cdtable anyhow, so it doesn't matter what is in it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 ++++++++++++++-------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f70862806211de..eb5dcd357a42b8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2495,14 +2495,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 
 	master->domain = NULL;
 	master->ats_enabled = false;
-	/*
-	 * Clearing the CD entry isn't strictly required to detach the domain
-	 * since the table is uninstalled anyway, but it helps avoid confusion
-	 * in the call to arm_smmu_write_ctx_desc on the next attach (which
-	 * expects the entry to be empty).
-	 */
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab)
-		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -2579,6 +2571,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 				master->domain = NULL;
 				goto out_list_del;
 			}
+		} else {
+			/*
+			 * arm_smmu_write_ctx_desc() relies on the entry being
+			 * invalid to work, so clear any existing entry.
+			 */
+			ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+						      NULL);
+			if (ret) {
+				master->domain = NULL;
+				goto out_list_del;
+			}
 		}
 
 		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
@@ -2588,15 +2591,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		}
 
 		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
+		arm_smmu_install_ste_for_dev(master, &target);
 		break;
 	case ARM_SMMU_DOMAIN_S2:
 		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+		arm_smmu_install_ste_for_dev(master, &target);
+		if (master->cd_table.cdtab)
+			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+						      NULL);
 		break;
 	case ARM_SMMU_DOMAIN_BYPASS:
 		arm_smmu_make_bypass_ste(&target);
+		arm_smmu_install_ste_for_dev(master, &target);
+		if (master->cd_table.cdtab)
+			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+						      NULL);
 		break;
 	}
-	arm_smmu_install_ste_for_dev(master, &target);
 
 	arm_smmu_enable_ats(master);
 	goto out_unlock;
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

The caller already has the domain; just pass it in. A following patch will
remove master->domain.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index eb5dcd357a42b8..7d2dd3ea47ab68 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2394,12 +2394,12 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
 	return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev));
 }
 
-static void arm_smmu_enable_ats(struct arm_smmu_master *master)
+static void arm_smmu_enable_ats(struct arm_smmu_master *master,
+				struct arm_smmu_domain *smmu_domain)
 {
 	size_t stu;
 	struct pci_dev *pdev;
 	struct arm_smmu_device *smmu = master->smmu;
-	struct arm_smmu_domain *smmu_domain = master->domain;
 
 	/* Don't enable ATS at the endpoint if it's not enabled in the STE */
 	if (!master->ats_enabled)
@@ -2415,10 +2415,9 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
 		dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
 }
 
-static void arm_smmu_disable_ats(struct arm_smmu_master *master)
+static void arm_smmu_disable_ats(struct arm_smmu_master *master,
+				 struct arm_smmu_domain *smmu_domain)
 {
-	struct arm_smmu_domain *smmu_domain = master->domain;
-
 	if (!master->ats_enabled)
 		return;
 
@@ -2487,7 +2486,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 	if (!smmu_domain)
 		return;
 
-	arm_smmu_disable_ats(master);
+	arm_smmu_disable_ats(master, smmu_domain);
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_del(&master->domain_head);
@@ -2609,7 +2608,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		break;
 	}
 
-	arm_smmu_enable_ats(master);
+	arm_smmu_enable_ats(master, smmu_domain);
 	goto out_unlock;
 
 out_list_del:
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Introducing global statics of type struct iommu_domain, not struct
arm_smmu_domain, makes it difficult to retain arm_smmu_master->domain, as
it can no longer point to an IDENTITY or BLOCKED domain.

The only place that uses the value is arm_smmu_detach_dev(). Change things
to work like other drivers and call iommu_get_domain_for_dev() to obtain
the current domain.
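
A hedged sketch of the resulting detach pattern (condensed from the hunk
below):

	struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
	struct arm_smmu_domain *smmu_domain;

	if (!domain)
		return;
	smmu_domain = to_smmu_domain(domain);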

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +++++++--------------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
 2 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7d2dd3ea47ab68..23dda64722ea17 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2480,19 +2480,20 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
 
 static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 {
+	struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
+	struct arm_smmu_domain *smmu_domain;
 	unsigned long flags;
-	struct arm_smmu_domain *smmu_domain = master->domain;
 
-	if (!smmu_domain)
+	if (!domain)
 		return;
 
+	smmu_domain = to_smmu_domain(domain);
 	arm_smmu_disable_ats(master, smmu_domain);
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_del(&master->domain_head);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
-	master->domain = NULL;
 	master->ats_enabled = false;
 }
 
@@ -2546,8 +2547,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	arm_smmu_detach_dev(master);
 
-	master->domain = smmu_domain;
-
 	/*
 	 * The SMMU does not support enabling ATS with bypass. When the STE is
 	 * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
@@ -2566,10 +2565,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	case ARM_SMMU_DOMAIN_S1:
 		if (!master->cd_table.cdtab) {
 			ret = arm_smmu_alloc_cd_tables(master);
-			if (ret) {
-				master->domain = NULL;
+			if (ret)
 				goto out_list_del;
-			}
 		} else {
 			/*
 			 * arm_smmu_write_ctx_desc() relies on the entry being
@@ -2577,17 +2574,13 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			 */
 			ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
 						      NULL);
-			if (ret) {
-				master->domain = NULL;
+			if (ret)
 				goto out_list_del;
-			}
 		}
 
 		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
-		if (ret) {
-			master->domain = NULL;
+		if (ret)
 			goto out_list_del;
-		}
 
 		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
 		arm_smmu_install_ste_for_dev(master, &target);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1be0c1151c50c3..21f2f73501019a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -695,7 +695,6 @@ struct arm_smmu_stream {
 struct arm_smmu_master {
 	struct arm_smmu_device		*smmu;
 	struct device			*dev;
-	struct arm_smmu_domain		*domain;
 	struct list_head		domain_head;
 	struct arm_smmu_stream		*streams;
 	/* Locked by the iommu core using the group mutex */
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Move to the new global static for identity domains. Move all the identity
logic out of arm_smmu_attach_dev() into an identity-only function.
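
The shape of the change, as a hedged sketch (condensed from the diff
below): a singleton domain object wired into the ops, so the core never
has to allocate an IDENTITY domain:

	static const struct iommu_domain_ops arm_smmu_identity_ops = {
		.attach_dev = arm_smmu_attach_dev_identity,
	};

	static struct iommu_domain arm_smmu_identity_domain = {
		.type = IOMMU_DOMAIN_IDENTITY,
		.ops = &arm_smmu_identity_ops,
	};

with arm_smmu_ops gaining .identity_domain = &arm_smmu_identity_domain.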

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 82 +++++++++++++++------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
 2 files changed, 58 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 23dda64722ea17..d6f68a6187d290 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2174,8 +2174,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 		return arm_smmu_sva_domain_alloc();
 
 	if (type != IOMMU_DOMAIN_UNMANAGED &&
-	    type != IOMMU_DOMAIN_DMA &&
-	    type != IOMMU_DOMAIN_IDENTITY)
+	    type != IOMMU_DOMAIN_DMA)
 		return NULL;
 
 	/*
@@ -2283,11 +2282,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
-	if (domain->type == IOMMU_DOMAIN_IDENTITY) {
-		smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
-		return 0;
-	}
-
 	/* Restrict the stage to what we can actually support */
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
@@ -2484,7 +2478,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 	struct arm_smmu_domain *smmu_domain;
 	unsigned long flags;
 
-	if (!domain)
+	if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
 		return;
 
 	smmu_domain = to_smmu_domain(domain);
@@ -2547,15 +2541,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	arm_smmu_detach_dev(master);
 
-	/*
-	 * The SMMU does not support enabling ATS with bypass. When the STE is
-	 * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
-	 * Translated transactions are denied as though ATS is disabled for the
-	 * stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
-	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
-	 */
-	if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS)
-		master->ats_enabled = arm_smmu_ats_supported(master);
+	master->ats_enabled = arm_smmu_ats_supported(master);
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_add(&master->domain_head, &smmu_domain->devices);
@@ -2592,13 +2578,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
 						      NULL);
 		break;
-	case ARM_SMMU_DOMAIN_BYPASS:
-		arm_smmu_make_bypass_ste(&target);
-		arm_smmu_install_ste_for_dev(master, &target);
-		if (master->cd_table.cdtab)
-			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
-						      NULL);
-		break;
 	}
 
 	arm_smmu_enable_ats(master, smmu_domain);
@@ -2614,6 +2593,60 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	return ret;
 }
 
+static int arm_smmu_attach_dev_ste(struct device *dev,
+				   struct arm_smmu_ste *ste)
+{
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+	if (arm_smmu_master_sva_enabled(master))
+		return -EBUSY;
+
+	/*
+	 * Do not allow any ASID to be changed while we are working on the STE,
+	 * otherwise we could miss invalidations.
+	 */
+	mutex_lock(&arm_smmu_asid_lock);
+
+	/*
+	 * The SMMU does not support enabling ATS with bypass/abort. When the
+	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
+	 * and Translated transactions are denied as though ATS is disabled for
+	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
+	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
+	 */
+	arm_smmu_detach_dev(master);
+
+	arm_smmu_install_ste_for_dev(master, ste);
+	mutex_unlock(&arm_smmu_asid_lock);
+
+	/*
+	 * This has to be done after removing the master from the
+	 * arm_smmu_domain->devices to avoid races updating the same context
+	 * descriptor from arm_smmu_share_asid().
+	 */
+	if (master->cd_table.cdtab)
+		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
+	return 0;
+}
+
+static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
+					struct device *dev)
+{
+	struct arm_smmu_ste ste;
+
+	arm_smmu_make_bypass_ste(&ste);
+	return arm_smmu_attach_dev_ste(dev, &ste);
+}
+
+static const struct iommu_domain_ops arm_smmu_identity_ops = {
+	.attach_dev = arm_smmu_attach_dev_identity,
+};
+
+static struct iommu_domain arm_smmu_identity_domain = {
+	.type = IOMMU_DOMAIN_IDENTITY,
+	.ops = &arm_smmu_identity_ops,
+};
+
 static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
 			      phys_addr_t paddr, size_t pgsize, size_t pgcount,
 			      int prot, gfp_t gfp, size_t *mapped)
@@ -3006,6 +3039,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
 }
 
 static struct iommu_ops arm_smmu_ops = {
+	.identity_domain	= &arm_smmu_identity_domain,
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
 	.probe_device		= arm_smmu_probe_device,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 21f2f73501019a..154808f96718df 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -712,7 +712,6 @@ struct arm_smmu_master {
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
 	ARM_SMMU_DOMAIN_S2,
-	ARM_SMMU_DOMAIN_BYPASS,
 };
 
 struct arm_smmu_domain {
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Using the same design as the IDENTITY domain, install an
STRTAB_STE_0_CFG_ABORT STE.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index d6f68a6187d290..48981c2ff7a746 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2647,6 +2647,24 @@ static struct iommu_domain arm_smmu_identity_domain = {
 	.ops = &arm_smmu_identity_ops,
 };
 
+static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
+					struct device *dev)
+{
+	struct arm_smmu_ste ste;
+
+	arm_smmu_make_abort_ste(&ste);
+	return arm_smmu_attach_dev_ste(dev, &ste);
+}
+
+static const struct iommu_domain_ops arm_smmu_blocked_ops = {
+	.attach_dev = arm_smmu_attach_dev_blocked,
+};
+
+static struct iommu_domain arm_smmu_blocked_domain = {
+	.type = IOMMU_DOMAIN_BLOCKED,
+	.ops = &arm_smmu_blocked_ops,
+};
+
 static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
 			      phys_addr_t paddr, size_t pgsize, size_t pgcount,
 			      int prot, gfp_t gfp, size_t *mapped)
@@ -3040,6 +3058,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
 
 static struct iommu_ops arm_smmu_ops = {
 	.identity_domain	= &arm_smmu_identity_domain,
+	.blocked_domain		= &arm_smmu_blocked_domain,
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
 	.probe_device		= arm_smmu_probe_device,
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Consolidate some more code by having release call
arm_smmu_attach_dev_identity/blocked() instead of open coding this.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 48981c2ff7a746..331568e086c70a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2900,19 +2900,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 static void arm_smmu_release_device(struct device *dev)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-	struct arm_smmu_ste target;
 
 	if (WARN_ON(arm_smmu_master_sva_enabled(master)))
 		iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
 
 	/* Put the STE back to what arm_smmu_init_strtab() sets */
 	if (disable_bypass && !dev->iommu->require_direct)
-		arm_smmu_make_abort_ste(&target);
+		arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev);
 	else
-		arm_smmu_make_bypass_ste(&target);
-	arm_smmu_install_ste_for_dev(master, &target);
+		arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev);
 
-	arm_smmu_detach_dev(master);
 	arm_smmu_disable_pasid(master);
 	arm_smmu_remove_master(master);
 	if (master->cd_table.cdtab)
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Instead of putting container_of() casts in the internals, use the proper
type in this call chain. This makes it easier to check that the two global
static domains are not leaking into call chains where they do not belong.

Passing in the smmu spares the only caller from having to set it and then
unset it on the error path.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 ++++++++++-----------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 331568e086c70a..50c26792391b56 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -87,6 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
 };
 
 static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
+static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
+				    struct arm_smmu_device *smmu);
 
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
@@ -2216,12 +2218,12 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	kfree(smmu_domain);
 }
 
-static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
+static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
+				       struct arm_smmu_domain *smmu_domain,
 				       struct io_pgtable_cfg *pgtbl_cfg)
 {
 	int ret;
 	u32 asid;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
 	typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 
@@ -2253,11 +2255,11 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
 	return ret;
 }
 
-static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
+static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
+				       struct arm_smmu_domain *smmu_domain,
 				       struct io_pgtable_cfg *pgtbl_cfg)
 {
 	int vmid;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
 
 	/* Reserve VMID 0 for stage-2 bypass STEs */
@@ -2270,17 +2272,17 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
-static int arm_smmu_domain_finalise(struct iommu_domain *domain)
+static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
+				    struct arm_smmu_device *smmu)
 {
 	int ret;
 	unsigned long ias, oas;
 	enum io_pgtable_fmt fmt;
 	struct io_pgtable_cfg pgtbl_cfg;
 	struct io_pgtable_ops *pgtbl_ops;
-	int (*finalise_stage_fn)(struct arm_smmu_domain *,
-				 struct io_pgtable_cfg *);
-	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	int (*finalise_stage_fn)(struct arm_smmu_device *smmu,
+				 struct arm_smmu_domain *smmu_domain,
+				 struct io_pgtable_cfg *pgtbl_cfg);
 
 	/* Restrict the stage to what we can actually support */
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
@@ -2319,17 +2321,18 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	if (!pgtbl_ops)
 		return -ENOMEM;
 
-	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
-	domain->geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
-	domain->geometry.force_aperture = true;
+	smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
+	smmu_domain->domain.geometry.force_aperture = true;
 
-	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
+	ret = finalise_stage_fn(smmu, smmu_domain, &pgtbl_cfg);
 	if (ret < 0) {
 		free_io_pgtable_ops(pgtbl_ops);
 		return ret;
 	}
 
 	smmu_domain->pgtbl_ops = pgtbl_ops;
+	smmu_domain->smmu = smmu;
 	return 0;
 }
 
@@ -2520,10 +2523,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	mutex_lock(&smmu_domain->init_mutex);
 
 	if (!smmu_domain->smmu) {
-		smmu_domain->smmu = smmu;
-		ret = arm_smmu_domain_finalise(domain);
-		if (ret)
-			smmu_domain->smmu = NULL;
+		ret = arm_smmu_domain_finalise(smmu_domain, smmu);
 	} else if (smmu_domain->smmu != smmu)
 		ret = -EINVAL;
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v2 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging()
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-13 17:53   ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-13 17:53 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Now that the BLOCKED and IDENTITY behaviors are managed with their own
domains, change over to the domain_alloc_paging() op.

For now SVA keeps using the old interface; eventually it will get its own
op that can pass in the device and mm_struct, which will let us have a
sane lifetime for the mmu_notifier.

Call arm_smmu_domain_finalise() early if the dev is available.

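For reference, a rough sketch of how the core dispatches between the two
ops after this change (paraphrased from the v6.7-era
__iommu_domain_alloc(); not a verbatim copy):

  /* sketch: paging allocations prefer the new dev-aware op */
  if (type & __IOMMU_DOMAIN_PAGING && ops->domain_alloc_paging)
          domain = ops->domain_alloc_paging(dev);
  else if (ops->domain_alloc)
          domain = ops->domain_alloc(type);

SVA allocations still arrive through ops->domain_alloc(IOMMU_DOMAIN_SVA),
which is why arm_smmu_domain_alloc() is kept as a thin SVA-only wrapper.
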
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 50c26792391b56..5667521bd18091 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2170,14 +2170,15 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
 
 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 {
-	struct arm_smmu_domain *smmu_domain;
 
 	if (type == IOMMU_DOMAIN_SVA)
 		return arm_smmu_sva_domain_alloc();
+	return NULL;
+}
 
-	if (type != IOMMU_DOMAIN_UNMANAGED &&
-	    type != IOMMU_DOMAIN_DMA)
-		return NULL;
+static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
+{
+	struct arm_smmu_domain *smmu_domain;
 
 	/*
 	 * Allocate the domain and initialise some of its data structures.
@@ -2193,6 +2194,14 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	spin_lock_init(&smmu_domain->devices_lock);
 	INIT_LIST_HEAD(&smmu_domain->mmu_notifiers);
 
+	if (dev) {
+		struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+
+		if (arm_smmu_domain_finalise(smmu_domain, master->smmu)) {
+			kfree(smmu_domain);
+			return NULL;
+		}
+	}
 	return &smmu_domain->domain;
 }
 
@@ -3058,6 +3067,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.blocked_domain		= &arm_smmu_blocked_domain,
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
+	.domain_alloc_paging    = arm_smmu_domain_alloc_paging,
 	.probe_device		= arm_smmu_probe_device,
 	.release_device		= arm_smmu_release_device,
 	.device_group		= arm_smmu_device_group,
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-14 15:06     ` Moritz Fischer
  0 siblings, 0 replies; 158+ messages in thread
From: Moritz Fischer @ 2023-11-14 15:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:08PM -0400, Jason Gunthorpe wrote:
> Instead of passing a naked __le64 * around to represent an STE, wrap it in a
> "struct arm_smmu_ste" with an array of the correct size. This makes it
> much clearer which functions will comprise the "STE API".
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 54 ++++++++++-----------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  7 ++-
>  2 files changed, 32 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 7445454c2af244..519749d15fbda0 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1249,7 +1249,7 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
>  }
>  
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> -				      __le64 *dst)
> +				      struct arm_smmu_ste *dst)
>  {
>  	/*
>  	 * This is hideously complicated, but we only really care about
> @@ -1267,7 +1267,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  	 * 2. Write everything apart from dword 0, sync, write dword 0, sync
>  	 * 3. Update Config, sync
>  	 */
> -	u64 val = le64_to_cpu(dst[0]);
> +	u64 val = le64_to_cpu(dst->data[0]);
>  	bool ste_live = false;
>  	struct arm_smmu_device *smmu = NULL;
>  	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> @@ -1325,10 +1325,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  		else
>  			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>  
> -		dst[0] = cpu_to_le64(val);
> -		dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> +		dst->data[0] = cpu_to_le64(val);
> +		dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
>  						STRTAB_STE_1_SHCFG_INCOMING));
> -		dst[2] = 0; /* Nuke the VMID */
> +		dst->data[2] = 0; /* Nuke the VMID */
>  		/*
>  		 * The SMMU can perform negative caching, so we must sync
>  		 * the STE regardless of whether the old value was live.
> @@ -1343,7 +1343,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
>  
>  		BUG_ON(ste_live);
> -		dst[1] = cpu_to_le64(
> +		dst->data[1] = cpu_to_le64(
>  			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
>  			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
>  			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> @@ -1352,7 +1352,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  
>  		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
>  		    !master->stall_enabled)
> -			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> +			dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
>  
>  		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
>  			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> @@ -1362,7 +1362,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  
>  	if (s2_cfg) {
>  		BUG_ON(ste_live);
> -		dst[2] = cpu_to_le64(
> +		dst->data[2] = cpu_to_le64(
>  			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
>  			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
>  #ifdef __BIG_ENDIAN
> @@ -1371,18 +1371,18 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
>  			 STRTAB_STE_2_S2R);
>  
> -		dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> +		dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
>  
>  		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
>  	}
>  
>  	if (master->ats_enabled)
> -		dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> +		dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
>  						 STRTAB_STE_1_EATS_TRANS));
>  
>  	arm_smmu_sync_ste_for_sid(smmu, sid);
>  	/* See comment in arm_smmu_write_ctx_desc() */
> -	WRITE_ONCE(dst[0], cpu_to_le64(val));
> +	WRITE_ONCE(dst->data[0], cpu_to_le64(val));
>  	arm_smmu_sync_ste_for_sid(smmu, sid);
>  
>  	/* It's likely that we'll want to use the new STE soon */
> @@ -1390,7 +1390,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
>  }
>  
> -static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
> +static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
> +				      unsigned int nent, bool force)
>  {
>  	unsigned int i;
>  	u64 val = STRTAB_STE_0_V;
> @@ -1401,11 +1402,11 @@ static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool fo
>  		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>  
>  	for (i = 0; i < nent; ++i) {
> -		strtab[0] = cpu_to_le64(val);
> -		strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> -						   STRTAB_STE_1_SHCFG_INCOMING));
> -		strtab[2] = 0;
> -		strtab += STRTAB_STE_DWORDS;
> +		strtab->data[0] = cpu_to_le64(val);
> +		strtab->data[1] = cpu_to_le64(FIELD_PREP(
> +			STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
> +		strtab->data[2] = 0;
> +		strtab++;
>  	}
>  }
>  
> @@ -2209,26 +2210,22 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>  	return 0;
>  }
>  
> -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> +static struct arm_smmu_ste *
> +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
>  {
> -	__le64 *step;
>  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>  
>  	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> -		struct arm_smmu_strtab_l1_desc *l1_desc;
>  		int idx;
>  
>  		/* Two-level walk */
>  		idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> -		l1_desc = &cfg->l1_desc[idx];
> -		idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
> -		step = &l1_desc->l2ptr[idx];
> +		return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
>  	} else {
>  		/* Simple linear lookup */
> -		step = &cfg->strtab[sid * STRTAB_STE_DWORDS];
> +		return (struct arm_smmu_ste *)&cfg
> +			       ->strtab[sid * STRTAB_STE_DWORDS];
>  	}
> -
> -	return step;
>  }
>  
>  static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
> @@ -2238,7 +2235,8 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
>  
>  	for (i = 0; i < master->num_streams; ++i) {
>  		u32 sid = master->streams[i].id;
> -		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
> +		struct arm_smmu_ste *step =
> +			arm_smmu_get_step_for_sid(smmu, sid);
>  
>  		/* Bridged PCI devices may end up with duplicated IDs */
>  		for (j = 0; j < i; j++)
> @@ -3769,7 +3767,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
>  	iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
>  
>  	list_for_each_entry(e, &rmr_list, list) {
> -		__le64 *step;
> +		struct arm_smmu_ste *step;
>  		struct iommu_iort_rmr_data *rmr;
>  		int ret, i;
>  
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 961205ba86d25d..03f9e526cbd92f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -206,6 +206,11 @@
>  #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
>  
>  #define STRTAB_STE_DWORDS		8
> +
> +struct arm_smmu_ste {
> +	__le64 data[STRTAB_STE_DWORDS];
> +};
> +
>  #define STRTAB_STE_0_V			(1UL << 0)
>  #define STRTAB_STE_0_CFG		GENMASK_ULL(3, 1)
>  #define STRTAB_STE_0_CFG_ABORT		0
> @@ -571,7 +576,7 @@ struct arm_smmu_priq {
>  struct arm_smmu_strtab_l1_desc {
>  	u8				span;
>  
> -	__le64				*l2ptr;
> +	struct arm_smmu_ste		*l2ptr;
>  	dma_addr_t			l2ptr_dma;
>  };
>  
> -- 
> 2.42.0
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-14 15:17     ` Moritz Fischer
  0 siblings, 0 replies; 158+ messages in thread
From: Moritz Fischer @ 2023-11-14 15:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:09PM -0400, Jason Gunthorpe wrote:
> The only caller is arm_smmu_install_ste_for_dev(), which never has a
> NULL master. Remove the confusing if.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 519749d15fbda0..9117e769a965e1 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1269,10 +1269,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  	 */
>  	u64 val = le64_to_cpu(dst->data[0]);
>  	bool ste_live = false;
> -	struct arm_smmu_device *smmu = NULL;
> +	struct arm_smmu_device *smmu = master->smmu;
>  	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
>  	struct arm_smmu_s2_cfg *s2_cfg = NULL;
> -	struct arm_smmu_domain *smmu_domain = NULL;
> +	struct arm_smmu_domain *smmu_domain = master->domain;
>  	struct arm_smmu_cmdq_ent prefetch_cmd = {
>  		.opcode		= CMDQ_OP_PREFETCH_CFG,
>  		.prefetch	= {
> @@ -1280,11 +1280,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  		},
>  	};
>  
> -	if (master) {
> -		smmu_domain = master->domain;
> -		smmu = master->smmu;
> -	}
> -
>  	if (smmu_domain) {
>  		switch (smmu_domain->stage) {
>  		case ARM_SMMU_DOMAIN_S1:
> -- 
> 2.42.0
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-14 15:18     ` Moritz Fischer
  0 siblings, 0 replies; 158+ messages in thread
From: Moritz Fischer @ 2023-11-14 15:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:10PM -0400, Jason Gunthorpe wrote:
> Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove
> it. The ongoing work to add nesting support through iommufd will do
> something a little different.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
>  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 9117e769a965e1..bf7218adbc2822 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1286,7 +1286,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  			cd_table = &master->cd_table;
>  			break;
>  		case ARM_SMMU_DOMAIN_S2:
> -		case ARM_SMMU_DOMAIN_NESTED:
>  			s2_cfg = &smmu_domain->s2_cfg;
>  			break;
>  		default:
> @@ -2167,7 +2166,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>  		fmt = ARM_64_LPAE_S1;
>  		finalise_stage_fn = arm_smmu_domain_finalise_s1;
>  		break;
> -	case ARM_SMMU_DOMAIN_NESTED:
>  	case ARM_SMMU_DOMAIN_S2:
>  		ias = smmu->ias;
>  		oas = smmu->oas;
> @@ -2735,7 +2733,7 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
>  	if (smmu_domain->smmu)
>  		ret = -EPERM;
>  	else
> -		smmu_domain->stage = ARM_SMMU_DOMAIN_NESTED;
> +		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
>  	mutex_unlock(&smmu_domain->init_mutex);
>  
>  	return ret;
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 03f9e526cbd92f..27ddf1acd12cea 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -715,7 +715,6 @@ struct arm_smmu_master {
>  enum arm_smmu_domain_stage {
>  	ARM_SMMU_DOMAIN_S1 = 0,
>  	ARM_SMMU_DOMAIN_S2,
> -	ARM_SMMU_DOMAIN_NESTED,
>  	ARM_SMMU_DOMAIN_BYPASS,
>  };
>  
> -- 
> 2.42.0
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-14 15:24     ` Moritz Fischer
  0 siblings, 0 replies; 158+ messages in thread
From: Moritz Fischer @ 2023-11-14 15:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:14PM -0400, Jason Gunthorpe wrote:
> This is preparation to move the STE calculation higher up into the call
> chain and remove arm_smmu_write_strtab_ent(). These new functions will be
> called directly from attach_dev.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 115 +++++++++++---------
>  1 file changed, 63 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 3fc8787db2dbc1..1c63fdebbda9d4 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1463,13 +1463,70 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
>  		FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
>  }
>  
> +static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> +				      struct arm_smmu_master *master,
> +				      struct arm_smmu_ctx_desc_cfg *cd_table)
> +{
> +	struct arm_smmu_device *smmu = master->smmu;
> +
> +	memset(target, 0, sizeof(*target));
> +	target->data[0] = cpu_to_le64(
> +		STRTAB_STE_0_V |
> +		FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> +		FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt) |
> +		(cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> +		FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
> +
> +	target->data[1] = cpu_to_le64(
> +		FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> +		FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> +		FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> +		FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
> +		((smmu->features & ARM_SMMU_FEAT_STALLS &&
> +		  !master->stall_enabled) ?
> +			 STRTAB_STE_1_S1STALLD :
> +			 0) |
> +		FIELD_PREP(STRTAB_STE_1_EATS,
> +			   master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
> +		FIELD_PREP(STRTAB_STE_1_STRW,
> +			   (smmu->features & ARM_SMMU_FEAT_E2H) ?
> +				   STRTAB_STE_1_STRW_EL2 :
> +				   STRTAB_STE_1_STRW_NSEL1));
> +}
> +
> +static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
> +					struct arm_smmu_master *master,
> +					struct arm_smmu_domain *smmu_domain)
> +{
> +	struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
> +
> +	memset(target, 0, sizeof(*target));
> +
> +	target->data[0] = cpu_to_le64(
> +		STRTAB_STE_0_V |
> +		FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS));
> +
> +	target->data[1] |= cpu_to_le64(
> +		FIELD_PREP(STRTAB_STE_1_EATS,
> +			   master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
> +
> +	target->data[2] = cpu_to_le64(
> +		FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> +		FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> +		STRTAB_STE_2_S2AA64 |
> +#ifdef __BIG_ENDIAN
> +		STRTAB_STE_2_S2ENDI |
> +#endif
> +		STRTAB_STE_2_S2PTW |
> +		STRTAB_STE_2_S2R);
> +
> +	target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> +}
> +
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  				      struct arm_smmu_ste *dst)
>  {
> -	u64 val;
>  	struct arm_smmu_device *smmu = master->smmu;
> -	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> -	struct arm_smmu_s2_cfg *s2_cfg = NULL;
>  	struct arm_smmu_domain *smmu_domain = master->domain;
>  	struct arm_smmu_ste target = {};
>  
> @@ -1484,61 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  
>  	switch (smmu_domain->stage) {
>  	case ARM_SMMU_DOMAIN_S1:
> -		cd_table = &master->cd_table;
> +		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
>  		break;
>  	case ARM_SMMU_DOMAIN_S2:
> -		s2_cfg = &smmu_domain->s2_cfg;
> +		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
>  		break;
>  	case ARM_SMMU_DOMAIN_BYPASS:
>  		arm_smmu_make_bypass_ste(&target);
> -		arm_smmu_write_ste(smmu, sid, dst, &target);
> -		return;
> +		break;
>  	}
> -
> -	/* Nuke the existing STE_0 value, as we're going to rewrite it */
> -	val = STRTAB_STE_0_V;
> -
> -	if (cd_table) {
> -		u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
> -			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
> -
> -		target.data[1] = cpu_to_le64(
> -			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> -			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> -			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> -			 FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
> -			 FIELD_PREP(STRTAB_STE_1_STRW, strw));
> -
> -		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
> -		    !master->stall_enabled)
> -			target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> -
> -		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> -			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> -			FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
> -			FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
> -	}
> -
> -	if (s2_cfg) {
> -		target.data[2] = cpu_to_le64(
> -			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> -			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> -#ifdef __BIG_ENDIAN
> -			 STRTAB_STE_2_S2ENDI |
> -#endif
> -			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
> -			 STRTAB_STE_2_S2R);
> -
> -		target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> -
> -		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
> -	}
> -
> -	if (master->ats_enabled)
> -		target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> -						 STRTAB_STE_1_EATS_TRANS));
> -
> -	target.data[0] = cpu_to_le64(val);
>  	arm_smmu_write_ste(smmu, sid, dst, &target);
>  }
>  
> -- 
> 2.42.0
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-14 15:06     ` Moritz Fischer
@ 2023-11-15 11:52       ` Michael Shavit
  0 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 11:52 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon, Nicolin Chen,
	Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 11:07 PM Moritz Fischer <mdf@kernel.org> wrote:
>
> On Mon, Nov 13, 2023 at 01:53:08PM -0400, Jason Gunthorpe wrote:
> > Instead of passing a naked __le64 * around to represent an STE, wrap it in a
> > "struct arm_smmu_ste" with an array of the correct size. This makes it
> > much clearer which functions will comprise the "STE API".
> >
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Moritz Fischer <mdf@kernel.org>

Reviewed-by: Michael Shavit <mshavit@google.com>

> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 54 ++++++++++-----------
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  7 ++-
> >  2 files changed, 32 insertions(+), 29 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 7445454c2af244..519749d15fbda0 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -1249,7 +1249,7 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
> >  }
> >
> >  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> > -                                   __le64 *dst)
> > +                                   struct arm_smmu_ste *dst)
> >  {
> >       /*
> >        * This is hideously complicated, but we only really care about
> > @@ -1267,7 +1267,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >        * 2. Write everything apart from dword 0, sync, write dword 0, sync
> >        * 3. Update Config, sync
> >        */
> > -     u64 val = le64_to_cpu(dst[0]);
> > +     u64 val = le64_to_cpu(dst->data[0]);
> >       bool ste_live = false;
> >       struct arm_smmu_device *smmu = NULL;
> >       struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> > @@ -1325,10 +1325,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >               else
> >                       val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
> >
> > -             dst[0] = cpu_to_le64(val);
> > -             dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> > +             dst->data[0] = cpu_to_le64(val);
> > +             dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> >                                               STRTAB_STE_1_SHCFG_INCOMING));
> > -             dst[2] = 0; /* Nuke the VMID */
> > +             dst->data[2] = 0; /* Nuke the VMID */
> >               /*
> >                * The SMMU can perform negative caching, so we must sync
> >                * the STE regardless of whether the old value was live.
> > @@ -1343,7 +1343,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >                       STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
> >
> >               BUG_ON(ste_live);
> > -             dst[1] = cpu_to_le64(
> > +             dst->data[1] = cpu_to_le64(
> >                        FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> >                        FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> >                        FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> > @@ -1352,7 +1352,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >
> >               if (smmu->features & ARM_SMMU_FEAT_STALLS &&
> >                   !master->stall_enabled)
> > -                     dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> > +                     dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> >
> >               val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> >                       FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> > @@ -1362,7 +1362,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >
> >       if (s2_cfg) {
> >               BUG_ON(ste_live);
> > -             dst[2] = cpu_to_le64(
> > +             dst->data[2] = cpu_to_le64(
> >                        FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> >                        FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> >  #ifdef __BIG_ENDIAN
> > @@ -1371,18 +1371,18 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >                        STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
> >                        STRTAB_STE_2_S2R);
> >
> > -             dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> > +             dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> >
> >               val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
> >       }
> >
> >       if (master->ats_enabled)
> > -             dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> > +             dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> >                                                STRTAB_STE_1_EATS_TRANS));
> >
> >       arm_smmu_sync_ste_for_sid(smmu, sid);
> >       /* See comment in arm_smmu_write_ctx_desc() */
> > -     WRITE_ONCE(dst[0], cpu_to_le64(val));
> > +     WRITE_ONCE(dst->data[0], cpu_to_le64(val));
> >       arm_smmu_sync_ste_for_sid(smmu, sid);
> >
> >       /* It's likely that we'll want to use the new STE soon */
> > @@ -1390,7 +1390,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >               arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
> >  }
> >
> > -static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
> > +static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
> > +                                   unsigned int nent, bool force)
> >  {
> >       unsigned int i;
> >       u64 val = STRTAB_STE_0_V;
> > @@ -1401,11 +1402,11 @@ static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool fo
> >               val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
> >
> >       for (i = 0; i < nent; ++i) {
> > -             strtab[0] = cpu_to_le64(val);
> > -             strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> > -                                                STRTAB_STE_1_SHCFG_INCOMING));
> > -             strtab[2] = 0;
> > -             strtab += STRTAB_STE_DWORDS;
> > +             strtab->data[0] = cpu_to_le64(val);
> > +             strtab->data[1] = cpu_to_le64(FIELD_PREP(
> > +                     STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
> > +             strtab->data[2] = 0;
> > +             strtab++;
> >       }
> >  }
> >
> > @@ -2209,26 +2210,22 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
> >       return 0;
> >  }
> >
> > -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> > +static struct arm_smmu_ste *
> > +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> >  {
> > -     __le64 *step;
> >       struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> >
> >       if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> > -             struct arm_smmu_strtab_l1_desc *l1_desc;
> >               int idx;
> >
> >               /* Two-level walk */
> >               idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> > -             l1_desc = &cfg->l1_desc[idx];
> > -             idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
> > -             step = &l1_desc->l2ptr[idx];
> > +             return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
> >       } else {
> >               /* Simple linear lookup */
> > -             step = &cfg->strtab[sid * STRTAB_STE_DWORDS];
> > +             return (struct arm_smmu_ste *)&cfg
> > +                            ->strtab[sid * STRTAB_STE_DWORDS];
> >       }
> > -
> > -     return step;
> >  }
> >
> >  static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
> > @@ -2238,7 +2235,8 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
> >
> >       for (i = 0; i < master->num_streams; ++i) {
> >               u32 sid = master->streams[i].id;
> > -             __le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
> > +             struct arm_smmu_ste *step =
> > +                     arm_smmu_get_step_for_sid(smmu, sid);
> >
> >               /* Bridged PCI devices may end up with duplicated IDs */
> >               for (j = 0; j < i; j++)
> > @@ -3769,7 +3767,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
> >       iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
> >
> >       list_for_each_entry(e, &rmr_list, list) {
> > -             __le64 *step;
> > +             struct arm_smmu_ste *step;
> >               struct iommu_iort_rmr_data *rmr;
> >               int ret, i;
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 961205ba86d25d..03f9e526cbd92f 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -206,6 +206,11 @@
> >  #define STRTAB_L1_DESC_L2PTR_MASK    GENMASK_ULL(51, 6)
> >
> >  #define STRTAB_STE_DWORDS            8
> > +
> > +struct arm_smmu_ste {
> > +     __le64 data[STRTAB_STE_DWORDS];
> > +};
> > +

nit: This looks out of place at this location compared to the rest of
the file. All of the other structs are defined after this large block
of #defines.

> >  #define STRTAB_STE_0_V                       (1UL << 0)
> >  #define STRTAB_STE_0_CFG             GENMASK_ULL(3, 1)
> >  #define STRTAB_STE_0_CFG_ABORT               0
> > @@ -571,7 +576,7 @@ struct arm_smmu_priq {
> >  struct arm_smmu_strtab_l1_desc {
> >       u8                              span;
> >
> > -     __le64                          *l2ptr;
> > +     struct arm_smmu_ste             *l2ptr;
> >       dma_addr_t                      l2ptr_dma;
> >  };
> >
> > --
> > 2.42.0
> >

^ permalink raw reply	[flat|nested] 158+ messages in thread
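
The payoff of the new type is that the compiler now enforces the STE size at
every call site. A minimal sketch of the before/after (write_ste_old() and
write_ste_new() are hypothetical stand-ins for the real callers; only the
declarations matter here):

#include <linux/types.h>

#define STRTAB_STE_DWORDS	8

/* Before: any __le64 pointer is accepted, the 8-dword width is implicit,
 * and stepping to the next entry means remembering "+= STRTAB_STE_DWORDS". */
static void write_ste_old(__le64 *dst);

/* After: callers must hand over a whole entry, and "ste++" advances by a
 * full STE rather than by a single dword. */
struct arm_smmu_ste {
	__le64 data[STRTAB_STE_DWORDS];
};

static void write_ste_new(struct arm_smmu_ste *dst);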

* Re: [PATCH v2 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  2023-11-14 15:17     ` Moritz Fischer
@ 2023-11-15 11:55       ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 11:55 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon, Nicolin Chen,
	Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 11:17 PM Moritz Fischer <mdf@kernel.org> wrote:
>
> On Mon, Nov 13, 2023 at 01:53:09PM -0400, Jason Gunthorpe wrote:
> > The only caller is arm_smmu_install_ste_for_dev() which never has a NULL
> > master. Remove the confusing if.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++-------
> >  1 file changed, 2 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 519749d15fbda0..9117e769a965e1 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -1269,10 +1269,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >        */
> >       u64 val = le64_to_cpu(dst->data[0]);
> >       bool ste_live = false;
> > -     struct arm_smmu_device *smmu = NULL;
> > +     struct arm_smmu_device *smmu = master->smmu;
> >       struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> >       struct arm_smmu_s2_cfg *s2_cfg = NULL;
> > -     struct arm_smmu_domain *smmu_domain = NULL;
> > +     struct arm_smmu_domain *smmu_domain = master->domain;
> >       struct arm_smmu_cmdq_ent prefetch_cmd = {
> >               .opcode         = CMDQ_OP_PREFETCH_CFG,
> >               .prefetch       = {
> > @@ -1280,11 +1280,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> >               },
> >       };
> >
> > -     if (master) {
> > -             smmu_domain = master->domain;
> > -             smmu = master->smmu;
> > -     }
> > -
> >       if (smmu_domain) {
> >               switch (smmu_domain->stage) {
> >               case ARM_SMMU_DOMAIN_S1:
> > --
> > 2.42.0
> >

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 12:17     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 12:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> This allows writing the flow of arm_smmu_write_strtab_ent() around abort
> and bypass domains more naturally.
>
> Note that the core code no longer supplies NULL domains, though there is
> still a flow in the driver that ends up in arm_smmu_write_strtab_ent() with
> NULL. A later patch will remove it.
>
> Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes()
> and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now
> simply invoke arm_smmu_make_bypass_ste() directly.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 +++++++++++----------
>  1 file changed, 47 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 6430a8d89cb471..13cdb959ec8f58 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1443,6 +1443,24 @@ static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
>         }
>  }
>
> +static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target)
> +{
> +       memset(target, 0, sizeof(*target));
> +       target->data[0] = cpu_to_le64(
> +               STRTAB_STE_0_V |
> +               FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT));
> +}
> +
> +static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
> +{
> +       memset(target, 0, sizeof(*target));
> +       target->data[0] = cpu_to_le64(
> +               STRTAB_STE_0_V |
> +               FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS));
> +       target->data[1] = cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
> +}
> +
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>                                       struct arm_smmu_ste *dst)
>  {
> @@ -1453,37 +1471,31 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>         struct arm_smmu_domain *smmu_domain = master->domain;
>         struct arm_smmu_ste target = {};
>
> -       if (smmu_domain) {
> -               switch (smmu_domain->stage) {
> -               case ARM_SMMU_DOMAIN_S1:
> -                       cd_table = &master->cd_table;
> -                       break;
> -               case ARM_SMMU_DOMAIN_S2:
> -                       s2_cfg = &smmu_domain->s2_cfg;
> -                       break;
> -               default:
> -                       break;
> -               }
> +       if (!smmu_domain) {
> +               if (disable_bypass)
> +                       arm_smmu_make_abort_ste(&target);
> +               else
> +                       arm_smmu_make_bypass_ste(&target);
> +               arm_smmu_write_ste(smmu, sid, dst, &target);
> +               return;
> +       }
> +
> +       switch (smmu_domain->stage) {
> +       case ARM_SMMU_DOMAIN_S1:
> +               cd_table = &master->cd_table;
> +               break;
> +       case ARM_SMMU_DOMAIN_S2:
> +               s2_cfg = &smmu_domain->s2_cfg;
> +               break;
> +       case ARM_SMMU_DOMAIN_BYPASS:
> +               arm_smmu_make_bypass_ste(&target);
> +               arm_smmu_write_ste(smmu, sid, dst, &target);
> +               return;
>         }
>
>         /* Nuke the existing STE_0 value, as we're going to rewrite it */
>         val = STRTAB_STE_0_V;
>
> -       /* Bypass/fault */
> -       if (!smmu_domain || !(cd_table || s2_cfg)) {
> -               if (!smmu_domain && disable_bypass)
> -                       val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
> -               else
> -                       val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
> -
> -               target.data[0] = cpu_to_le64(val);
> -               target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> -                                               STRTAB_STE_1_SHCFG_INCOMING));
> -               target.data[2] = 0; /* Nuke the VMID */
> -               arm_smmu_write_ste(smmu, sid, dst, &target);
> -               return;
> -       }
> -
>         if (cd_table) {
>                 u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
>                         STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
> @@ -1529,21 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  }
>
>  static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
> -                                     unsigned int nent, bool force)
> +                                     unsigned int nent)
>  {
>         unsigned int i;
> -       u64 val = STRTAB_STE_0_V;
> -
> -       if (disable_bypass && !force)
> -               val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
> -       else
> -               val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>
>         for (i = 0; i < nent; ++i) {
> -               strtab->data[0] = cpu_to_le64(val);
> -               strtab->data[1] = cpu_to_le64(FIELD_PREP(
> -                       STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
> -               strtab->data[2] = 0;
> +               if (disable_bypass)
> +                       arm_smmu_make_abort_ste(strtab);
> +               else
> +                       arm_smmu_make_bypass_ste(strtab);
>                 strtab++;
>         }
>  }
> @@ -1571,7 +1577,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
>                 return -ENOMEM;
>         }
>
> -       arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);
> +       arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
>         arm_smmu_write_strtab_l1_desc(strtab, desc);
>         return 0;
>  }
> @@ -3193,7 +3199,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
>         reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
>         cfg->strtab_base_cfg = reg;
>
> -       arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false);
> +       arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
>         return 0;
>  }
>
> @@ -3904,7 +3910,6 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
>         iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
>
>         list_for_each_entry(e, &rmr_list, list) {
> -               struct arm_smmu_ste *step;
>                 struct iommu_iort_rmr_data *rmr;
>                 int ret, i;
>
> @@ -3917,8 +3922,8 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
>                                 continue;
>                         }
>
> -                       step = arm_smmu_get_step_for_sid(smmu, rmr->sids[i]);
> -                       arm_smmu_init_bypass_stes(step, 1, true);
> +                       arm_smmu_make_bypass_ste(
> +                               arm_smmu_get_step_for_sid(smmu, rmr->sids[i]));
>                 }
>         }
>
> --
> 2.42.0
>

^ permalink raw reply	[flat|nested] 158+ messages in thread
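
A design point implicit in the diff: the new makers never read the live
entry, they only fill in a caller-provided value, so every install now has
the same compute-then-write shape. A sketch of that pattern
(install_bypass_or_abort_sketch() is a hypothetical condensation; the
functions it calls are the ones introduced in the quoted patch):

static void install_bypass_or_abort_sketch(struct arm_smmu_device *smmu,
					   u32 sid, struct arm_smmu_ste *dst)
{
	struct arm_smmu_ste target;

	/* Compute the desired entry in a local ... */
	if (disable_bypass)
		arm_smmu_make_abort_ste(&target);
	else
		arm_smmu_make_bypass_ste(&target);

	/* ... then let the write logic worry about safely transitioning
	 * from whatever is currently installed at dst. */
	arm_smmu_write_ste(smmu, sid, dst, &target);
}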

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-15 11:52       ` Michael Shavit
@ 2023-11-15 13:35         ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-15 13:35 UTC (permalink / raw)
  To: Michael Shavit
  Cc: Moritz Fischer, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon, Nicolin Chen,
	Shameerali Kolothum Thodi

On Wed, Nov 15, 2023 at 07:52:17PM +0800, Michael Shavit wrote:

> > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > > index 961205ba86d25d..03f9e526cbd92f 100644
> > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > > @@ -206,6 +206,11 @@
> > >  #define STRTAB_L1_DESC_L2PTR_MASK    GENMASK_ULL(51, 6)
> > >
> > >  #define STRTAB_STE_DWORDS            8
> > > +
> > > +struct arm_smmu_ste {
> > > +     __le64 data[STRTAB_STE_DWORDS];
> > > +};
> > > +
> 
> nit: This looks out of place at this location compared to the rest of
> the file. All of the other structs are defined after this large block
> of #defines.

The struct follows the defines that specify how to parse its content,
which is pretty normal.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread
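
Concretely, the STRTAB_STE_0_* defines sitting next to the struct describe
bitfields inside arm_smmu_ste::data[0], so the two are read together. A
minimal sketch of that pairing, assuming the usual kernel headers
(make_bypass_ste_sketch() is a hypothetical name, though its body mirrors
the maker helpers introduced later in this series):

#include <linux/bitfield.h>
#include <linux/string.h>

static void make_bypass_ste_sketch(struct arm_smmu_ste *ste)
{
	memset(ste, 0, sizeof(*ste));
	/* V=1 and Config=bypass, built from the adjacent defines */
	ste->data[0] = cpu_to_le64(STRTAB_STE_0_V |
				   FIELD_PREP(STRTAB_STE_0_CFG,
					      STRTAB_STE_0_CFG_BYPASS));
}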

* Re: [PATCH v2 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 13:57     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 13:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Logically arm_smmu_init_strtab_linear() is the function that allocates and
> populates the stream table with the initial value of the STEs. After this
> function returns, the stream table should be fully ready.
>
> arm_smmu_rmr_install_bypass_ste() adjusts the initial stream table to force
> any SIDs that the FW says have IOMMU_RESV_DIRECT to use bypass. This
> ensures there is no disruption to the identity mapping during boot.
>
> Put arm_smmu_rmr_install_bypass_ste() into arm_smmu_init_strtab_linear();
> it already executes immediately after arm_smmu_init_strtab_linear().
>
> No functional change intended.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 13cdb959ec8f58..3fc8787db2dbc1 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -86,6 +86,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>         { 0, NULL},
>  };
>
> +static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
> +
>  static void parse_driver_options(struct arm_smmu_device *smmu)
>  {
>         int i = 0;
> @@ -3200,6 +3202,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
>         cfg->strtab_base_cfg = reg;
>
>         arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
> +
> +       /* Check for RMRs and install bypass STEs if any */
> +       arm_smmu_rmr_install_bypass_ste(smmu);
>         return 0;
>  }
>
> @@ -4013,9 +4018,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>         /* Record our private device structure */
>         platform_set_drvdata(pdev, smmu);
>
> -       /* Check for RMRs and install bypass STEs if any */
> -       arm_smmu_rmr_install_bypass_ste(smmu);
> -
>         /* Reset the device */
>         ret = arm_smmu_device_reset(smmu, bypass);
>         if (ret)
> --
> 2.42.0
>

^ permalink raw reply	[flat|nested] 158+ messages in thread
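
The practical effect is on probe-time ordering: the stream table, including
the FW-mandated RMR bypass STEs, is now fully populated before the device is
reset and translation is enabled. A rough sketch of the resulting sequence
(probe_order_sketch() is a hypothetical condensation of the probe path;
error handling and unrelated steps are omitted):

static int probe_order_sketch(struct arm_smmu_device *smmu, bool bypass)
{
	/* Builds the stream table; as its final step it forces any SIDs
	 * with IOMMU_RESV_DIRECT regions to bypass. */
	arm_smmu_init_strtab_linear(smmu);

	/* Only now is the SMMU reset and translation enabled, so the
	 * identity mapping is never disrupted during boot. */
	return arm_smmu_device_reset(smmu, bypass);
}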

* Re: [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 14:01     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 14:01 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> This is preparation to move the STE calculation higher up in the call
> chain and remove arm_smmu_write_strtab_ent(). These new functions will be
> called directly from attach_dev.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 115 +++++++++++---------
>  1 file changed, 63 insertions(+), 52 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 3fc8787db2dbc1..1c63fdebbda9d4 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1463,13 +1463,70 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
>                 FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
>  }
>
> +static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> +                                     struct arm_smmu_master *master,
> +                                     struct arm_smmu_ctx_desc_cfg *cd_table)
> +{
> +       struct arm_smmu_device *smmu = master->smmu;
> +
> +       memset(target, 0, sizeof(*target));
> +       target->data[0] = cpu_to_le64(
> +               STRTAB_STE_0_V |
> +               FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> +               FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt) |
> +               (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> +               FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
> +
> +       target->data[1] = cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> +               FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> +               FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> +               FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
> +               ((smmu->features & ARM_SMMU_FEAT_STALLS &&
> +                 !master->stall_enabled) ?
> +                        STRTAB_STE_1_S1STALLD :
> +                        0) |
> +               FIELD_PREP(STRTAB_STE_1_EATS,
> +                          master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
> +               FIELD_PREP(STRTAB_STE_1_STRW,
> +                          (smmu->features & ARM_SMMU_FEAT_E2H) ?
> +                                  STRTAB_STE_1_STRW_EL2 :
> +                                  STRTAB_STE_1_STRW_NSEL1));
> +}
> +
> +static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
> +                                       struct arm_smmu_master *master,
> +                                       struct arm_smmu_domain *smmu_domain)
> +{
> +       struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
> +
> +       memset(target, 0, sizeof(*target));
> +
> +       target->data[0] = cpu_to_le64(
> +               STRTAB_STE_0_V |
> +               FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS));
> +
> +       target->data[1] |= cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_1_EATS,
> +                          master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
> +
> +       target->data[2] = cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> +               FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> +               STRTAB_STE_2_S2AA64 |
> +#ifdef __BIG_ENDIAN
> +               STRTAB_STE_2_S2ENDI |
> +#endif
> +               STRTAB_STE_2_S2PTW |
> +               STRTAB_STE_2_S2R);
> +
> +       target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> +}
> +
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>                                       struct arm_smmu_ste *dst)
>  {
> -       u64 val;
>         struct arm_smmu_device *smmu = master->smmu;
> -       struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> -       struct arm_smmu_s2_cfg *s2_cfg = NULL;
>         struct arm_smmu_domain *smmu_domain = master->domain;
>         struct arm_smmu_ste target = {};
>
> @@ -1484,61 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>
>         switch (smmu_domain->stage) {
>         case ARM_SMMU_DOMAIN_S1:
> -               cd_table = &master->cd_table;
> +               arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
>                 break;
>         case ARM_SMMU_DOMAIN_S2:
> -               s2_cfg = &smmu_domain->s2_cfg;
> +               arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
>                 break;
>         case ARM_SMMU_DOMAIN_BYPASS:
>                 arm_smmu_make_bypass_ste(&target);
> -               arm_smmu_write_ste(smmu, sid, dst, &target);
> -               return;
> +               break;
>         }
> -
> -       /* Nuke the existing STE_0 value, as we're going to rewrite it */
> -       val = STRTAB_STE_0_V;
> -
> -       if (cd_table) {
> -               u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
> -                       STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
> -
> -               target.data[1] = cpu_to_le64(
> -                        FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> -                        FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> -                        FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> -                        FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
> -                        FIELD_PREP(STRTAB_STE_1_STRW, strw));
> -
> -               if (smmu->features & ARM_SMMU_FEAT_STALLS &&
> -                   !master->stall_enabled)
> -                       target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> -
> -               val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> -                       FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> -                       FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
> -                       FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
> -       }
> -
> -       if (s2_cfg) {
> -               target.data[2] = cpu_to_le64(
> -                        FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> -                        FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> -#ifdef __BIG_ENDIAN
> -                        STRTAB_STE_2_S2ENDI |
> -#endif
> -                        STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
> -                        STRTAB_STE_2_S2R);
> -
> -               target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> -
> -               val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
> -       }
> -
> -       if (master->ats_enabled)
> -               target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> -                                                STRTAB_STE_1_EATS_TRANS));
> -
> -       target.data[0] = cpu_to_le64(val);
>         arm_smmu_write_ste(smmu, sid, dst, &target);
>  }
>
> --
> 2.42.0
>

^ permalink raw reply	[flat|nested] 158+ messages in thread
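
One detail worth double-checking in the cdtable maker: the old code set
S1STALLD with a separate if-statement, while the new code folds it into the
single dword-1 expression. The two forms compute the same value; a sketch of
the equivalence (BASE_FIELDS is a hypothetical placeholder for the other
STRTAB_STE_1_* fields, which are unchanged between the two forms):

	/* Old form: build, then conditionally OR in the stall-disable bit */
	u64 dword1 = BASE_FIELDS;

	if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled)
		dword1 |= STRTAB_STE_1_S1STALLD;

	/* New form: one expression, same resulting value */
	u64 dword1_new = BASE_FIELDS |
			 ((smmu->features & ARM_SMMU_FEAT_STALLS &&
			   !master->stall_enabled) ?
				  STRTAB_STE_1_S1STALLD : 0);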

* Re: [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
@ 2023-11-15 14:01     ` Michael Shavit
  0 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 14:01 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> This is preparation to move the STE calculation higher up in to the call
> chain and remove arm_smmu_write_strtab_ent(). These new functions will be
> called directly from attach_dev.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 115 +++++++++++---------
>  1 file changed, 63 insertions(+), 52 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 3fc8787db2dbc1..1c63fdebbda9d4 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1463,13 +1463,70 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target)
>                 FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
>  }
>
> +static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> +                                     struct arm_smmu_master *master,
> +                                     struct arm_smmu_ctx_desc_cfg *cd_table)
> +{
> +       struct arm_smmu_device *smmu = master->smmu;
> +
> +       memset(target, 0, sizeof(*target));
> +       target->data[0] = cpu_to_le64(
> +               STRTAB_STE_0_V |
> +               FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> +               FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt) |
> +               (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> +               FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax));
> +
> +       target->data[1] = cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> +               FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> +               FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> +               FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
> +               ((smmu->features & ARM_SMMU_FEAT_STALLS &&
> +                 !master->stall_enabled) ?
> +                        STRTAB_STE_1_S1STALLD :
> +                        0) |
> +               FIELD_PREP(STRTAB_STE_1_EATS,
> +                          master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) |
> +               FIELD_PREP(STRTAB_STE_1_STRW,
> +                          (smmu->features & ARM_SMMU_FEAT_E2H) ?
> +                                  STRTAB_STE_1_STRW_EL2 :
> +                                  STRTAB_STE_1_STRW_NSEL1));
> +}
> +
> +static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
> +                                       struct arm_smmu_master *master,
> +                                       struct arm_smmu_domain *smmu_domain)
> +{
> +       struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
> +
> +       memset(target, 0, sizeof(*target));
> +
> +       target->data[0] = cpu_to_le64(
> +               STRTAB_STE_0_V |
> +               FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS));
> +
> +       target->data[1] |= cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_1_EATS,
> +                          master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
> +
> +       target->data[2] = cpu_to_le64(
> +               FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> +               FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> +               STRTAB_STE_2_S2AA64 |
> +#ifdef __BIG_ENDIAN
> +               STRTAB_STE_2_S2ENDI |
> +#endif
> +               STRTAB_STE_2_S2PTW |
> +               STRTAB_STE_2_S2R);
> +
> +       target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> +}
> +
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>                                       struct arm_smmu_ste *dst)
>  {
> -       u64 val;
>         struct arm_smmu_device *smmu = master->smmu;
> -       struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> -       struct arm_smmu_s2_cfg *s2_cfg = NULL;
>         struct arm_smmu_domain *smmu_domain = master->domain;
>         struct arm_smmu_ste target = {};
>
> @@ -1484,61 +1541,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>
>         switch (smmu_domain->stage) {
>         case ARM_SMMU_DOMAIN_S1:
> -               cd_table = &master->cd_table;
> +               arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
>                 break;
>         case ARM_SMMU_DOMAIN_S2:
> -               s2_cfg = &smmu_domain->s2_cfg;
> +               arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
>                 break;
>         case ARM_SMMU_DOMAIN_BYPASS:
>                 arm_smmu_make_bypass_ste(&target);
> -               arm_smmu_write_ste(smmu, sid, dst, &target);
> -               return;
> +               break;
>         }
> -
> -       /* Nuke the existing STE_0 value, as we're going to rewrite it */
> -       val = STRTAB_STE_0_V;
> -
> -       if (cd_table) {
> -               u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ?
> -                       STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
> -
> -               target.data[1] = cpu_to_le64(
> -                        FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
> -                        FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> -                        FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> -                        FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) |
> -                        FIELD_PREP(STRTAB_STE_1_STRW, strw));
> -
> -               if (smmu->features & ARM_SMMU_FEAT_STALLS &&
> -                   !master->stall_enabled)
> -                       target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> -
> -               val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
> -                       FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> -                       FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) |
> -                       FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt);
> -       }
> -
> -       if (s2_cfg) {
> -               target.data[2] = cpu_to_le64(
> -                        FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> -                        FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> -#ifdef __BIG_ENDIAN
> -                        STRTAB_STE_2_S2ENDI |
> -#endif
> -                        STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
> -                        STRTAB_STE_2_S2R);
> -
> -               target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> -
> -               val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
> -       }
> -
> -       if (master->ats_enabled)
> -               target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> -                                                STRTAB_STE_1_EATS_TRANS));
> -
> -       target.data[0] = cpu_to_le64(val);
>         arm_smmu_write_ste(smmu, sid, dst, &target);
>  }
>
> --
> 2.42.0
>
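
The shape of this refactor is easier to see in isolation: builders compute a
whole entry into a caller-provided local, and a single installer owns the
update. A minimal userspace sketch of that pattern (the struct layout,
encodings, and names here are illustrative, not the driver's):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Illustrative stand-in for a 4x64-bit stream table entry. */
struct ste {
	uint64_t data[4];
};

/* Builders only compute values; they never touch the live table. */
static void make_bypass_ste(struct ste *target)
{
	memset(target, 0, sizeof(*target));
	target->data[0] = 0x7;	/* placeholder "valid + bypass" encoding */
}

/* The installer is the single place that knows how to update safely.
 * Here it is a plain copy; the real driver does a tear-less sequence. */
static void install_ste(struct ste *live, const struct ste *target)
{
	memcpy(live, target, sizeof(*live));
}

int main(void)
{
	struct ste live = {}, target;

	make_bypass_ste(&target);
	install_ste(&live, &target);
	printf("ste[0] = 0x%llx\n", (unsigned long long)live.data[0]);
	return 0;
}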

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 14:04     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 14:04 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Half the code was living in arm_smmu_domain_finalise_s2(), just move it
> here and take the values directly from the pgtbl_ops instead of storing
> copies.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 ++++++++++++---------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
>  2 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 1c63fdebbda9d4..e80373885d8b19 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1499,6 +1499,11 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
>                                         struct arm_smmu_domain *smmu_domain)
>  {
>         struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
> +       const struct io_pgtable_cfg *pgtbl_cfg =
> +               &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg;
> +       typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr =
> +               &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
> +       u64 vtcr_val;
>
>         memset(target, 0, sizeof(*target));
>
> @@ -1510,9 +1515,16 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
>                 FIELD_PREP(STRTAB_STE_1_EATS,
>                            master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0));
>
> +       vtcr_val = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
> +                  FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
> +                  FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
> +                  FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
> +                  FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
> +                  FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
> +                  FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
>         target->data[2] = cpu_to_le64(
>                 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
> -               FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
> +               FIELD_PREP(STRTAB_STE_2_VTCR, vtcr_val) |
>                 STRTAB_STE_2_S2AA64 |
>  #ifdef __BIG_ENDIAN
>                 STRTAB_STE_2_S2ENDI |
> @@ -1520,7 +1532,8 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
>                 STRTAB_STE_2_S2PTW |
>                 STRTAB_STE_2_S2R);
>
> -       target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> +       target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s2_cfg.vttbr &
> +                                     STRTAB_STE_3_S2TTB_MASK);
>  }
>
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> @@ -2277,7 +2290,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
>         int vmid;
>         struct arm_smmu_device *smmu = smmu_domain->smmu;
>         struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
> -       typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr;
>
>         /* Reserve VMID 0 for stage-2 bypass STEs */
>         vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1,
> @@ -2285,16 +2297,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
>         if (vmid < 0)
>                 return vmid;
>
> -       vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr;
>         cfg->vmid       = (u16)vmid;
> -       cfg->vttbr      = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> -       cfg->vtcr       = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) |
> -                         FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) |
> -                         FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) |
> -                         FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) |
> -                         FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) |
> -                         FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) |
> -                         FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps);
>         return 0;
>  }
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 27ddf1acd12cea..1be0c1151c50c3 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -609,8 +609,6 @@ struct arm_smmu_ctx_desc_cfg {
>
>  struct arm_smmu_s2_cfg {
>         u16                             vmid;
> -       u64                             vttbr;
> -       u64                             vtcr;
>  };
>
>  struct arm_smmu_strtab_cfg {
> --
> 2.42.0
>
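
For readers following along, FIELD_PREP() just shifts a value into the bit
positions selected by a mask. A small userspace approximation of the VTCR
packing above, assuming made-up masks rather than the real
STRTAB_STE_2_VTCR_* layout:

#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's FIELD_PREP(): shift the value to the
 * mask's lowest set bit. Assumes a non-zero mask with contiguous bits. */
#define FIELD_PREP(mask, val) \
	(((uint64_t)(val) << __builtin_ctzll(mask)) & (mask))

/* Illustrative masks only -- not the SMMUv3 register layout. */
#define VTCR_T0SZ	0x000000000000003fULL
#define VTCR_SL0	0x00000000000000c0ULL
#define VTCR_TG0	0x0000000000000300ULL

int main(void)
{
	uint64_t vtcr = FIELD_PREP(VTCR_T0SZ, 16) |
			FIELD_PREP(VTCR_SL0, 1) |
			FIELD_PREP(VTCR_TG0, 2);

	printf("vtcr = 0x%llx\n", (unsigned long long)vtcr);
	return 0;
}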

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 14:12     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 14:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> The BTM support wants to be able to change the ASID of any smmu_domain.
> When it goes to do this it holds the arm_smmu_asid_lock and iterates over
> the target domain's devices list.
>
> During attach of an S1 domain we must ensure that the devices list and
> CD are in sync, otherwise we could miss CD updates or a parallel CD update
> could push an out of date CD.
>
> This is pretty complicated, and works today because arm_smmu_detach_dev()
> removes the CD table from the STE before working on the CD entries.
>
> The next patch will allow the CD table to remain in the STE so solve this
> racy by holding the lock for a longer period. The lock covers both of the
> changes to the device list and the CD table entries.
>
> Move arm_smmu_detach_dev() until after we have initialized the domain so
> the lock can be held for less time.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index e80373885d8b19..b11dc03ee16880 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2560,8 +2560,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                 return -EBUSY;
>         }
>
> -       arm_smmu_detach_dev(master);
> -
>         mutex_lock(&smmu_domain->init_mutex);
>
>         if (!smmu_domain->smmu) {
> @@ -2576,6 +2574,16 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>         if (ret)
>                 return ret;
>
> +       /*
> +        * Prevent arm_smmu_share_asid() from trying to change the ASID
> +        * of either the old or new domain while we are working on it.
> +        * This allows the STE and the smmu_domain->devices list to
> +        * be inconsistent during this routine.
> +        */
> +       mutex_lock(&arm_smmu_asid_lock);
> +
> +       arm_smmu_detach_dev(master);
> +
>         master->domain = smmu_domain;
>
>         /*
> @@ -2601,13 +2609,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                         }
>                 }
>
> -               /*
> -                * Prevent SVA from concurrently modifying the CD or writing to
> -                * the CD entry
> -                */
> -               mutex_lock(&arm_smmu_asid_lock);
>                 ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> -               mutex_unlock(&arm_smmu_asid_lock);
>                 if (ret) {
>                         master->domain = NULL;
>                         goto out_list_del;
> @@ -2617,13 +2619,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>         arm_smmu_install_ste_for_dev(master);
>
>         arm_smmu_enable_ats(master);
> -       return 0;
> +       goto out_unlock;
>
>  out_list_del:
>         spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>         list_del(&master->domain_head);
>         spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>
> +out_unlock:
> +       mutex_unlock(&arm_smmu_asid_lock);
>         return ret;
>  }
>
> --
> 2.42.0
>
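
The rule being applied here is generic: if a reader takes a lock and expects
a list and the data it guards to agree, every writer must update both inside
one critical section. A standalone pthread sketch of that invariant (an
analogy, not driver code; build with -lpthread):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int in_list;	/* stands in for smmu_domain->devices membership */
static int table_val;	/* stands in for the CD table entry */

/* Writer: keeping both updates inside one critical section means a reader
 * can never observe the list and the table disagreeing. */
static void *writer(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	in_list = 1;
	table_val = 42;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Reader: models arm_smmu_share_asid() walking the devices list and acting
 * on each entry's table state. With the single critical section above,
 * this message can never print. */
static void *reader(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	if (in_list && table_val != 42)
		printf("inconsistent state observed!\n");
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, writer, NULL);
	pthread_create(&b, NULL, reader, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}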

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 14:16     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 14:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Currently arm_smmu_install_ste_for_dev() iterates over every SID and
> computes from scratch an identical STE. Every SID should have the same STE
> contents. Turn this inside out so that the STE is supplied by the caller
> and arm_smmu_install_ste_for_dev() simply installs it to every SID.
>
> This is possible now that the STE generation no longer dictates the
> sequence that should be used to program it.
>
> This allows splitting the STE calculation up according to the call site,
> which following patches will make use of, and removes the confusing NULL
> domain special case that only supported arm_smmu_detach_dev().
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 58 ++++++++-------------
>  1 file changed, 22 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index b11dc03ee16880..4b157c2ddf9a80 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1536,36 +1536,6 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
>                                       STRTAB_STE_3_S2TTB_MASK);
>  }
>
> -static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> -                                     struct arm_smmu_ste *dst)
> -{
> -       struct arm_smmu_device *smmu = master->smmu;
> -       struct arm_smmu_domain *smmu_domain = master->domain;
> -       struct arm_smmu_ste target = {};
> -
> -       if (!smmu_domain) {
> -               if (disable_bypass)
> -                       arm_smmu_make_abort_ste(&target);
> -               else
> -                       arm_smmu_make_bypass_ste(&target);
> -               arm_smmu_write_ste(smmu, sid, dst, &target);
> -               return;
> -       }
> -
> -       switch (smmu_domain->stage) {
> -       case ARM_SMMU_DOMAIN_S1:
> -               arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
> -               break;
> -       case ARM_SMMU_DOMAIN_S2:
> -               arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
> -               break;
> -       case ARM_SMMU_DOMAIN_BYPASS:
> -               arm_smmu_make_bypass_ste(&target);
> -               break;
> -       }
> -       arm_smmu_write_ste(smmu, sid, dst, &target);
> -}
> -
>  static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
>                                       unsigned int nent)
>  {
> @@ -2387,7 +2357,8 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
>         }
>  }
>
> -static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
> +static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master,
> +                                        const struct arm_smmu_ste *target)
>  {
>         int i, j;
>         struct arm_smmu_device *smmu = master->smmu;
> @@ -2404,7 +2375,7 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
>                 if (j < i)
>                         continue;
>
> -               arm_smmu_write_strtab_ent(master, sid, step);
> +               arm_smmu_write_ste(smmu, sid, step, target);
>         }
>  }
>
> @@ -2511,6 +2482,7 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
>  static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>  {
>         unsigned long flags;
> +       struct arm_smmu_ste target;
>         struct arm_smmu_domain *smmu_domain = master->domain;
>
>         if (!smmu_domain)
> @@ -2524,7 +2496,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>
>         master->domain = NULL;
>         master->ats_enabled = false;
> -       arm_smmu_install_ste_for_dev(master);
> +       if (disable_bypass)
> +               arm_smmu_make_abort_ste(&target);
> +       else
> +               arm_smmu_make_bypass_ste(&target);
> +       arm_smmu_install_ste_for_dev(master, &target);
>         /*
>          * Clearing the CD entry isn't strictly required to detach the domain
>          * since the table is uninstalled anyway, but it helps avoid confusion
> @@ -2539,6 +2515,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  {
>         int ret = 0;
>         unsigned long flags;
> +       struct arm_smmu_ste target;
>         struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>         struct arm_smmu_device *smmu;
>         struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> @@ -2600,7 +2577,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>         list_add(&master->domain_head, &smmu_domain->devices);
>         spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>
> -       if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> +       switch (smmu_domain->stage) {
> +       case ARM_SMMU_DOMAIN_S1:
>                 if (!master->cd_table.cdtab) {
>                         ret = arm_smmu_alloc_cd_tables(master);
>                         if (ret) {
> @@ -2614,9 +2592,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                         master->domain = NULL;
>                         goto out_list_del;
>                 }
> -       }
>
> -       arm_smmu_install_ste_for_dev(master);
> +               arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
> +               break;
> +       case ARM_SMMU_DOMAIN_S2:
> +               arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
> +               break;
> +       case ARM_SMMU_DOMAIN_BYPASS:
> +               arm_smmu_make_bypass_ste(&target);
> +               break;
> +       }
> +       arm_smmu_install_ste_for_dev(master, &target);
>
>         arm_smmu_enable_ats(master);
>         goto out_unlock;
> --
> 2.42.0
>
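
The resulting call pattern is: compute one target STE, then replay it across
the master's stream IDs. A simplified sketch (the real code also deduplicates
SIDs and locates the L2 step for each one):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

struct ste {
	uint64_t data[4];
};

#define NUM_SIDS 3
static struct ste strtab[NUM_SIDS];	/* stand-in stream table */

/* One caller-computed value, installed verbatim for every SID. */
static void install_ste_for_dev(const uint32_t *sids, int n,
				const struct ste *target)
{
	for (int i = 0; i < n; i++)
		memcpy(&strtab[sids[i]], target, sizeof(*target));
}

int main(void)
{
	uint32_t sids[] = { 0, 2 };
	struct ste target = { .data = { 0x1 } };	/* placeholder value */

	install_ste_for_dev(sids, 2, &target);
	printf("sid0=0x%llx sid1=0x%llx sid2=0x%llx\n",
	       (unsigned long long)strtab[0].data[0],
	       (unsigned long long)strtab[1].data[0],
	       (unsigned long long)strtab[2].data[0]);
	return 0;
}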

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 15:15     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 15:15 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> This was needed because the STE code required the STE to be in
> ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the STE code
> can automatically handle all transitions we can remove this step
> from the attach_dev flow.
>
> A few small bugs exist because of this:
>
> 1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
>    then there will be a moment where the STE points at BYPASS. Since
>    this can be done by VFIO/IOMMUFD it is a small security race.
>
> 2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
>    regions will temporarily become BLOCKED. We'd like drivers to
>    work in a way that allows IOMMU_RESV_DIRECT to be continuously
>    functional during these transitions.
>
> Make arm_smmu_release_device() put the STE back to the correct
> ABORT/BYPASS setting. Fix a bug where an IOMMU_RESV_DIRECT region was ignored on
> this path.
>
> Notice this subtly depends on the prior arm_smmu_asid_lock change as the
> STE must be put to non-paging before removing the device from the linked
> list to avoid races with arm_smmu_share_asid().

I'm a little confused by this comment. Is this suggesting that
arm_smmu_detach_dev() had a race condition before the arm_smmu_asid_lock
changes? It deletes the list entry before deactivating the STE that uses
the domain, and without grabbing the asid_lock, which leaves a gap where
the ASID might be re-acquired by an SVA domain while an STE with that
ASID is still live on this device. If so, wouldn't that belong on the
asid_lock patch instead?

>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 4b157c2ddf9a80..f70862806211de 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2482,7 +2482,6 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
>  static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>  {
>         unsigned long flags;
> -       struct arm_smmu_ste target;
>         struct arm_smmu_domain *smmu_domain = master->domain;
>
>         if (!smmu_domain)
> @@ -2496,11 +2495,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>
>         master->domain = NULL;
>         master->ats_enabled = false;
> -       if (disable_bypass)
> -               arm_smmu_make_abort_ste(&target);
> -       else
> -               arm_smmu_make_bypass_ste(&target);
> -       arm_smmu_install_ste_for_dev(master, &target);
>         /*
>          * Clearing the CD entry isn't strictly required to detach the domain
>          * since the table is uninstalled anyway, but it helps avoid confusion
> @@ -2852,9 +2846,18 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>  static void arm_smmu_release_device(struct device *dev)
>  {
>         struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +       struct arm_smmu_ste target;
>
>         if (WARN_ON(arm_smmu_master_sva_enabled(master)))
>                 iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
> +
> +       /* Put the STE back to what arm_smmu_init_strtab() sets */

Hmmmm, it seems like checking iommu->require_direct may put STEs in
bypass in scenarios where arm_smmu_init_strtab() wouldn't have.
arm_smmu_init_strtab() calls iort_get_rmr_sids() to pick streams to
put into bypass, but IIUC iommu->require_direct also applies to
dts-based reserved-memory regions, not just iort.

I'm not very familiar with the history behind disable_bypass; why is
putting an entire stream into bypass the correct behavior if a
reserved-memory (which may be for a small finite region) exists?
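
For what it's worth, the choice made in the hunk quoted below reduces to a
two-input decision table; restated as a compilable sketch with hypothetical
names:

#include <stdbool.h>
#include <stdio.h>

/* Restating the release-path STE choice as a function (illustrative). */
static const char *release_ste(bool disable_bypass, bool require_direct)
{
	if (disable_bypass && !require_direct)
		return "abort";
	return "bypass";
}

int main(void)
{
	for (int db = 0; db <= 1; db++)
		for (int rd = 0; rd <= 1; rd++)
			printf("disable_bypass=%d require_direct=%d -> %s\n",
			       db, rd, release_ste(db, rd));
	return 0;
}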

> +       if (disable_bypass && !dev->iommu->require_direct)
> +               arm_smmu_make_abort_ste(&target);
> +       else
> +               arm_smmu_make_bypass_ste(&target);
> +       arm_smmu_install_ste_for_dev(master, &target);
> +
>         arm_smmu_detach_dev(master);
>         arm_smmu_disable_pasid(master);
>         arm_smmu_remove_master(master);
> --
> 2.42.0
>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 15:32     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 15:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Get closer to the IOMMU API ideal that changes between domains can be
> hitless. The ordering for the CD table entry is not entirely clean from
> this perspective.
>
> When switching away from a STE with a CD table programmed in it we should
> write the new STE first, then clear any old data in the CD entry.
>
> If we are programming a CD table for the first time to a STE then the CD
> entry should be programmed before the STE is loaded.
>
> If we are replacing a CD table entry when the STE already points at the CD
> entry then we just need to do the make/break sequence.
>
> Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
> properly. The only other caller is arm_smmu_release_device() and it is
> going to free the cdtable anyhow, so it doesn't matter what is in it.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>

This patch might be a better fit before the previous one. When going
from S1 to S2 or bypass:
- Before both patches, attach_dev() installs a NULL STE, then clears the
  now unused CDE, then installs a new STE.
- After the previous patch, attach_dev() clears the *still used* CDE, and
  then replaces the STE.
- After this patch, attach_dev() replaces the STE, and then clears the CDE.

Reordering the two patches removes the scenario where we could hit a
NULL-ed CDE.
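
Spelled out, the ordering rule from the commit message looks like the sketch
below (helper names are invented; the real driver open-codes this per stage,
and the CD-table-to-CD-table case additionally needs the make/break
sequence):

#include <stdbool.h>
#include <stdio.h>

static void write_cd_entry(void)  { printf("write CD entry\n"); }
static void clear_cd_entry(void)  { printf("clear CD entry\n"); }
static void install_ste(void)     { printf("install STE\n"); }

/* new_uses_cdtab: the incoming config points the STE at the CD table.
 * old_used_cdtab: the outgoing config did. */
static void change_translation(bool new_uses_cdtab, bool old_used_cdtab)
{
	if (new_uses_cdtab) {
		write_cd_entry();	/* CD valid before the STE references it */
		install_ste();
	} else {
		install_ste();		/* point the STE away first ... */
		if (old_used_cdtab)
			clear_cd_entry();	/* ... then scrub the stale CD */
	}
}

int main(void)
{
	printf("-- S2/bypass -> S1 --\n");
	change_translation(true, false);
	printf("-- S1 -> S2/bypass --\n");
	change_translation(false, true);
	return 0;
}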

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 ++++++++++++++-------
>  1 file changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f70862806211de..eb5dcd357a42b8 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2495,14 +2495,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>
>         master->domain = NULL;
>         master->ats_enabled = false;
> -       /*
> -        * Clearing the CD entry isn't strictly required to detach the domain
> -        * since the table is uninstalled anyway, but it helps avoid confusion
> -        * in the call to arm_smmu_write_ctx_desc on the next attach (which
> -        * expects the entry to be empty).
> -        */
> -       if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab)
> -               arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
>  }
>
>  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> @@ -2579,6 +2571,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                                 master->domain = NULL;
>                                 goto out_list_del;
>                         }
> +               } else {
> +                       /*
> +                        * arm_smmu_write_ctx_desc() relies on the entry being
> +                        * invalid to work, clear any existing entry.
> +                        */
> +                       ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> +                                                     NULL);
> +                       if (ret) {
> +                               master->domain = NULL;
> +                               goto out_list_del;
> +                       }
>                 }
>
>                 ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> @@ -2588,15 +2591,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                 }
>
>                 arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
> +               arm_smmu_install_ste_for_dev(master, &target);
>                 break;
>         case ARM_SMMU_DOMAIN_S2:
>                 arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
> +               arm_smmu_install_ste_for_dev(master, &target);
> +               if (master->cd_table.cdtab)
> +                       arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> +                                                     NULL);
>                 break;
>         case ARM_SMMU_DOMAIN_BYPASS:
>                 arm_smmu_make_bypass_ste(&target);
> +               arm_smmu_install_ste_for_dev(master, &target);
> +               if (master->cd_table.cdtab)
> +                       arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> +                                                     NULL);
>                 break;
>         }
> -       arm_smmu_install_ste_for_dev(master, &target);
>
>         arm_smmu_enable_ats(master);
>         goto out_unlock;
> --
> 2.42.0
>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 15:50     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 15:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Move to the new static global for identity domains. Move all the logic out
> of arm_smmu_attach_dev() into an identity-only function.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 82 +++++++++++++++------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
>  2 files changed, 58 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 23dda64722ea17..d6f68a6187d290 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2174,8 +2174,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
>                 return arm_smmu_sva_domain_alloc();
>
>         if (type != IOMMU_DOMAIN_UNMANAGED &&
> -           type != IOMMU_DOMAIN_DMA &&
> -           type != IOMMU_DOMAIN_IDENTITY)
> +           type != IOMMU_DOMAIN_DMA)
>                 return NULL;
>
>         /*
> @@ -2283,11 +2282,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>         struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>         struct arm_smmu_device *smmu = smmu_domain->smmu;
>
> -       if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> -               smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
> -               return 0;
> -       }
> -
>         /* Restrict the stage to what we can actually support */
>         if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
>                 smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
> @@ -2484,7 +2478,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>         struct arm_smmu_domain *smmu_domain;
>         unsigned long flags;
>
> -       if (!domain)
> +       if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
>                 return;
>
>         smmu_domain = to_smmu_domain(domain);
> @@ -2547,15 +2541,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>
>         arm_smmu_detach_dev(master);
>
> -       /*
> -        * The SMMU does not support enabling ATS with bypass. When the STE is
> -        * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
> -        * Translated transactions are denied as though ATS is disabled for the
> -        * stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> -        * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> -        */
> -       if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS)
> -               master->ats_enabled = arm_smmu_ats_supported(master);
> +       master->ats_enabled = arm_smmu_ats_supported(master);
>
>         spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>         list_add(&master->domain_head, &smmu_domain->devices);
> @@ -2592,13 +2578,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                         arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
>                                                       NULL);
>                 break;
> -       case ARM_SMMU_DOMAIN_BYPASS:
> -               arm_smmu_make_bypass_ste(&target);
> -               arm_smmu_install_ste_for_dev(master, &target);
> -               if (master->cd_table.cdtab)
> -                       arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> -                                                     NULL);
> -               break;
>         }
>
>         arm_smmu_enable_ats(master, smmu_domain);
> @@ -2614,6 +2593,60 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>         return ret;
>  }
>
> +static int arm_smmu_attach_dev_ste(struct device *dev,
> +                                  struct arm_smmu_ste *ste)
> +{
> +       struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
> +       if (arm_smmu_master_sva_enabled(master))
> +               return -EBUSY;
> +
> +       /*
> +        * Do not allow any ASID to be changed while we are working on the STE,
> +        * otherwise we could miss invalidations.
> +        */
> +       mutex_lock(&arm_smmu_asid_lock);
> +
> +       /*
> +        * The SMMU does not support enabling ATS with bypass/abort. When the
> +        * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> +        * and Translated transactions are denied as though ATS is disabled for
> +        * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> +        * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> +        */
> +       arm_smmu_detach_dev(master);
> +
> +       arm_smmu_install_ste_for_dev(master, ste);
> +       mutex_unlock(&arm_smmu_asid_lock);
> +
> +       /*
> +        * This has to be done after removing the master from the
> +        * arm_smmu_domain->devices to avoid races updating the same context
> +        * descriptor from arm_smmu_share_asid().
> +        */
> +       if (master->cd_table.cdtab)
> +               arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
> +       return 0;
> +}
> +
> +static int arm_smmu_attach_dev_identity(struct iommu_domain *domain,
> +                                       struct device *dev)
> +{
> +       struct arm_smmu_ste ste;
> +
> +       arm_smmu_make_bypass_ste(&ste);
> +       return arm_smmu_attach_dev_ste(dev, &ste);
> +}
> +
> +static const struct iommu_domain_ops arm_smmu_identity_ops = {
> +       .attach_dev = arm_smmu_attach_dev_identity,
> +};
> +
> +static struct iommu_domain arm_smmu_identity_domain = {
> +       .type = IOMMU_DOMAIN_IDENTITY,
> +       .ops = &arm_smmu_identity_ops,
> +};
> +
>  static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
>                               phys_addr_t paddr, size_t pgsize, size_t pgcount,
>                               int prot, gfp_t gfp, size_t *mapped)
> @@ -3006,6 +3039,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
>  }
>
>  static struct iommu_ops arm_smmu_ops = {
> +       .identity_domain        = &arm_smmu_identity_domain,
>         .capable                = arm_smmu_capable,
>         .domain_alloc           = arm_smmu_domain_alloc,
>         .probe_device           = arm_smmu_probe_device,
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 21f2f73501019a..154808f96718df 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -712,7 +712,6 @@ struct arm_smmu_master {
>  enum arm_smmu_domain_stage {
>         ARM_SMMU_DOMAIN_S1 = 0,
>         ARM_SMMU_DOMAIN_S2,
> -       ARM_SMMU_DOMAIN_BYPASS,
>  };
>
>  struct arm_smmu_domain {
> --
> 2.42.0
>
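For context, the consumption side of .identity_domain in the core is
roughly the following (an illustrative sketch only, not the exact core
code; the surrounding allocation flow differs):

	/* Illustrative sketch: how the core can use a static identity domain */
	if (type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
		/* Static singleton: no allocation, attach cannot return -ENOMEM */
		return ops->identity_domain;
	return ops->domain_alloc(type);	/* per-driver allocation path */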


* Re: [PATCH v2 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 15:57     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 15:57 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Using the same design as the IDENTITY domain, install an
> STRTAB_STE_0_CFG_ABORT STE.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>

Are there any subtle observable changes hidden here? IIUC the iommu
framework would have previously installed an empty UNMANAGED domain
but will now end up installing an abort STE instead.

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index d6f68a6187d290..48981c2ff7a746 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2647,6 +2647,24 @@ static struct iommu_domain arm_smmu_identity_domain = {
>         .ops = &arm_smmu_identity_ops,
>  };
>
> +static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
> +                                       struct device *dev)
> +{
> +       struct arm_smmu_ste ste;
> +
> +       arm_smmu_make_abort_ste(&ste);
> +       return arm_smmu_attach_dev_ste(dev, &ste);
> +}
> +
> +static const struct iommu_domain_ops arm_smmu_blocked_ops = {
> +       .attach_dev = arm_smmu_attach_dev_blocked,
> +};
> +
> +static struct iommu_domain arm_smmu_blocked_domain = {
> +       .type = IOMMU_DOMAIN_BLOCKED,
> +       .ops = &arm_smmu_blocked_ops,
> +};
> +
>  static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
>                               phys_addr_t paddr, size_t pgsize, size_t pgcount,
>                               int prot, gfp_t gfp, size_t *mapped)
> @@ -3040,6 +3058,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
>
>  static struct iommu_ops arm_smmu_ops = {
>         .identity_domain        = &arm_smmu_identity_domain,
> +       .blocked_domain         = &arm_smmu_blocked_domain,
>         .capable                = arm_smmu_capable,
>         .domain_alloc           = arm_smmu_domain_alloc,
>         .probe_device           = arm_smmu_probe_device,
> --
> 2.42.0
>


* Re: [PATCH v2 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-15 16:02     ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-15 16:02 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> Instead of putting container_of() casts in the internals, use the proper
> type in this call chain. This makes it easier to check that the two global
> static domains are not leaking into call chains they should not.
>
> Passing the smmu avoids the only caller from having to set it and unset it
> in the error path.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 34 ++++++++++-----------
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 331568e086c70a..50c26792391b56 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -87,6 +87,8 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>  };
>
>  static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu);
> +static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> +                                   struct arm_smmu_device *smmu);
>
>  static void parse_driver_options(struct arm_smmu_device *smmu)
>  {
> @@ -2216,12 +2218,12 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
>         kfree(smmu_domain);
>  }
>
> -static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
> +static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu,
> +                                      struct arm_smmu_domain *smmu_domain,
>                                        struct io_pgtable_cfg *pgtbl_cfg)
>  {
>         int ret;
>         u32 asid;
> -       struct arm_smmu_device *smmu = smmu_domain->smmu;
>         struct arm_smmu_ctx_desc *cd = &smmu_domain->cd;
>         typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr;
>
> @@ -2253,11 +2255,11 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
>         return ret;
>  }
>
> -static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
> +static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu,
> +                                      struct arm_smmu_domain *smmu_domain,
>                                        struct io_pgtable_cfg *pgtbl_cfg)
>  {
>         int vmid;
> -       struct arm_smmu_device *smmu = smmu_domain->smmu;
>         struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg;
>
>         /* Reserve VMID 0 for stage-2 bypass STEs */
> @@ -2270,17 +2272,17 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain,
>         return 0;
>  }
>
> -static int arm_smmu_domain_finalise(struct iommu_domain *domain)
> +static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> +                                   struct arm_smmu_device *smmu)
>  {
>         int ret;
>         unsigned long ias, oas;
>         enum io_pgtable_fmt fmt;
>         struct io_pgtable_cfg pgtbl_cfg;
>         struct io_pgtable_ops *pgtbl_ops;
> -       int (*finalise_stage_fn)(struct arm_smmu_domain *,
> -                                struct io_pgtable_cfg *);
> -       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> -       struct arm_smmu_device *smmu = smmu_domain->smmu;
> +       int (*finalise_stage_fn)(struct arm_smmu_device *smmu,
> +                                struct arm_smmu_domain *smmu_domain,
> +                                struct io_pgtable_cfg *pgtbl_cfg);
>
>         /* Restrict the stage to what we can actually support */
>         if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
> @@ -2319,17 +2321,18 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>         if (!pgtbl_ops)
>                 return -ENOMEM;
>
> -       domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> -       domain->geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
> -       domain->geometry.force_aperture = true;
> +       smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +       smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1;
> +       smmu_domain->domain.geometry.force_aperture = true;
>
> -       ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
> +       ret = finalise_stage_fn(smmu, smmu_domain, &pgtbl_cfg);
>         if (ret < 0) {
>                 free_io_pgtable_ops(pgtbl_ops);
>                 return ret;
>         }
>
>         smmu_domain->pgtbl_ops = pgtbl_ops;
> +       smmu_domain->smmu = smmu;
>         return 0;
>  }
>
> @@ -2520,10 +2523,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>         mutex_lock(&smmu_domain->init_mutex);
>
>         if (!smmu_domain->smmu) {
> -               smmu_domain->smmu = smmu;
> -               ret = arm_smmu_domain_finalise(domain);
> -               if (ret)
> -                       smmu_domain->smmu = NULL;
> +               ret = arm_smmu_domain_finalise(smmu_domain, smmu);
>         } else if (smmu_domain->smmu != smmu)
>                 ret = -EINVAL;
>
> --
> 2.42.0
>


* Re: [PATCH v2 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain
  2023-11-15 15:57     ` Michael Shavit
@ 2023-11-16 15:44       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-16 15:44 UTC (permalink / raw)
  To: Michael Shavit
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Wed, Nov 15, 2023 at 11:57:39PM +0800, Michael Shavit wrote:
> On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > Using the same design as the IDENTITY domain, install an
> > STRTAB_STE_0_CFG_ABORT STE.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Michael Shavit <mshavit@google.com>
> 
> Are there any subtle observable changes hidden here? IIUC the iommu
> framework would have previously installed an empty UNMANAGED domain
> but will now end up installing an abort STE instead.

I don't think meaningfully. The empty UNMANAGED domain is a fallback
for drivers that don't natively support BLOCKED. SMMUv3 already uses
ABORT for this use case in the pre-boot and unattached devices flows,
so if there is something wrong we have an existing issue already.

I suspect there are slight differences in what gets logged when a DMA
hits ABORT vs UNMANAGED, but nothing material.
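To make that concrete, the fallback being discussed is roughly this on
the core side (an illustrative sketch; the helper name is made up and
the real code in drivers/iommu/iommu.c is structured differently):

	/* Illustrative sketch of the core's blocking-domain selection */
	static struct iommu_domain *
	group_get_blocking_domain(struct device *dev, const struct iommu_ops *ops)
	{
		if (ops->blocked_domain)
			/* Native BLOCKED: static, attach is expected not to fail */
			return ops->blocked_domain;
		/*
		 * Fallback: an empty UNMANAGED domain. With no mappings
		 * installed every DMA faults, which is how it blocks.
		 */
		return iommu_domain_alloc(dev->bus);
	}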

Thanks,
Jason


* Re: [PATCH v2 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
  2023-11-15 15:15     ` Michael Shavit
@ 2023-11-16 16:28       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-16 16:28 UTC (permalink / raw)
  To: Michael Shavit
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Wed, Nov 15, 2023 at 11:15:23PM +0800, Michael Shavit wrote:
> On Tue, Nov 14, 2023 at 1:53 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > This was needed because the STE code required the STE to be in
> > ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the STE code
> > can automatically handle all transitions we can remove this step
> > from the attach_dev flow.
> >
> > A few small bugs exist because of this:
> >
> > 1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
> >    then there will be a moment where the STE points at BYPASS. Since
> >    this can be done by VFIO/IOMMUFD it is a small security race.
> >
> > 2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
> >    regions will temporarily become BLOCKED. We'd like drivers to
> >    work in a way that allows IOMMU_RESV_DIRECT to be continuously
> >    functional during these transitions.
> >
> > Make arm_smmu_release_device() put the STE back to the correct
> > ABORT/BYPASS setting. Fix a bug where a IOMMU_RESV_DIRECT was ignored on
> > this path.
> >
> > Notice this subtly depends on the prior arm_smmu_asid_lock change as the
> > STE must be put to non-paging before removing the device for the linked
> > list to avoid races with arm_smmu_share_asid().
> 
> I'm a little confused by this comment. Is this suggesting that
> arm_smmu_detach_dev had a race condition before the arm_smmu_asid_lock
> changes, since it deletes the list entry before deactivating the STE
> that uses the domain and without grabbing the asid_lock, thus allowing
> a gap where the ASID might be re-acquired by an SVA domain while an
> STE with that ASID is still live on this device? Wouldn't that belong
> on the asid_lock patch instead if so?

I wasn't intending to say there is an existing bug; this was more to
point out why it was organized like this, and why it is OK to remove
the detach manipulation of the STE considering races with share_asid.

However, I agree that the code in rc1 is troubled and fixed in the
prior patch:

	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
	list_del(&master->domain_head);
	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

^^^^ Prevents arm_smmu_update_ctx_desc_devices() from storing to the CD.
     However the installed STE still references that CD and its ASID

	master->domain = NULL;
	master->ats_enabled = false;
	arm_smmu_install_ste_for_dev(master);

^^^^ Now the STE is gone, so the CD becomes unreferenced

	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab)
		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);

^^^^ Now the CD is non-valid

I was primarily concerned with corrupting the CD, ie that share_asid
would race and un-clear the write_ctx_desc(). That is prevented by the
ordering above.

However, I agree the above is still problematic because there is a
short time window where the ASID can be installed in two CDs with two
different translations. I suppose there is a security issue where this
could corrupt the IOTLB.

This is all fixed in this series too by having more robust locking. So
this does deserve a note in the commit message for the earlier patch
about this issue.
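For reference, arm_smmu_attach_dev_ste() from patch 15, quoted earlier
in this thread, sequences it like this (condensed from the patch;
annotations added):

	mutex_lock(&arm_smmu_asid_lock);	   /* holds off arm_smmu_share_asid() */
	arm_smmu_detach_dev(master);		   /* drops master from devices list */
	arm_smmu_install_ste_for_dev(master, ste); /* STE stops referencing the CD */
	mutex_unlock(&arm_smmu_asid_lock);
	/* clear the CD only once nothing references it */
	if (master->cd_table.cdtab)
		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);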

> > @@ -2852,9 +2846,18 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
> >  static void arm_smmu_release_device(struct device *dev)
> >  {
> >         struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > +       struct arm_smmu_ste target;
> >
> >         if (WARN_ON(arm_smmu_master_sva_enabled(master)))
> >                 iopf_queue_remove_device(master->smmu->evtq.iopf, dev);
> > +
> > +       /* Put the STE back to what arm_smmu_init_strtab() sets */
> 
> Hmmmm, it seems like checking iommu->require_direct may put STEs in
> bypass in scenarios where arm_smmu_init_strtab() wouldn't have.
> arm_smmu_init_strtab is calling iort_get_rmr_sids to pick streams to
> put into bypass, but IIUC iommu->require_direct also applies to
> dts-based reserved-memory regions, not just iort.

Indeed, that actually looks like a little bug, as the DT should
technically have the same behavior as the IORT... I'm going to ignore it
:)

> I'm not very familiar with the history behind disable_bypass; why is
> putting an entire stream into bypass the correct behavior if a
> reserved-memory (which may be for a small finite region) exists?

This specific reserved memory region is requesting a 1:1 translation
for a chunk of IOVA. This translation is being used by some agent
outside Linux's knowledge and the desire is for the translation to
always be in effect.

So, if we put the STE to ABORT then the translation will stop working
with unknown side effects.

This is also why we install the translation in the DMA domain and
block use of VFIO if these are set - to ensure the 1:1 translation is
always there.
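In code terms the release path ends up choosing the STE like this (a
condensed sketch; the exact condition is what patch 11 implements):

	/* Sketch: restore the STE to its pre-probe state on release */
	struct arm_smmu_ste target;

	if (dev->iommu->require_direct || !disable_bypass)
		arm_smmu_make_bypass_ste(&target);  /* keep 1:1 regions alive */
	else
		arm_smmu_make_abort_ste(&target);
	arm_smmu_install_ste_for_dev(master, &target);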

Thanks,
Jason


* Re: [PATCH v2 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order
  2023-11-15 15:32     ` Michael Shavit
@ 2023-11-16 16:46       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-16 16:46 UTC (permalink / raw)
  To: Michael Shavit
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Wed, Nov 15, 2023 at 11:32:28PM +0800, Michael Shavit wrote:

> > Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
> > properly. The only other caller is arm_smmu_release_device() and it is
> > going to free the cdtable anyhow, so it doesn't matter what is in it.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Michael Shavit <mshavit@google.com>
> 
> This patch might be a better fit before the previous one. When going
> from S1 to S2 or bypass:
> Pre-both patches, attach_dev() installs a NULL STE, then clears the
> now unused CDE, then installs a new STE.
> After the previous patch, attach_dev() clears the *still used* CDE,
> and then replaces the STE.
> After this patch, attach_dev() replaces the STE, and then clears the CDE
> 
> Reordering the two patches removes the scenario where we could hit a
> NULL-ed CDE.

NULLed = non-valid

I see what you mean, but I haven't thought carefully about a different
order so I'd rather leave it..

Regardless of order the two prior patches will have cases that hit
non-valid/abort STE/CDEs, each step removes a few cases.

Thanks,
Jason

* Re: [PATCH v2 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order
  2023-11-16 16:46       ` Jason Gunthorpe
@ 2023-11-17  4:14         ` Michael Shavit
  -1 siblings, 0 replies; 158+ messages in thread
From: Michael Shavit @ 2023-11-17  4:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Nicolin Chen, Shameerali Kolothum Thodi

On Fri, Nov 17, 2023 at 12:47 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Nov 15, 2023 at 11:32:28PM +0800, Michael Shavit wrote:
>
> > > Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
> > > properly. The only other caller is arm_smmu_release_device() and it is
> > > going to free the cdtable anyhow, so it doesn't matter what is in it.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > Reviewed-by: Michael Shavit <mshavit@google.com>
> >
> > This patch might be a better fit before the previous one. When going
> > from S1 to S2 or bypass:
> > Pre-both patches, attach_dev() installs a NULL STE, then clears the
> > now unused CDE, then installs a new STE.
> > After the previous patch, attach_dev() clears the *still used* CDE,
> > and then replaces the STE.
> > After this patch, attach_dev() replaces the STE, and then clears the CDE
> >
> > Reordering the two patches removes the scenario where we could hit a
> > NULL-ed CDE.
>
> NULLed = non-valid
>
> I see what you mean, but I haven't thought carefully about a different
> order so I'd rather leave it..
Ack; it's probably too subtle to matter much anyhow.
>
> Regardless of order the two prior patches will have cases that hit
> non-valid/abort STE/CDEs, each step removes a few cases.
>
> Thanks,
> Jason

* Re: [PATCH v2 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-27 15:41     ` Eric Auger
  -1 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-11-27 15:41 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Hi Jason,
On 11/13/23 18:53, Jason Gunthorpe wrote:
> The only caller is arm_smmu_install_ste_for_dev() which never has a NULL
> master. Remove the confusing if.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 519749d15fbda0..9117e769a965e1 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1269,10 +1269,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  	 */
>  	u64 val = le64_to_cpu(dst->data[0]);
>  	bool ste_live = false;
> -	struct arm_smmu_device *smmu = NULL;
> +	struct arm_smmu_device *smmu = master->smmu;
>  	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
>  	struct arm_smmu_s2_cfg *s2_cfg = NULL;
> -	struct arm_smmu_domain *smmu_domain = NULL;
> +	struct arm_smmu_domain *smmu_domain = master->domain;
>  	struct arm_smmu_cmdq_ent prefetch_cmd = {
>  		.opcode		= CMDQ_OP_PREFETCH_CFG,
>  		.prefetch	= {
> @@ -1280,11 +1280,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  		},
>  	};
>  
> -	if (master) {
> -		smmu_domain = master->domain;
> -		smmu = master->smmu;
> -	}
> -
>  	if (smmu_domain) {
>  		switch (smmu_domain->stage) {
>  		case ARM_SMMU_DOMAIN_S1:


* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-27 16:03     ` Eric Auger
  -1 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-11-27 16:03 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Hi Jason,

On 11/13/23 18:53, Jason Gunthorpe wrote:
> Instead of passing a naked __le16 * around to represent a STE wrap it in a
> "struct arm_smmu_ste" with an array of the correct size. This makes it
> much clearer which functions will comprise the "STE API".
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 54 ++++++++++-----------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  7 ++-
>  2 files changed, 32 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 7445454c2af244..519749d15fbda0 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1249,7 +1249,7 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid)
>  }
>  
>  static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
> -				      __le64 *dst)
> +				      struct arm_smmu_ste *dst)
>  {
>  	/*
>  	 * This is hideously complicated, but we only really care about
> @@ -1267,7 +1267,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  	 * 2. Write everything apart from dword 0, sync, write dword 0, sync
>  	 * 3. Update Config, sync
>  	 */
> -	u64 val = le64_to_cpu(dst[0]);
> +	u64 val = le64_to_cpu(dst->data[0]);
>  	bool ste_live = false;
>  	struct arm_smmu_device *smmu = NULL;
>  	struct arm_smmu_ctx_desc_cfg *cd_table = NULL;
> @@ -1325,10 +1325,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  		else
>  			val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>  
> -		dst[0] = cpu_to_le64(val);
> -		dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> +		dst->data[0] = cpu_to_le64(val);
> +		dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
>  						STRTAB_STE_1_SHCFG_INCOMING));
> -		dst[2] = 0; /* Nuke the VMID */
> +		dst->data[2] = 0; /* Nuke the VMID */
>  		/*
>  		 * The SMMU can perform negative caching, so we must sync
>  		 * the STE regardless of whether the old value was live.
> @@ -1343,7 +1343,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  			STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1;
>  
>  		BUG_ON(ste_live);
> -		dst[1] = cpu_to_le64(
> +		dst->data[1] = cpu_to_le64(
>  			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
>  			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
>  			 FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) |
> @@ -1352,7 +1352,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  
>  		if (smmu->features & ARM_SMMU_FEAT_STALLS &&
>  		    !master->stall_enabled)
> -			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
> +			dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
>  
>  		val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) |
>  			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) |
> @@ -1362,7 +1362,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  
>  	if (s2_cfg) {
>  		BUG_ON(ste_live);
> -		dst[2] = cpu_to_le64(
> +		dst->data[2] = cpu_to_le64(
>  			 FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) |
>  			 FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) |
>  #ifdef __BIG_ENDIAN
> @@ -1371,18 +1371,18 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  			 STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 |
>  			 STRTAB_STE_2_S2R);
>  
> -		dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
> +		dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK);
>  
>  		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS);
>  	}
>  
>  	if (master->ats_enabled)
> -		dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
> +		dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS,
>  						 STRTAB_STE_1_EATS_TRANS));
>  
>  	arm_smmu_sync_ste_for_sid(smmu, sid);
>  	/* See comment in arm_smmu_write_ctx_desc() */
> -	WRITE_ONCE(dst[0], cpu_to_le64(val));
> +	WRITE_ONCE(dst->data[0], cpu_to_le64(val));
>  	arm_smmu_sync_ste_for_sid(smmu, sid);
>  
>  	/* It's likely that we'll want to use the new STE soon */
> @@ -1390,7 +1390,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  		arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd);
>  }
>  
> -static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
> +static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab,
> +				      unsigned int nent, bool force)
>  {
>  	unsigned int i;
>  	u64 val = STRTAB_STE_0_V;
> @@ -1401,11 +1402,11 @@ static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool fo
>  		val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
>  
>  	for (i = 0; i < nent; ++i) {
> -		strtab[0] = cpu_to_le64(val);
> -		strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
> -						   STRTAB_STE_1_SHCFG_INCOMING));
> -		strtab[2] = 0;
> -		strtab += STRTAB_STE_DWORDS;
> +		strtab->data[0] = cpu_to_le64(val);
> +		strtab->data[1] = cpu_to_le64(FIELD_PREP(
> +			STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING));
> +		strtab->data[2] = 0;
> +		strtab++;
>  	}
>  }
>  
> @@ -2209,26 +2210,22 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>  	return 0;
>  }
>  
> -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> +static struct arm_smmu_ste *
> +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
>  {
> -	__le64 *step;
>  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>  
>  	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> -		struct arm_smmu_strtab_l1_desc *l1_desc;
>  		int idx;
>  
>  		/* Two-level walk */
>  		idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> -		l1_desc = &cfg->l1_desc[idx];
> -		idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
> -		step = &l1_desc->l2ptr[idx];
> +		return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
This looks less readable to me than it was before.
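
For comparison, a sketch of the same lookup keeping named index locals
(equivalent to the patch's version, not code from the series) might
read:

	static struct arm_smmu_ste *
	arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
	{
		struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;

		if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
			/* Upper SID bits select the L1 descriptor ... */
			unsigned int l1_idx = (sid >> STRTAB_SPLIT) *
					      STRTAB_L1_DESC_DWORDS;
			/* ... lower bits select the STE in its L2 table */
			unsigned int l2_idx = sid & ((1 << STRTAB_SPLIT) - 1);

			return &cfg->l1_desc[l1_idx].l2ptr[l2_idx];
		}
		/* Simple linear lookup */
		return (struct arm_smmu_ste *)&cfg->strtab[sid * STRTAB_STE_DWORDS];
	}
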
>  	} else {
>  		/* Simple linear lookup */
> -		step = &cfg->strtab[sid * STRTAB_STE_DWORDS];
> +		return (struct arm_smmu_ste *)&cfg
> +			       ->strtab[sid * STRTAB_STE_DWORDS];

Besides
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric


>  	}
> -
> -	return step;
>  }
>  
>  static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
> @@ -2238,7 +2235,8 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
>  
>  	for (i = 0; i < master->num_streams; ++i) {
>  		u32 sid = master->streams[i].id;
> -		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
> +		struct arm_smmu_ste *step =
> +			arm_smmu_get_step_for_sid(smmu, sid);
>  
>  		/* Bridged PCI devices may end up with duplicated IDs */
>  		for (j = 0; j < i; j++)
> @@ -3769,7 +3767,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
>  	iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
>  
>  	list_for_each_entry(e, &rmr_list, list) {
> -		__le64 *step;
> +		struct arm_smmu_ste *step;
>  		struct iommu_iort_rmr_data *rmr;
>  		int ret, i;
>  
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 961205ba86d25d..03f9e526cbd92f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -206,6 +206,11 @@
>  #define STRTAB_L1_DESC_L2PTR_MASK	GENMASK_ULL(51, 6)
>  
>  #define STRTAB_STE_DWORDS		8
> +
> +struct arm_smmu_ste {
> +	__le64 data[STRTAB_STE_DWORDS];
> +};
> +
>  #define STRTAB_STE_0_V			(1UL << 0)
>  #define STRTAB_STE_0_CFG		GENMASK_ULL(3, 1)
>  #define STRTAB_STE_0_CFG_ABORT		0
> @@ -571,7 +576,7 @@ struct arm_smmu_priq {
>  struct arm_smmu_strtab_l1_desc {
>  	u8				span;
>  
> -	__le64				*l2ptr;
> +	struct arm_smmu_ste		*l2ptr;
>  	dma_addr_t			l2ptr_dma;
>  };
>  


* RE: [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-11-27 16:10   ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 158+ messages in thread
From: Shameerali Kolothum Thodi @ 2023-11-27 16:10 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen



> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@nvidia.com]
> Sent: 13 November 2023 17:53
> To: iommu@lists.linux.dev; Joerg Roedel <joro@8bytes.org>;
> linux-arm-kernel@lists.infradead.org; Robin Murphy
> <robin.murphy@arm.com>; Will Deacon <will@kernel.org>
> Cc: Michael Shavit <mshavit@google.com>; Nicolin Chen
> <nicolinc@nvidia.com>; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>
> Subject: [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part
> 1/3)
> 
> The SMMUv3 driver was originally written in 2015 when the iommu driver
> facing API looked quite different. The API has evolved, especially lately,
> and the driver has fallen behind.
> 
> This work aims to bring make the SMMUv3 driver the best IOMMU driver
> with
> the most comprehensive implementation of the API. After all parts it
> addresses:
> 
>  - Global static BLOCKED and IDENTITY domains with 'never fail' attach
>    semantics. BLOCKED is desired for efficient VFIO.
> 
>  - Support map before attach for PAGING iommu_domains.
> 
>  - attach_dev failure does not change the HW configuration.
> 
>  - Fully hitless transitions between IDENTITY -> DMA -> IDENTITY.
>    The API has IOMMU_RESV_DIRECT which is expected to be
>    continuously translating.
> 
>  - Safe transitions between PAGING -> BLOCKED, do not ever temporarily
>    do IDENTITY. This is required for iommufd security.
> 
>  - Full PASID API support including:
>     - S1/SVA domains attached to PASIDs
>     - IDENTITY/BLOCKED/S1 attached to RID
>     - Change of the RID domain while PASIDs are attached
> 
>  - Streamlined SVA support using the core infrastructure
> 
>  - Hitless, whenever possible, change between two domains
> 
>  - iommufd IOMMU_GET_HW_INFO, IOMMU_HWPT_ALLOC_NEST_PARENT,
> and
>    IOMMU_DOMAIN_NESTED support
> 
> Overall these things are going to become more accessible to iommufd, and
> exposed to VMs, so it is important for the driver to have a robust
> implementation of the API.
> 
> The work is split into three parts, with this part largely focusing on the
> STE and building up to the BLOCKED & IDENTITY global static domains.
> 
> The second part largely focuses on the CD and builds up to having a common
> PASID infrastructure that SVA and S1 domains equally use.
> 
> The third part has some random cleanups and the iommufd related parts.
> 
> Overall this takes the approach of turning the STE/CD programming upside
> down where the CD/STE value is computed right at a driver callback
> function and then pushed down into programming logic. The programming
> logic hides the details of the required CD/STE tear-less update. This
> makes the CD/STE functions independent of the arm_smmu_domain which
> makes
> it fairly straightforward to untangle all the different call chains, and
> add new ones.
> 
> Further, this frees the arm_smmu_domain related logic from keeping track
> of what state the STE/CD is currently in so it can carefully sequence the
> correct update. There are many new update pairs that are subtly introduced
> as the work progresses.
> 
> The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
> now and patches throughout this work adjust and tighten this so that it is
> clearer and doesn't get broken.
> 
> Once the lower STE layers no longer need to touch arm_smmu_domain we
> can
> isolate struct arm_smmu_domain to be only used for PAGING domains, audit
> all the to_smmu_domain() calls to be only in PAGING domain ops, and
> introduce the normal global static BLOCKED/IDENTITY domains using the
> new
> STE infrastructure. Part 2 will ultimately migrate SVA over to use
> arm_smmu_domain as well.
> 
> All parts are on github:
> 
>  https://github.com/jgunthorpe/linux/commits/smmuv3_newapi

Hi Jason,

I had a go with the above branch on our HiSilicon D06 board (SMMUv3).

Basically covered the following functionality test runs:
-Host kernel: boot with DOMAIN_DMA.
-Host kernel: boot with DOMAIN_IDENTITY.
-Host kernel: ACC dev SVA test run with uadk/uadk_tool benchmark.

And with Qemu branch: https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_cdev_v7

-Guest boot with a n/w VF dev assigned, legacy VFIO mode.
-Guest boot with a n/w VF dev assigned, IOMMUFD mode.
-Device Hot plug(add/del) on both VFIO and IOMMUFD modes.

All the tests seem to be fine so far.

FWIW:
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Thanks,
Shameer


* Re: [PATCH v2 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-27 16:35     ` Eric Auger
  -1 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-11-27 16:35 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi



On 11/13/23 18:53, Jason Gunthorpe wrote:
> Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove
> it. The ongoing work to add nesting support through iommufd will do
> something a little different.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 -
>  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 9117e769a965e1..bf7218adbc2822 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1286,7 +1286,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
>  			cd_table = &master->cd_table;
>  			break;
>  		case ARM_SMMU_DOMAIN_S2:
> -		case ARM_SMMU_DOMAIN_NESTED:
>  			s2_cfg = &smmu_domain->s2_cfg;
>  			break;
>  		default:
> @@ -2167,7 +2166,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>  		fmt = ARM_64_LPAE_S1;
>  		finalise_stage_fn = arm_smmu_domain_finalise_s1;
>  		break;
> -	case ARM_SMMU_DOMAIN_NESTED:
>  	case ARM_SMMU_DOMAIN_S2:
>  		ias = smmu->ias;
>  		oas = smmu->oas;
> @@ -2735,7 +2733,7 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
>  	if (smmu_domain->smmu)
>  		ret = -EPERM;
>  	else
> -		smmu_domain->stage = ARM_SMMU_DOMAIN_NESTED;
> +		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
>  	mutex_unlock(&smmu_domain->init_mutex);
>  
>  	return ret;
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 03f9e526cbd92f..27ddf1acd12cea 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -715,7 +715,6 @@ struct arm_smmu_master {
>  enum arm_smmu_domain_stage {
>  	ARM_SMMU_DOMAIN_S1 = 0,
>  	ARM_SMMU_DOMAIN_S2,
> -	ARM_SMMU_DOMAIN_NESTED,
>  	ARM_SMMU_DOMAIN_BYPASS,
>  };
>  


* Re: [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-11-27 17:14     ` Eric Auger
  -1 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-11-27 17:14 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Hi Jason,

On 11/13/23 18:53, Jason Gunthorpe wrote:
> Introducing global statics which are of type struct iommu_domain, not
> struct arm_smmu_domain, makes it difficult to retain
> arm_smmu_master->domain, as it can no longer point to an IDENTITY or
> BLOCKED domain.
> 
> The only place that uses the value is arm_smmu_detach_dev(). Change things
> to work like other drivers and call iommu_get_domain_for_dev() to obtain
> the current domain.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

This patch introduces a crash on my machine. See below.

Eric

[    7.209854] list_del corruption, ffff007f82e5f890->next is NULL
[    7.216154] ------------[ cut here ]------------
[    7.220952] kernel BUG at lib/list_debug.c:52!
[    7.225750] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[    7.232188] Modules linked in:
[    7.235407] CPU: 3 PID: 263 Comm: kworker/u97:1 Not tainted 6.7.0-rc2-upstream+ #11
[    7.243598] Hardware name: FUJITSU FX700/CMUA        , BIOS 1.8.0 Nov 26 2021
[    7.251073] Workqueue: events_unbound deferred_probe_work_func
[    7.257452] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    7.264779] pc : __list_del_entry_valid_or_report+0x6c/0xd8
[    7.270736] lr : __list_del_entry_valid_or_report+0x6c/0xd8
[    7.276689] sp : ffff800089def970
[    7.280164] x29: ffff800089def970 x28: ffff8000820f19c0 x27: ffff007f82e01248
[    7.287869] x26: ffff007f80030f00 x25: ffff007f82e01268 x24: ffff007f82e00900
[    7.295372] x23: ffff800082580cf8 x22: 0000000000000000 x21: 0000000000000000
[    7.303075] x20: ffff007f82e009d8 x19: ffff007f82e5f880 x18: ffffffffffffffff
[    7.310773] x17: 0000000000000001 x16: 0000000000000040 x15: ffff800109def597
[    7.318254] x14: 0000000000000000 x13: 4c4c554e20736920 x12: 7478656e3e2d3039
[    7.325956] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffff80008012cee4
[    7.333463] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 00000000002bffa8
[    7.341161] x5 : 0000000000007fff x4 : 0000000000000000 x3 : 0000000000000000
[    7.348636] x2 : 0000000000000000 x1 : ffff007f824bc840 x0 : 0000000000000033
[    7.356329] Call trace:
[    7.358961]  __list_del_entry_valid_or_report+0x6c/0xd8
[    7.364556]  arm_smmu_detach_dev+0x4c/0x128
[    7.368908]  arm_smmu_attach_dev+0xe0/0x580
[    7.373475]  __iommu_attach_device+0x28/0xf8
[    7.377934]  __iommu_device_set_domain+0x74/0xd8
[    7.382900]  __iommu_probe_device+0x15c/0x270
[    7.387423]  iommu_probe_device+0x20/0x60
[    7.391819]  acpi_dma_configure_id+0xc4/0x150
[    7.396365]  pci_dma_configure+0xe8/0xf8
[    7.400671]  really_probe+0x78/0x3d0
[    7.404412]  __driver_probe_device+0x80/0x178
[    7.409153]  driver_probe_device+0x44/0x120
[    7.413499]  __device_attach_driver+0xb8/0x158
[    7.418369]  bus_for_each_drv+0x84/0xe8
[    7.422387]  __device_attach+0xac/0x1e0
[    7.426407]  device_initial_probe+0x18/0x28
[    7.430946]  bus_probe_device+0xa8/0xb8
[    7.434942]  deferred_probe_work_func+0xb8/0x110
[    7.439918]  process_one_work+0x174/0x3c8
[    7.444110]  worker_thread+0x2c8/0x3e0
[    7.448222]  kthread+0x100/0x110
[    7.451616]  ret_from_fork+0x10/0x20
[    7.455386] Code: d65f03c0 b00062a0 910d2000 97ec9112 (d4210000)
[    7.461832] ---[ end trace 0000000000000000 ]---
[    7.466827] Kernel panic - not syncing: Oops - BUG: Fatal exception
[    7.473439] SMP: stopping secondary CPUs
[    7.477729] Kernel Offset: disabled
[    7.481399] CPU features: 0x0,00000001,90028144,21017203
[    7.487061] Memory Limit: none
[    7.490280] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +++++++--------------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
>  2 files changed, 7 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 7d2dd3ea47ab68..23dda64722ea17 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2480,19 +2480,20 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
>  
>  static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>  {
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
> +	struct arm_smmu_domain *smmu_domain;
>  	unsigned long flags;
> -	struct arm_smmu_domain *smmu_domain = master->domain;
>  
> -	if (!smmu_domain)
> +	if (!domain)
>  		return;
>  
> +	smmu_domain = to_smmu_domain(domain);
>  	arm_smmu_disable_ats(master, smmu_domain);
>  
>  	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>  	list_del(&master->domain_head);
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>  
> -	master->domain = NULL;
>  	master->ats_enabled = false;
>  }
>  
> @@ -2546,8 +2547,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  
>  	arm_smmu_detach_dev(master);
>  
> -	master->domain = smmu_domain;
> -
>  	/*
>  	 * The SMMU does not support enabling ATS with bypass. When the STE is
>  	 * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
> @@ -2566,10 +2565,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  	case ARM_SMMU_DOMAIN_S1:
>  		if (!master->cd_table.cdtab) {
>  			ret = arm_smmu_alloc_cd_tables(master);
> -			if (ret) {
> -				master->domain = NULL;
> +			if (ret)
>  				goto out_list_del;
> -			}
>  		} else {
>  			/*
>  			 * arm_smmu_write_ctx_desc() relies on the entry being
> @@ -2577,17 +2574,13 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  			 */
>  			ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
>  						      NULL);
> -			if (ret) {
> -				master->domain = NULL;
> +			if (ret)
>  				goto out_list_del;
> -			}
>  		}
>  
>  		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> -		if (ret) {
> -			master->domain = NULL;
> +		if (ret)
>  			goto out_list_del;
> -		}
>  
>  		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
>  		arm_smmu_install_ste_for_dev(master, &target);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 1be0c1151c50c3..21f2f73501019a 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -695,7 +695,6 @@ struct arm_smmu_stream {
>  struct arm_smmu_master {
>  	struct arm_smmu_device		*smmu;
>  	struct device			*dev;
> -	struct arm_smmu_domain		*domain;
>  	struct list_head		domain_head;
>  	struct arm_smmu_stream		*streams;
>  	/* Locked by the iommu core using the group mutex */


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
@ 2023-11-27 17:14     ` Eric Auger
  0 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-11-27 17:14 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Hi Jason,

On 11/13/23 18:53, Jason Gunthorpe wrote:
> Introducing global statics which are of type struct iommu_domain, not
> struct arm_smmu_domain makes it difficult to retain
> arm_smmu_master->domain, as it can no longer point to an IDENTITY or
> BLOCKED domain.
> 
> The only place that uses the value is arm_smmu_detach_dev(). Change things
> to work like other drivers and call iommu_get_domain_for_dev() to obtain
> the current domain.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

This patch introduces a crash on my machine. See below.

Eric

[    7.209854] list_del corruption, ffff007f82e5f890->next is NULL
[    7.216154] ------------[ cut here ]------------
[    7.220952] kernel BUG at lib/list_debug.c:52!
[    7.225750] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[    7.232188] Modules linked in:
[    7.235407] CPU: 3 PID: 263 Comm: kworker/u97:1 Not tainted
6.7.0-rc2-upstream+ #11
[    7.243598] Hardware name: FUJITSU FX700/CMUA        , BIOS 1.8.0 Nov
26 2021
[    7.251073] Workqueue: events_unbound deferred_probe_work_func
[    7.257452] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[    7.264779] pc : __list_del_entry_valid_or_report+0x6c/0xd8
[    7.270736] lr : __list_del_entry_valid_or_report+0x6c/0xd8
[    7.276689] sp : ffff800089def970
[    7.280164] x29: ffff800089def970 x28: ffff8000820f19c0 x27:
ffff007f82e01248
[    7.287869] x26: ffff007f80030f00 x25: ffff007f82e01268 x24:
ffff007f82e00900
[    7.295372] x23: ffff800082580cf8 x22: 0000000000000000 x21:
0000000000000000
[    7.303075] x20: ffff007f82e009d8 x19: ffff007f82e5f880 x18:
ffffffffffffffff
[    7.310773] x17: 0000000000000001 x16: 0000000000000040 x15:
ffff800109def597
[    7.318254] x14: 0000000000000000 x13: 4c4c554e20736920 x12:
7478656e3e2d3039
[    7.325956] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 :
ffff80008012cee4
[    7.333463] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 :
00000000002bffa8
[    7.341161] x5 : 0000000000007fff x4 : 0000000000000000 x3 :
0000000000000000
[    7.348636] x2 : 0000000000000000 x1 : ffff007f824bc840 x0 :
0000000000000033
[    7.356329] Call trace:
[    7.358961]  __list_del_entry_valid_or_report+0x6c/0xd8
[    7.364556]  arm_smmu_detach_dev+0x4c/0x128
[    7.368908]  arm_smmu_attach_dev+0xe0/0x580
[    7.373475]  __iommu_attach_device+0x28/0xf8
[    7.377934]  __iommu_device_set_domain+0x74/0xd8
[    7.382900]  __iommu_probe_device+0x15c/0x270
[    7.387423]  iommu_probe_device+0x20/0x60
[    7.391819]  acpi_dma_configure_id+0xc4/0x150
[    7.396365]  pci_dma_configure+0xe8/0xf8
[    7.400671]  really_probe+0x78/0x3d0
[    7.404412]  __driver_probe_device+0x80/0x178
[    7.409153]  driver_probe_device+0x44/0x120
[    7.413499]  __device_attach_driver+0xb8/0x158
[    7.418369]  bus_for_each_drv+0x84/0xe8
[    7.422387]  __device_attach+0xac/0x1e0
[    7.426407]  device_initial_probe+0x18/0x28
[    7.430946]  bus_probe_device+0xa8/0xb8
[    7.434942]  deferred_probe_work_func+0xb8/0x110
[    7.439918]  process_one_work+0x174/0x3c8
[    7.444110]  worker_thread+0x2c8/0x3e0
[    7.448222]  kthread+0x100/0x110
[    7.451616]  ret_from_fork+0x10/0x20
[    7.455386] Code: d65f03c0 b00062a0 910d2000 97ec9112 (d4210000)
[    7.461832] ---[ end trace 0000000000000000 ]---
[    7.466827] Kernel panic - not syncing: Oops - BUG: Fatal exception
[    7.473439] SMP: stopping secondary CPUs
[    7.477729] Kernel Offset: disabled
[    7.481399] CPU features: 0x0,00000001,90028144,21017203
[    7.487061] Memory Limit: none
[    7.490280] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal
exception ]---

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +++++++--------------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
>  2 files changed, 7 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 7d2dd3ea47ab68..23dda64722ea17 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2480,19 +2480,20 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
>  
>  static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>  {
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev);
> +	struct arm_smmu_domain *smmu_domain;
>  	unsigned long flags;
> -	struct arm_smmu_domain *smmu_domain = master->domain;
>  
> -	if (!smmu_domain)
> +	if (!domain)
>  		return;
>  
> +	smmu_domain = to_smmu_domain(domain);
>  	arm_smmu_disable_ats(master, smmu_domain);
>  
>  	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>  	list_del(&master->domain_head);
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>  
> -	master->domain = NULL;
>  	master->ats_enabled = false;
>  }
>  
> @@ -2546,8 +2547,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  
>  	arm_smmu_detach_dev(master);
>  
> -	master->domain = smmu_domain;
> -
>  	/*
>  	 * The SMMU does not support enabling ATS with bypass. When the STE is
>  	 * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and
> @@ -2566,10 +2565,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  	case ARM_SMMU_DOMAIN_S1:
>  		if (!master->cd_table.cdtab) {
>  			ret = arm_smmu_alloc_cd_tables(master);
> -			if (ret) {
> -				master->domain = NULL;
> +			if (ret)
>  				goto out_list_del;
> -			}
>  		} else {
>  			/*
>  			 * arm_smmu_write_ctx_desc() relies on the entry being
> @@ -2577,17 +2574,13 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  			 */
>  			ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
>  						      NULL);
> -			if (ret) {
> -				master->domain = NULL;
> +			if (ret)
>  				goto out_list_del;
> -			}
>  		}
>  
>  		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> -		if (ret) {
> -			master->domain = NULL;
> +		if (ret)
>  			goto out_list_del;
> -		}
>  
>  		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
>  		arm_smmu_install_ste_for_dev(master, &target);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 1be0c1151c50c3..21f2f73501019a 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -695,7 +695,6 @@ struct arm_smmu_stream {
>  struct arm_smmu_master {
>  	struct arm_smmu_device		*smmu;
>  	struct device			*dev;
> -	struct arm_smmu_domain		*domain;
>  	struct list_head		domain_head;
>  	struct arm_smmu_stream		*streams;
>  	/* Locked by the iommu core using the group mutex */


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-27 16:03     ` Eric Auger
@ 2023-11-27 17:42       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-27 17:42 UTC (permalink / raw)
  To: Eric Auger
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 27, 2023 at 05:03:36PM +0100, Eric Auger wrote:

> > -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> > +static struct arm_smmu_ste *
> > +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> >  {
> > -	__le64 *step;
> >  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> >  
> >  	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> > -		struct arm_smmu_strtab_l1_desc *l1_desc;
> >  		int idx;
> >  
> >  		/* Two-level walk */
> >  		idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> > -		l1_desc = &cfg->l1_desc[idx];
> > -		idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
> > -		step = &l1_desc->l2ptr[idx];
> > +		return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
> This looks less readable to me than it was before.

You would like the idx calculation outside?

  		idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
		idx2 = sid & ((1 << STRTAB_SPLIT) - 1);
		return &cfg->l1_desc[idx1].l2ptr[idx2];

?

Thanks,
Jason
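
For reference, the two indices just split the SID: the high bits select
the L1 descriptor and the low STRTAB_SPLIT bits select the STE inside
that descriptor's L2 array. A minimal standalone sketch of the split
(the STRTAB_SPLIT value is illustrative and the STRTAB_L1_DESC_DWORDS
scaling is omitted for clarity; this is not the driver code):

	#include <stdio.h>

	#define STRTAB_SPLIT	8	/* assumed: low 8 SID bits index the L2 table */

	int main(void)
	{
		unsigned int sid = 0x1234;
		unsigned int idx1 = sid >> STRTAB_SPLIT;		/* L1 descriptor: 0x12 */
		unsigned int idx2 = sid & ((1 << STRTAB_SPLIT) - 1);	/* STE in L2: 0x34 */

		printf("sid 0x%x -> l1_desc[0x%x].l2ptr[0x%x]\n", sid, idx1, idx2);
		return 0;
	}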

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
  2023-11-27 16:10   ` Shameerali Kolothum Thodi
@ 2023-11-27 17:48     ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-27 17:48 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen

On Mon, Nov 27, 2023 at 04:10:33PM +0000, Shameerali Kolothum Thodi wrote:
> 
> I had a go with the above branch on our HiSilicon D06 board (SMMUv3).
> 
> Basically covered the following functionality test runs:
> -Host kernel: boot with DOMAIN_DMA.
> -Host kernel: boot with DOMAIN_IDENTITY.
> -Host kernel: ACC dev SVA test run with uadk/uadk_tool benchmark.
> 
> And with Qemu branch: https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_cdev_v7
> 
> -Guest boot with a n/w VF dev assigned, legacy VFIO mode.
> -Guest boot with a n/w VF dev assigned, IOMMUFD mode.
> -Device Hot plug(add/del) on both VFIO and IOMMUFD modes.
> 
> All the tests seem to be fine so far.
> 
> FWIW:
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Great, thanks!

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-27 17:42       ` Jason Gunthorpe
@ 2023-11-27 17:51         ` Eric Auger
  -1 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-11-27 17:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Hi Jason,

On 11/27/23 18:42, Jason Gunthorpe wrote:
> On Mon, Nov 27, 2023 at 05:03:36PM +0100, Eric Auger wrote:
> 
>>> -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
>>> +static struct arm_smmu_ste *
>>> +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
>>>  {
>>> -	__le64 *step;
>>>  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
>>>  
>>>  	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
>>> -		struct arm_smmu_strtab_l1_desc *l1_desc;
>>>  		int idx;
>>>  
>>>  		/* Two-level walk */
>>>  		idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
>>> -		l1_desc = &cfg->l1_desc[idx];
>>> -		idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
>>> -		step = &l1_desc->l2ptr[idx];
>>> +		return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
>> This looks less readable to me than it was before.
> 
> You would like the idx calculation outside?
> 
>   		idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> 		idx2 = sid & ((1 << STRTAB_SPLIT) - 1);
> 		return &cfg->l1_desc[idx1].l2ptr[idx2];
Yes this looks more readable to me

Eric
> 
> ?
> 
> Thanks,
> Jason
> 


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-27 17:51         ` Eric Auger
@ 2023-11-27 18:21           ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-27 18:21 UTC (permalink / raw)
  To: Eric Auger
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 27, 2023 at 06:51:21PM +0100, Eric Auger wrote:
> Hi Jason,
> 
> On 11/27/23 18:42, Jason Gunthorpe wrote:
> > On Mon, Nov 27, 2023 at 05:03:36PM +0100, Eric Auger wrote:
> > 
> >>> -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> >>> +static struct arm_smmu_ste *
> >>> +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
> >>>  {
> >>> -	__le64 *step;
> >>>  	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> >>>  
> >>>  	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> >>> -		struct arm_smmu_strtab_l1_desc *l1_desc;
> >>>  		int idx;
> >>>  
> >>>  		/* Two-level walk */
> >>>  		idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> >>> -		l1_desc = &cfg->l1_desc[idx];
> >>> -		idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS;
> >>> -		step = &l1_desc->l2ptr[idx];
> >>> +		return &cfg->l1_desc[idx].l2ptr[sid & ((1 << STRTAB_SPLIT) - 1)];
> >> This looks less readable to me than it was before.
> > 
> > You would like the idx calculation outside?
> > 
> >   		idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS;
> > 		idx2 = sid & ((1 << STRTAB_SPLIT) - 1);
> > 		return &cfg->l1_desc[idx1].l2ptr[idx2];
> Yes this looks more readable to me

done

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
  2023-11-27 17:14     ` Eric Auger
@ 2023-11-30 12:03       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-11-30 12:03 UTC (permalink / raw)
  To: Eric Auger
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

On Mon, Nov 27, 2023 at 06:14:30PM +0100, Eric Auger wrote:
> Hi Jason,
> 
> On 11/13/23 18:53, Jason Gunthorpe wrote:
> > Introducing global statics which are of type struct iommu_domain, not
> > struct arm_smmu_domain, makes it difficult to retain
> > arm_smmu_master->domain, as it can no longer point to an IDENTITY or
> > BLOCKED domain.
> > 
> > The only place that uses the value is arm_smmu_detach_dev(). Change things
> > to work like other drivers and call iommu_get_domain_for_dev() to obtain
> > the current domain.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> This patch introduces a crash on my machine. See below.

Ah, your system must have multi-device groups.

The master->domain was subtly protecting the domain_head to ensure
we don't touch it unless it is already in a domain list. This issue is
solved in part 2 (iommu/arm-smmu-v3: Make smmu_domain->devices into an
allocated list) which removes the domain_head.

This hunk should fix this patch. I updated the github

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 23dda64722ea17..102e13b65bcdec 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2491,7 +2491,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 	arm_smmu_disable_ats(master, smmu_domain);
 
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
-	list_del(&master->domain_head);
+	list_del_init(&master->domain_head);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
 	master->ats_enabled = false;
@@ -2606,7 +2606,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 out_list_del:
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
-	list_del(&master->domain_head);
+	list_del_init(&master->domain_head);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
 out_unlock:
@@ -2810,6 +2810,7 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
 	master->dev = dev;
 	master->smmu = smmu;
 	INIT_LIST_HEAD(&master->bonds);
+	INIT_LIST_HEAD(&master->domain_head);
 	dev_iommu_priv_set(dev, master);
 
 	ret = arm_smmu_insert_master(smmu, master);



Thanks!!
Jason
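
For the record, the BUG above fires because list_del() runs on a
domain_head that was never initialized or list_add()ed, and
CONFIG_DEBUG_LIST validates the neighbour pointers before unlinking.
With INIT_LIST_HEAD() at probe time and list_del_init() on the teardown
paths, an unattached domain_head is self-linked and deleting it is a
no-op. A minimal illustrative sketch of that pattern (not the driver
code):

	#include <linux/list.h>

	static void sketch(void)
	{
		struct list_head node;

		INIT_LIST_HEAD(&node);	/* node.next == node.prev == &node */

		/*
		 * A self-linked node passes the CONFIG_DEBUG_LIST checks
		 * (prev->next == entry && next->prev == entry), and
		 * list_del_init() leaves it self-linked again, so deleting
		 * a node that was never list_add()ed, or deleting it twice,
		 * is harmless. Plain list_del() poisons the pointers, so a
		 * second delete would trip the same BUG.
		 */
		list_del_init(&node);
		list_del_init(&node);
	}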

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  0:44     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  0:44 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:08PM -0400, Jason Gunthorpe wrote:
> Instead of passing a naked __le64 * around to represent a STE, wrap it in a
> "struct arm_smmu_ste" with an array of the correct size. This makes it
> much clearer which functions will comprise the "STE API".
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  0:45     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  0:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:09PM -0400, Jason Gunthorpe wrote:
> The only caller is arm_smmu_install_ste_for_dev() which never has a NULL
> master. Remove the confusing if.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  0:46     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  0:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:10PM -0400, Jason Gunthorpe wrote:
> Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove
> it. The ongoing work to add nesting support through iommufd will do
> something a little different.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  1:38     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  1:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:11PM -0400, Jason Gunthorpe wrote:
> As the comment in arm_smmu_write_strtab_ent() explains, this routine has
> been limited to only work correctly in certain scenarios that the caller
> must ensure. Generally the caller must put the STE into ABORT or BYPASS
> before attempting to program it to something else.
> 
> The next patches/series are going to start removing some of this logic
> from the callers, and add more complex state combinations than currently.
> 
> Thus, consolidate all the complexity here. Callers do not have to care
> about what STE transition they are doing, this function will handle
> everything optimally.
> 
> Revise arm_smmu_write_strtab_ent() so it algorithmically computes the
> required programming sequence to avoid creating an incoherent 'torn' STE
> in the HW caches. The update algorithm follows the same design that the
> driver already uses: it is safe to change bits that HW doesn't currently
> use and then do a single 64 bit update, with syncs in between.
> 
> The basic idea is to express in a bitmask what bits the HW is actually
> using based on the V and CFG bits. Based on that mask we know what STE
> changes are safe and which are disruptive. We can count how many 64 bit
> QWORDS need a disruptive update and know if a step with V=0 is required.
> 
> This gives two basic flows through the algorithm.
> 
> If only a single 64 bit quantity needs disruptive replacement:
>  - Write the target value into all currently unused bits
>  - Write the single 64 bit quantity
>  - Zero the remaining different bits
> 
> If multiple 64 bit quantities need disruptive replacement then do:
>  - Write V=0 to QWORD 0
>  - Write the entire STE except QWORD 0
>  - Write QWORD 0
> 
> With HW flushes at each step, which can be skipped if the STE didn't change
> in that step.
> 
> At this point it generates the same sequence of updates as the current
> code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
> extra sync (this seems to be an existing bug).
> 
> Going forward this will use a V=0 transition instead of cycling through
> ABORT if a hitful change is required. This seems more appropriate as ABORT
> will fail DMAs without any logging, but dropping a DMA due to transient
> V=0 is probably signaling a bug, so the C_BAD_STE is valuable.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
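
To make the used-bits idea concrete, a condensed sketch of the counting
step described above (the qword count, types, and names are illustrative
assumptions, not the driver's interfaces):

	#include <stdint.h>

	#define STE_QWORDS 8

	/*
	 * used[] marks the bits HW actually reads given the current V and
	 * CFG fields. A qword only needs a disruptive update when the
	 * change touches a currently-live bit.
	 */
	static unsigned int num_disruptive(const uint64_t *cur,
					   const uint64_t *target,
					   const uint64_t *used)
	{
		unsigned int i, n = 0;

		for (i = 0; i != STE_QWORDS; i++)
			if ((cur[i] ^ target[i]) & used[i])
				n++;
		return n;
	}

n <= 1 takes the first flow (fill in unused bits, flip the single live
qword, zero the leftovers); n > 1 takes the V=0 flow, with a sync after
each step that changed the STE.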

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  1:43     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  1:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:12PM -0400, Jason Gunthorpe wrote:
> This allows writing the flow of arm_smmu_write_strtab_ent() around abort
> and bypass domains more naturally.
> 
> Note that the core code no longer supplies NULL domains, though there is
> still a flow in the driver that ends up in arm_smmu_write_strtab_ent() with
> NULL. A later patch will remove it.
> 
> Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes()
> and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now
> simply invoke arm_smmu_make_bypass_ste() directly.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  1:45     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  1:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:13PM -0400, Jason Gunthorpe wrote:
> Logically arm_smmu_init_strtab_linear() is the function that allocates and
> populates the stream table with the initial value of the STEs. After this
> function returns the stream table should be fully ready.
> 
> arm_smmu_rmr_install_bypass_ste() adjusts the initial stream table to force
> any SIDs that the FW says have IOMMU_RESV_DIRECT to use bypass. This
> ensures there is no disruption to the identity mapping during boot.
> 
> Put arm_smmu_rmr_install_bypass_ste() into arm_smmu_init_strtab_linear();
> it already executes immediately after arm_smmu_init_strtab_linear().
> 
> No functional change intended.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  1:55     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  1:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:14PM -0400, Jason Gunthorpe wrote:
> This is preparation to move the STE calculation higher up into the call
> chain and remove arm_smmu_write_strtab_ent(). These new functions will be
> called directly from attach_dev.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

> +static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> +				      struct arm_smmu_master *master,
> +				      struct arm_smmu_ctx_desc_cfg *cd_table)
> +{
> +	struct arm_smmu_device *smmu = master->smmu;
> +
> +	memset(target, 0, sizeof(*target));
> +	target->data[0] = cpu_to_le64(

Nit: can add a line in-between like arm_smmu_make_s2_domain_ste does?

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

> +static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target,
> +					struct arm_smmu_master *master,
> +					struct arm_smmu_domain *smmu_domain)
> +{
> +	struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg;
> +
> +	memset(target, 0, sizeof(*target));
> +
> +	target->data[0] = cpu_to_le64(

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  1:58     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  1:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:15PM -0400, Jason Gunthorpe wrote:
> Half the code was living in arm_smmu_domain_finalise_s2(); just move it
> here and take the values directly from the pgtbl_ops instead of storing
> copies.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  2:13     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  2:13 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:17PM -0400, Jason Gunthorpe wrote:
> Currently arm_smmu_install_ste_for_dev() iterates over every SID and
> computes from scratch an identical STE. Every SID should have the same STE
> contents. Turn this inside out so that the STE is supplied by the caller
> and arm_smmu_install_ste_for_dev() simply installs it to every SID.
> 
> This is possible now that the STE generation does not inform what sequence
> should be used to program it.
> 
> This allows splitting the STE calculation up according to the call site,
> which following patches will make use of, and removes the confusing NULL
> domain special case that only supported arm_smmu_detach_dev().
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  2:16     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  2:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:16PM -0400, Jason Gunthorpe wrote:
> The BTM support wants to be able to change the ASID of any smmu_domain.
> When it goes to do this it holds the arm_smmu_asid_lock and iterates over
> the target domain's devices list.
> 
> During attach of a S1 domain we must ensure that the devices list and
> CD are in sync, otherwise we could miss CD updates or a parallel CD update
> could push an out of date CD.
> 
> This is pretty complicated, and works today because arm_smmu_detach_dev()
> removes the CD table from the STE before working on the CD entries.
> 
> The next patch will allow the CD table to remain in the STE, so solve this
> race by holding the lock for a longer period. The lock covers both of the
> changes to the device list and the CD table entries.
> 
> Move arm_smmu_detach_dev() till after we have initialized the domain so
> the lock can be held for less time.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  2:46     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  2:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:18PM -0400, Jason Gunthorpe wrote:
> This was needed because the STE code required the STE to be in
> ABORT/BYPASS in order to program a cdtable or S2 STE. Now that the STE code
> can automatically handle all transitions we can remove this step
> from the attach_dev flow.
> 
> A few small bugs exist because of this:
> 
> 1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
>    then there will be a moment where the STE points at BYPASS. Since
>    this can be done by VFIO/IOMMUFD it is a small security race.
> 
> 2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
>    regions will temporarily become BLOCKED. We'd like drivers to
>    work in a way that allows IOMMU_RESV_DIRECT to be continuously
>    functional during these transitions.
> 
> Make arm_smmu_release_device() put the STE back to the correct
> ABORT/BYPASS setting. Fix a bug where an IOMMU_RESV_DIRECT was ignored on
> this path.
> 
> Notice this subtly depends on the prior arm_smmu_asid_lock change as the
> STE must be put to non-paging before removing the device from the linked
> list to avoid races with arm_smmu_share_asid().
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  3:38     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  3:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:19PM -0400, Jason Gunthorpe wrote:
> Get closer to the IOMMU API ideal that changes between domains can be
> hitless. The ordering for the CD table entry is not entirely clean from
> this perspective.
> 
> When switching away from a STE with a CD table programmed in it we should
> write the new STE first, then clear any old data in the CD entry.
> 
> If we are programming a CD table for the first time to a STE then the CD
> entry should be programmed before the STE is loaded.
> 
> If we are replacing a CD table entry when the STE already points at the CD
> entry then we just need to do the make/break sequence.
> 
> Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
> properly. The only other caller is arm_smmu_release_device() and it is
> going to free the cdtable anyhow, so it doesn't matter what is in it.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
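
The ordering rules above boil down to never letting the HW observe a STE
that references CD data which is not yet, or no longer, valid. A hedged
sketch of the two directions (types and helpers are placeholders, not
the driver's API):

	struct ste; struct cd;
	void install_ste(const struct ste *s);
	void write_cd(struct cd *slot, const struct cd *val);
	void clear_cd(struct cd *slot);

	static void switch_away_from_cd_table(const struct ste *new_ste,
					      struct cd *cd0)
	{
		install_ste(new_ste);	/* HW stops referencing the CD table */
		clear_cd(cd0);		/* only then wipe the stale CD data */
	}

	static void first_attach_of_cd_table(const struct ste *cdtab_ste,
					     struct cd *cd0,
					     const struct cd *new_cd)
	{
		write_cd(cd0, new_cd);	/* CD must be valid before ... */
		install_ste(cdtab_ste);	/* ... the STE makes HW walk it */
	}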

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part 1/3)
  2023-11-13 17:53 ` Jason Gunthorpe
@ 2023-12-05  3:54   ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  3:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:07PM -0400, Jason Gunthorpe wrote:
 
> Overall this takes the approach of turning the STE/CD programming upside
> down where the CD/STE value is computed right at a driver callback
> function and then pushed down into programming logic. The programming
> logic hides the details of the required CD/STE tear-less update. This
> makes the CD/STE functions independent of the arm_smmu_domain which makes
> it fairly straightforward to untangle all the different call chains, and
> add news ones.
> 
> Further, this frees the arm_smmu_domain related logic from keeping track
> of what state the STE/CD is currently in so it can carefully sequence the
> correct update. There are many new update pairs that are subtly introduced
> as the work progresses.
> 
> The locking to support BTM via arm_smmu_asid_lock is a bit subtle right
> now and patches throughout this work adjust and tighten this so that it is
> clearer and doesn't get broken.
> 
> Once the lower STE layers no longer need to touch arm_smmu_domain we can
> isolate struct arm_smmu_domain to be only used for PAGING domains, audit
> all the to_smmu_domain() calls to be only in PAGING domain ops, and
> introduce the normal global static BLOCKED/IDENTITY domains using the new
> STE infrastructure. Part 2 will ultimately migrate SVA over to use
> arm_smmu_domain as well.
> 
> All parts are on github:
> 
>  https://github.com/jgunthorpe/linux/commits/smmuv3_newapi

Ran sanity with part-1 alone, covering S1 Translate and SVA cases.

And ran more functional tests with part-2 and part-3 (nested) in
the repo above, covering host-level S1 Translate, S1DSS_BYPASS +
SVA, and guest VM (stage-2 alone, and stage-1+2 with S1DSS_SSID0
and S1DSS_BYPASS). These should test quite a list of combinations
against the STE updating algorithm.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  3:56     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  3:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:20PM -0400, Jason Gunthorpe wrote:
> The caller already has the domain, just pass it in. A following patch will
> remove master->domain.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  4:05     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  4:05 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:23PM -0400, Jason Gunthorpe wrote:
> Using the same design as the IDENTITY domain, install an
> STRTAB_STE_0_CFG_ABORT STE.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
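
For context, an abort STE really is this small. A sketch using the
driver's field macros, mirroring the arm_smmu_make_abort_ste() helper the
series adds (reproduced here only as an illustration, not copied from the
patch):

	static void make_abort_ste_sketch(struct arm_smmu_ste *target)
	{
		/* V=1 with Config=abort: all transactions for the
		 * stream are terminated. */
		memset(target, 0, sizeof(*target));
		target->data[0] = cpu_to_le64(
			STRTAB_STE_0_V |
			FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT));
	}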
 
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  4:07     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  4:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:24PM -0400, Jason Gunthorpe wrote:
> Consolidate some more code by having release call
> arm_smmu_attach_dev_identity/blocked() instead of open coding this.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  4:28     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  4:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:22PM -0400, Jason Gunthorpe wrote:
> @@ -2592,13 +2578,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
>  						      NULL);
>  		break;
> -	case ARM_SMMU_DOMAIN_BYPASS:
> -		arm_smmu_make_bypass_ste(&target);
> -		arm_smmu_install_ste_for_dev(master, &target);
> -		if (master->cd_table.cdtab)
> -			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> -						      NULL);
> -		break;
>  	}
>  
>  	arm_smmu_enable_ats(master, smmu_domain);
> @@ -2614,6 +2593,60 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  	return ret;
>  }
>  
> +static int arm_smmu_attach_dev_ste(struct device *dev,
> +				   struct arm_smmu_ste *ste)
> +{
> +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> +
> +	if (arm_smmu_master_sva_enabled(master))
> +		return -EBUSY;
> +
> +	/*
> +	 * Do not allow any ASID to be changed while are working on the STE,
> +	 * otherwise we could miss invalidations.
> +	 */
> +	mutex_lock(&arm_smmu_asid_lock);
> +
> +	/*
> +	 * The SMMU does not support enabling ATS with bypass/abort. When the
> +	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> +	 * and Translated transactions are denied as though ATS is disabled for
> +	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> +	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> +	 */
> +	arm_smmu_detach_dev(master);
> +
> +	arm_smmu_install_ste_for_dev(master, ste);
> +	mutex_unlock(&arm_smmu_asid_lock);
> +
> +	/*
> +	 * This has to be done after removing the master from the
> +	 * arm_smmu_domain->devices to avoid races updating the same context
> +	 * descriptor from arm_smmu_share_asid().
> +	 */
> +	if (master->cd_table.cdtab)
> +		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
 
This arm_smmu_write_ctx_desc() call was previously within the asid
lock protection, yet now it's moved out of it?

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging()
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  4:40     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  4:40 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:26PM -0400, Jason Gunthorpe wrote:
> Now that the BLOCKED and IDENTITY behaviors are managed with their own
> domains, change to the domain_alloc_paging() op.
> 
> For now SVA remains on the old interface; eventually it will get its
> own op that can pass in the device and mm_struct, which will let us have
> a sane lifetime for the mmu_notifier.
> 
> Call arm_smmu_domain_finalise() early if dev is available.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
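
For context, a rough sketch of the op shape being adopted here; this is a
simplification with details elided, not the patch body itself:

	static struct iommu_domain *
	arm_smmu_domain_alloc_paging_sketch(struct device *dev)
	{
		struct arm_smmu_domain *smmu_domain;

		smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
		if (!smmu_domain)
			return NULL;

		/* ... common domain init ... */

		if (dev) {
			struct arm_smmu_master *master = dev_iommu_priv_get(dev);

			/* Finalise early when the device (and thus the
			 * smmu) is already known. */
			if (arm_smmu_domain_finalise(smmu_domain, master->smmu)) {
				kfree(smmu_domain);
				return NULL;
			}
		}
		return &smmu_domain->domain;
	}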

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  4:42     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  4:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:25PM -0400, Jason Gunthorpe wrote:
> Instead of putting container_of() casts in the internals, use the proper
> type in this call chain. This makes it easier to check that the two global
> static domains are not leaking into call chains they should not.
> 
> Passing the smmu avoids the only caller having to set it and unset it
> in the error path.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
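
For reference, the cast in question is the driver's existing helper:

	static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
	{
		return container_of(dom, struct arm_smmu_domain, domain);
	}

Since the global static BLOCKED/IDENTITY domains are bare struct
iommu_domain objects, calling this on them would be invalid, which is why
the proper types are passed down instead.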

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
  2023-11-13 17:53   ` Jason Gunthorpe
@ 2023-12-05  4:47     ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05  4:47 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Nov 13, 2023 at 01:53:21PM -0400, Jason Gunthorpe wrote:
> Introducing global statics which are of type struct iommu_domain, not
> struct arm_smmu_domain makes it difficult to retain
> arm_smmu_master->domain, as it can no longer point to an IDENTITY or
> BLOCKED domain.
> 
> The only place that uses the value is arm_smmu_detach_dev(). Change things
> to work like other drivers and call iommu_get_domain_for_dev() to obtain
> the current domain.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
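
A rough sketch of the resulting pattern (hypothetical simplification, not
the actual hunk):

	static void detach_dev_sketch(struct arm_smmu_master *master)
	{
		struct iommu_domain *domain =
			iommu_get_domain_for_dev(master->dev);
		struct arm_smmu_domain *smmu_domain;

		/* IDENTITY/BLOCKED are bare iommu_domains; only paging
		 * domains carry arm_smmu_domain state to tear down. */
		if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING))
			return;

		smmu_domain = to_smmu_domain(domain);
		/* ... remove the master from smmu_domain->devices,
		 * disable ATS ... */
	}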

The new version on github looks good to me.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain
  2023-11-30 12:03       ` Jason Gunthorpe
@ 2023-12-05 13:25         ` Eric Auger
  -1 siblings, 0 replies; 158+ messages in thread
From: Eric Auger @ 2023-12-05 13:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Nicolin Chen, Shameerali Kolothum Thodi

Hi Jason,

On 11/30/23 13:03, Jason Gunthorpe wrote:
> On Mon, Nov 27, 2023 at 06:14:30PM +0100, Eric Auger wrote:
>> Hi Jason,
>>
>> On 11/13/23 18:53, Jason Gunthorpe wrote:
>>> Introducing global statics which are of type struct iommu_domain, not
>>> struct arm_smmu_domain makes it difficult to retain
>>> arm_smmu_master->domain, as it can no longer point to an IDENTITY or
>>> BLOCKED domain.
>>>
>>> The only place that uses the value is arm_smmu_detach_dev(). Change things
>>> to work like other drivers and call iommu_get_domain_for_dev() to obtain
>>> the current domain.
>>>
>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>
>> This patch introduces a crash on my machine. See below.
> 
> Ah, your system must have multi-device groups
> 
> The master->domain was subtly protecting the domain_head to ensure
> we don't touch it unless it is already in a domain list. This issue is
> solved in part 2 (iommu/arm-smmu-v3: Make smmu_domain->devices into an
> allocated list) which removes the domain_head.
> 
> This hunk should fix this patch. I updated the github.

I confirm that the hunk hereafter fixes the crash.
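
For anyone wondering why list_del_init() matters here: list_del() poisons
the entry, so touching a domain_head that was never attached (or already
detached) dereferences garbage, whereas list_del_init() leaves the entry
self-pointing and safe to operate on again. A small sketch of the
semantics, assuming only the standard list helpers:

	struct list_head domain_head;

	INIT_LIST_HEAD(&domain_head);	/* as the hunk adds in probe_device */

	/* Detach before any attach: the entry points at itself, so this
	 * is a harmless no-op... */
	list_del_init(&domain_head);

	/* ...whereas plain list_del() on an uninitialized or poisoned
	 * entry dereferences garbage, which is the crash seen here. */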

Thanks

Eric
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 23dda64722ea17..102e13b65bcdec 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2491,7 +2491,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master)
>  	arm_smmu_disable_ats(master, smmu_domain);
>  
>  	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> -	list_del(&master->domain_head);
> +	list_del_init(&master->domain_head);
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>  
>  	master->ats_enabled = false;
> @@ -2606,7 +2606,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  
>  out_list_del:
>  	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
> -	list_del(&master->domain_head);
> +	list_del_init(&master->domain_head);
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>  
>  out_unlock:
> @@ -2810,6 +2810,7 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev)
>  	master->dev = dev;
>  	master->smmu = smmu;
>  	INIT_LIST_HEAD(&master->bonds);
> +	INIT_LIST_HEAD(&master->domain_head);
>  	dev_iommu_priv_set(dev, master);
>  
>  	ret = arm_smmu_insert_master(smmu, master);
> 
> 
> 
> Thanks!!
> Jason
> 


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
  2023-12-05  1:55     ` Nicolin Chen
@ 2023-12-05 14:35       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 14:35 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Dec 04, 2023 at 05:55:03PM -0800, Nicolin Chen wrote:
> On Mon, Nov 13, 2023 at 01:53:14PM -0400, Jason Gunthorpe wrote:
> > This is preparation to move the STE calculation higher up into the call
> > chain and remove arm_smmu_write_strtab_ent(). These new functions will be
> > called directly from attach_dev.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> > +static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target,
> > +				      struct arm_smmu_master *master,
> > +				      struct arm_smmu_ctx_desc_cfg *cd_table)
> > +{
> > +	struct arm_smmu_device *smmu = master->smmu;
> > +
> > +	memset(target, 0, sizeof(*target));
> > +	target->data[0] = cpu_to_le64(
> 
> Nit: can add a line in-between like arm_smmu_make_s2_domain_ste does?

I removed the blank line since the arm_smmu_make_abort_ste/bypass
functions also had no such line.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-12-05  4:28     ` Nicolin Chen
@ 2023-12-05 14:37       ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 14:37 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Mon, Dec 04, 2023 at 08:28:23PM -0800, Nicolin Chen wrote:

> > +static int arm_smmu_attach_dev_ste(struct device *dev,
> > +				   struct arm_smmu_ste *ste)
> > +{
> > +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > +
> > +	if (arm_smmu_master_sva_enabled(master))
> > +		return -EBUSY;
> > +
> > +	/*
> > +	 * Do not allow any ASID to be changed while are working on the STE,
> > +	 * otherwise we could miss invalidations.
> > +	 */
> > +	mutex_lock(&arm_smmu_asid_lock);
> > +
> > +	/*
> > +	 * The SMMU does not support enabling ATS with bypass/abort. When the
> > +	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> > +	 * and Translated transactions are denied as though ATS is disabled for
> > +	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> > +	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> > +	 */
> > +	arm_smmu_detach_dev(master);
> > +
> > +	arm_smmu_install_ste_for_dev(master, ste);
> > +	mutex_unlock(&arm_smmu_asid_lock);
> > +
> > +	/*
> > +	 * This has to be done after removing the master from the
> > +	 * arm_smmu_domain->devices to avoid races updating the same context
> > +	 * descriptor from arm_smmu_share_asid().
> > +	 */
> > +	if (master->cd_table.cdtab)
> > +		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
>  
> This arm_smmu_write_ctx_desc was previously within the asid lock
> protection, yet now it's moved out of that?

Yes, arm_smmu_write_ctx_desc() updates a CD table entry and that does
not need ASID lock protection. The ASID lock exists because of the BTM
code rewriting STEs asynchronously.

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-12-05 14:37       ` Jason Gunthorpe
@ 2023-12-05 17:25         ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05 17:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Tue, Dec 05, 2023 at 10:37:42AM -0400, Jason Gunthorpe wrote:
> On Mon, Dec 04, 2023 at 08:28:23PM -0800, Nicolin Chen wrote:
> 
> > > +static int arm_smmu_attach_dev_ste(struct device *dev,
> > > +				   struct arm_smmu_ste *ste)
> > > +{
> > > +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > > +
> > > +	if (arm_smmu_master_sva_enabled(master))
> > > +		return -EBUSY;
> > > +
> > > +	/*
> > > +	 * Do not allow any ASID to be changed while are working on the STE,
> > > +	 * otherwise we could miss invalidations.
> > > +	 */
> > > +	mutex_lock(&arm_smmu_asid_lock);
> > > +
> > > +	/*
> > > +	 * The SMMU does not support enabling ATS with bypass/abort. When the
> > > +	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> > > +	 * and Translated transactions are denied as though ATS is disabled for
> > > +	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> > > +	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> > > +	 */
> > > +	arm_smmu_detach_dev(master);
> > > +
> > > +	arm_smmu_install_ste_for_dev(master, ste);
> > > +	mutex_unlock(&arm_smmu_asid_lock);
> > > +
> > > +	/*
> > > +	 * This has to be done after removing the master from the
> > > +	 * arm_smmu_domain->devices to avoid races updating the same context
> > > +	 * descriptor from arm_smmu_share_asid().
> > > +	 */
> > > +	if (master->cd_table.cdtab)
> > > +		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
> >  
> > This arm_smmu_write_ctx_desc was previously within the asid lock
> > protection, yet now it's moved out of that?
> 
> Yes, arm_smmu_write_ctx_desc() updates a CD table entry and that does
> not need ASID lock protection. The ASID lock exists because of the BTM
> code rewriting STEs asynchronously.

I see. Thanks for elaborating. For this patch:

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-12-05 17:25         ` Nicolin Chen
@ 2023-12-05 17:42           ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 17:42 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Tue, Dec 05, 2023 at 09:25:59AM -0800, Nicolin Chen wrote:
> On Tue, Dec 05, 2023 at 10:37:42AM -0400, Jason Gunthorpe wrote:
> > On Mon, Dec 04, 2023 at 08:28:23PM -0800, Nicolin Chen wrote:
> > 
> > > > +static int arm_smmu_attach_dev_ste(struct device *dev,
> > > > +				   struct arm_smmu_ste *ste)
> > > > +{
> > > > +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > > > +
> > > > +	if (arm_smmu_master_sva_enabled(master))
> > > > +		return -EBUSY;
> > > > +
> > > > +	/*
> > > > +	 * Do not allow any ASID to be changed while are working on the STE,
> > > > +	 * otherwise we could miss invalidations.
> > > > +	 */
> > > > +	mutex_lock(&arm_smmu_asid_lock);
> > > > +
> > > > +	/*
> > > > +	 * The SMMU does not support enabling ATS with bypass/abort. When the
> > > > +	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> > > > +	 * and Translated transactions are denied as though ATS is disabled for
> > > > +	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> > > > +	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> > > > +	 */
> > > > +	arm_smmu_detach_dev(master);
> > > > +
> > > > +	arm_smmu_install_ste_for_dev(master, ste);
> > > > +	mutex_unlock(&arm_smmu_asid_lock);
> > > > +
> > > > +	/*
> > > > +	 * This has to be done after removing the master from the
> > > > +	 * arm_smmu_domain->devices to avoid races updating the same context
> > > > +	 * descriptor from arm_smmu_share_asid().
> > > > +	 */
> > > > +	if (master->cd_table.cdtab)
> > > > +		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
> > >  
> > > This arm_smmu_write_ctx_desc was previously within the asid lock
> > > protection, yet now it's moved out of that?
> > 
> > Yes, arm_smmu_write_ctx_desc() updates a CD table entry and that does
> > not need ASID lock protection. The ASID lock exists because of the BTM
> > code rewriting STEs asyncronously.
> 
> I see. Thanks for elaborating. For this patch

Actually wait, that explanation is not right...

The BTM code is changing the ASID, which is done with a CD update.

The ordering is OK here because the BTM code iterates over the
&smmu_domain->devices list.

arm_smmu_detach_dev() has removed the master from the devices list
under a lock, so the BTM code won't see this master.

Thus there is no race between the arm_smmu_share_asid() flow and this
code; indeed, we've already removed the cdtable from the STE at this
point.
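
To spell the ordering out against the quoted hunk (the calls are from the
patch, the annotations are mine):

	mutex_lock(&arm_smmu_asid_lock);

	/* Removes the master from smmu_domain->devices under
	 * devices_lock, so arm_smmu_share_asid(), which walks that
	 * list, can no longer reach this master. */
	arm_smmu_detach_dev(master);

	/* The STE no longer references the CD table either. */
	arm_smmu_install_ste_for_dev(master, ste);
	mutex_unlock(&arm_smmu_asid_lock);

	/* Hence clearing the CD entry here cannot race with a
	 * BTM-driven ASID/CD rewrite. */
	if (master->cd_table.cdtab)
		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);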

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-12-05 17:42           ` Jason Gunthorpe
@ 2023-12-05 18:21             ` Nicolin Chen
  -1 siblings, 0 replies; 158+ messages in thread
From: Nicolin Chen @ 2023-12-05 18:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Tue, Dec 05, 2023 at 01:42:19PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 05, 2023 at 09:25:59AM -0800, Nicolin Chen wrote:
> > On Tue, Dec 05, 2023 at 10:37:42AM -0400, Jason Gunthorpe wrote:
> > > On Mon, Dec 04, 2023 at 08:28:23PM -0800, Nicolin Chen wrote:
> > > 
> > > > > +static int arm_smmu_attach_dev_ste(struct device *dev,
> > > > > +				   struct arm_smmu_ste *ste)
> > > > > +{
> > > > > +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > > > > +
> > > > > +	if (arm_smmu_master_sva_enabled(master))
> > > > > +		return -EBUSY;
> > > > > +
> > > > > +	/*
> > > > > +	 * Do not allow any ASID to be changed while are working on the STE,
> > > > > +	 * otherwise we could miss invalidations.
> > > > > +	 */
> > > > > +	mutex_lock(&arm_smmu_asid_lock);
> > > > > +
> > > > > +	/*
> > > > > +	 * The SMMU does not support enabling ATS with bypass/abort. When the
> > > > > +	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> > > > > +	 * and Translated transactions are denied as though ATS is disabled for
> > > > > +	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> > > > > +	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> > > > > +	 */
> > > > > +	arm_smmu_detach_dev(master);
> > > > > +
> > > > > +	arm_smmu_install_ste_for_dev(master, ste);
> > > > > +	mutex_unlock(&arm_smmu_asid_lock);
> > > > > +
> > > > > +	/*
> > > > > +	 * This has to be done after removing the master from the
> > > > > +	 * arm_smmu_domain->devices to avoid races updating the same context
> > > > > +	 * descriptor from arm_smmu_share_asid().
> > > > > +	 */
> > > > > +	if (master->cd_table.cdtab)
> > > > > +		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
> > > >  
> > > > This arm_smmu_write_ctx_desc was previously within the asid lock
> > > > protection, yet now it's moved out of that?
> > > 
> > > Yes, arm_smmu_write_ctx_desc() updates a CD table entry and that does
> > > not need ASID lock protection. The ASID lock exists because of the BTM
> > > code rewriting STEs asynchronously.
> > 
> > I see. Thanks for elaborating. For this patch
> 
> Actually wait, that explanation is not right..
> 
> The BTM code is changing the ASID which is done with a CD update
> 
> The ordering is OK here because the BTM code iterates over the
> &smmu_domain->devices list.
> 
> The arm_smmu_detach_dev() has removed the master from the devices list
> under a lock so the BTM code won't see this.
> 
> Thus there is no race between the arm_smmu_share_asid() flow and this
> code, indeed we've already removed the cdtable from the STE at this
> point.

I see. Maybe it is worth mentioning this in the comments above or in
the commit message?

Also, arm_smmu_attach_dev_ste() seems to be used by the IDENTITY and
BLOCKED domains only. That effectively makes it a teardown path for a
translating domain, while the name sounds very generic. I cannot think
of a better name, yet maybe we could highlight this with a line of
comments above the function?

Nicolin

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain
  2023-12-05 18:21             ` Nicolin Chen
@ 2023-12-05 19:03               ` Jason Gunthorpe
  -1 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2023-12-05 19:03 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Michael Shavit, Shameerali Kolothum Thodi

On Tue, Dec 05, 2023 at 10:21:18AM -0800, Nicolin Chen wrote:
> On Tue, Dec 05, 2023 at 01:42:19PM -0400, Jason Gunthorpe wrote:
> > On Tue, Dec 05, 2023 at 09:25:59AM -0800, Nicolin Chen wrote:
> > > On Tue, Dec 05, 2023 at 10:37:42AM -0400, Jason Gunthorpe wrote:
> > > > On Mon, Dec 04, 2023 at 08:28:23PM -0800, Nicolin Chen wrote:
> > > > 
> > > > > > +static int arm_smmu_attach_dev_ste(struct device *dev,
> > > > > > +				   struct arm_smmu_ste *ste)
> > > > > > +{
> > > > > > +	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
> > > > > > +
> > > > > > +	if (arm_smmu_master_sva_enabled(master))
> > > > > > +		return -EBUSY;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * Do not allow any ASID to be changed while are working on the STE,
> > > > > > +	 * otherwise we could miss invalidations.
> > > > > > +	 */
> > > > > > +	mutex_lock(&arm_smmu_asid_lock);
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * The SMMU does not support enabling ATS with bypass/abort. When the
> > > > > > +	 * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests
> > > > > > +	 * and Translated transactions are denied as though ATS is disabled for
> > > > > > +	 * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and
> > > > > > +	 * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry).
> > > > > > +	 */
> > > > > > +	arm_smmu_detach_dev(master);
> > > > > > +
> > > > > > +	arm_smmu_install_ste_for_dev(master, ste);
> > > > > > +	mutex_unlock(&arm_smmu_asid_lock);
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * This has to be done after removing the master from the
> > > > > > +	 * arm_smmu_domain->devices to avoid races updating the same context
> > > > > > +	 * descriptor from arm_smmu_share_asid().
> > > > > > +	 */
> > > > > > +	if (master->cd_table.cdtab)
> > > > > > +		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
> > > > >  
> > > > > This arm_smmu_write_ctx_desc was previously within the asid lock
> > > > > protection, yet now it's moved out of that?
> > > > 
> > > > Yes, arm_smmu_write_ctx_desc() updates a CD table entry and that does
> > > > not need ASID lock protection. The ASID lock exists because of the BTM
> > > > code rewriting STEs asynchronously.
> > > 
> > > I see. Thanks for elaborating. For this patch
> > 
> > Actually wait, that explanation is not right..
> > 
> > The BTM code is changing the ASID which is done with a CD update
> > 
> > The ordering is OK here because the BTM code iterates over the
> > &smmu_domain->devices list.
> > 
> > The arm_smmu_detach_dev() has removed the master from the devices list
> > under a lock so the BTM code won't see this.
> > 
> > Thus there is no race between the arm_smmu_share_asid() flow and this
> > code, indeed we've already removed the cdtable from the STE at this
> > point.
> 
> I see. Maybe worth mentioning this in the comments above or commit
> message?

It does have a comment:

 +	/*
 +	 * This has to be done after removing the master from the
 +	 * arm_smmu_domain->devices to avoid races updating the same context
 +	 * descriptor from arm_smmu_share_asid().
 +	 */
 
> Also, the arm_smmu_attach_dev_ste seems to be used by IDENTITY and
> BLOCK domains only. This turns its nature to be a cleanup function
> against a translate domain, while the naming sounds very generic.
> I cannot think of any better name than this one, yet maybe we can
> highlight this with a line of comments above the function?

I would not call it a cleanup; it installs a domain-less configuration
via a raw STE.
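
Concretely, something along these lines, sketched from the names in the
quoted patches rather than copied from them:

	static int arm_smmu_attach_dev_identity_sketch(struct iommu_domain *domain,
						       struct device *dev)
	{
		struct arm_smmu_ste ste;

		/* Build a bypass STE and install it via the raw-STE path;
		 * the BLOCKED domain does the same with an abort STE. */
		arm_smmu_make_bypass_ste(&ste);
		return arm_smmu_attach_dev_ste(dev, &ste);
	}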

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

end of thread, other threads:[~2023-12-05 19:04 UTC | newest]

Thread overview: 158+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-13 17:53 [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Jason Gunthorpe
2023-11-13 17:53 ` [PATCH v2 01/19] iommu/arm-smmu-v3: Add a type for the STE Jason Gunthorpe
2023-11-14 15:06   ` Moritz Fischer
2023-11-15 11:52     ` Michael Shavit
2023-11-15 13:35       ` Jason Gunthorpe
2023-11-27 16:03   ` Eric Auger
2023-11-27 17:42     ` Jason Gunthorpe
2023-11-27 17:51       ` Eric Auger
2023-11-27 18:21         ` Jason Gunthorpe
2023-12-05  0:44   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 02/19] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() Jason Gunthorpe
2023-11-14 15:17   ` Moritz Fischer
2023-11-15 11:55     ` Michael Shavit
2023-11-27 15:41   ` Eric Auger
2023-12-05  0:45   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 03/19] iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED Jason Gunthorpe
2023-11-14 15:18   ` Moritz Fischer
2023-11-27 16:35   ` Eric Auger
2023-12-05  0:46   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers Jason Gunthorpe
2023-12-05  1:38   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 05/19] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass Jason Gunthorpe
2023-11-15 12:17   ` Michael Shavit
2023-12-05  1:43   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 06/19] iommu/arm-smmu-v3: Move arm_smmu_rmr_install_bypass_ste() Jason Gunthorpe
2023-11-15 13:57   ` Michael Shavit
2023-12-05  1:45   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 07/19] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions Jason Gunthorpe
2023-11-14 15:24   ` Moritz Fischer
2023-11-15 14:01   ` Michael Shavit
2023-12-05  1:55   ` Nicolin Chen
2023-12-05 14:35     ` Jason Gunthorpe
2023-11-13 17:53 ` [PATCH v2 08/19] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste() Jason Gunthorpe
2023-11-15 14:04   ` Michael Shavit
2023-12-05  1:58   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 09/19] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev Jason Gunthorpe
2023-11-15 14:12   ` Michael Shavit
2023-12-05  2:16   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 10/19] iommu/arm-smmu-v3: Compute the STE only once for each master Jason Gunthorpe
2023-11-15 14:16   ` Michael Shavit
2023-12-05  2:13   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 11/19] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev() Jason Gunthorpe
2023-11-15 15:15   ` Michael Shavit
2023-11-16 16:28     ` Jason Gunthorpe
2023-12-05  2:46   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order Jason Gunthorpe
2023-11-15 15:32   ` Michael Shavit
2023-11-16 16:46     ` Jason Gunthorpe
2023-11-17  4:14       ` Michael Shavit
2023-12-05  3:38   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 13/19] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats() Jason Gunthorpe
2023-12-05  3:56   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 14/19] iommu/arm-smmu-v3: Remove arm_smmu_master->domain Jason Gunthorpe
2023-11-27 17:14   ` Eric Auger
2023-11-30 12:03     ` Jason Gunthorpe
2023-12-05 13:25       ` Eric Auger
2023-12-05  4:47   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 15/19] iommu/arm-smmu-v3: Add a global static IDENTITY domain Jason Gunthorpe
2023-11-15 15:50   ` Michael Shavit
2023-12-05  4:28   ` Nicolin Chen
2023-12-05 14:37     ` Jason Gunthorpe
2023-12-05 17:25       ` Nicolin Chen
2023-12-05 17:42         ` Jason Gunthorpe
2023-12-05 18:21           ` Nicolin Chen
2023-12-05 19:03             ` Jason Gunthorpe
2023-11-13 17:53 ` [PATCH v2 16/19] iommu/arm-smmu-v3: Add a global static BLOCKED domain Jason Gunthorpe
2023-11-15 15:57   ` Michael Shavit
2023-11-16 15:44     ` Jason Gunthorpe
2023-12-05  4:05   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 17/19] iommu/arm-smmu-v3: Use the identity/blocked domain during release Jason Gunthorpe
2023-12-05  4:07   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 18/19] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize Jason Gunthorpe
2023-11-15 16:02   ` Michael Shavit
2023-12-05  4:42   ` Nicolin Chen
2023-11-13 17:53 ` [PATCH v2 19/19] iommu/arm-smmu-v3: Convert to domain_alloc_paging() Jason Gunthorpe
2023-12-05  4:40   ` Nicolin Chen
2023-11-27 16:10 ` [PATCH v2 00/19] Update SMMUv3 to the modern iommu API (part 1/3) Shameerali Kolothum Thodi
2023-11-27 17:48   ` Jason Gunthorpe
2023-12-05  3:54 ` Nicolin Chen
