iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv3 0/7] System Cache support for GPU and required SMMU support
@ 2020-06-29 15:52 Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 1/7] iommu/arm-smmu: Add a init_context_bank implementation hook Sai Prakash Ranjan
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

Some hardware variants contain a system cache or the last level
cache(llc). This cache is typically a large block which is shared
by multiple clients on the SOC. GPU uses the system cache to cache
both the GPU data buffers(like textures) as well the SMMU pagetables.
This helps with improved render performance as well as lower power
consumption by reducing the bus traffic to the system memory.

The system cache architecture allows the cache to be split into slices
which then be used by multiple SOC clients. This patch series is an
effort to enable and use two of those slices perallocated for the GPU,
one for the GPU data buffers and another for the GPU SMMU hardware
pagetables.

Patch 1 adds a init_context_bank implementation hook to set SCTLR.HUPCF.
Patch 2,3,6,7 adds system cache support in SMMU and GPU driver.
Patch 4 and 5 are minor cleanups for arm-smmu impl.

Changes in v3:
 * Fix domain attribute setting to before iommu_attach_device()
 * Fix few code style and checkpatch warnings
 * Rebase on top of Jordan's latest split pagetables and per-instance
   pagetables support [1][2]

Changes in v2:
 * Addressed review comments and rebased on top of Jordan's split
   pagetables series

[1] https://lore.kernel.org/patchwork/cover/1264446/
[2] https://lore.kernel.org/patchwork/cover/1264460/

Jordan Crouse (1):
  iommu/arm-smmu: Add a init_context_bank implementation hook

Sai Prakash Ranjan (4):
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu: Add domain attribute for system cache
  iommu: arm-smmu-impl: Remove unwanted extra blank lines
  iommu: arm-smmu-impl: Convert to use of_match_node() for qcom impl

Sharat Masetty (2):
  drm/msm: rearrange the gpu_rmw() function
  drm/msm/a6xx: Add support for using system cache(LLC)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 82 +++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  3 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 23 ++++++-
 drivers/gpu/drm/msm/msm_drv.c           |  8 +++
 drivers/gpu/drm/msm/msm_drv.h           |  1 +
 drivers/gpu/drm/msm/msm_gpu.h           |  5 +-
 drivers/gpu/drm/msm/msm_iommu.c         |  3 +
 drivers/gpu/drm/msm/msm_mmu.h           |  4 ++
 drivers/iommu/arm-smmu-impl.c           | 13 ++--
 drivers/iommu/arm-smmu-qcom.c           | 13 ++++
 drivers/iommu/arm-smmu.c                | 46 +++++++++-----
 drivers/iommu/arm-smmu.h                | 13 ++++
 drivers/iommu/io-pgtable-arm.c          |  7 ++-
 include/linux/io-pgtable.h              |  4 ++
 include/linux/iommu.h                   |  1 +
 15 files changed, 198 insertions(+), 28 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCHv3 1/7] iommu/arm-smmu: Add a init_context_bank implementation hook
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 2/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

From: Jordan Crouse <jcrouse@codeaurora.org>

Add a new implementation hook to allow the implementation specific code
to tweek the context bank configuration just before it gets written.
The first user will be the Adreno GPU implementation to turn on
SCTLR.HUPCF to ensure that a page fault doesn't terminating pending
transactions. Doing so could hang the GPU if one of the terminated
transactions is a CP read.

This depends on the arm-smmu adreno SMMU implementation [1].

[1] https://lore.kernel.org/patchwork/patch/1264452/

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/arm-smmu-qcom.c | 13 +++++++++++++
 drivers/iommu/arm-smmu.c      | 29 +++++++++++++----------------
 drivers/iommu/arm-smmu.h      | 12 ++++++++++++
 3 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c
index cb2acb6b19dd..6462fb00f493 100644
--- a/drivers/iommu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm-smmu-qcom.c
@@ -17,6 +17,18 @@ static bool qcom_adreno_smmu_is_gpu_device(struct arm_smmu_domain *smmu_domain)
 	return of_device_is_compatible(smmu_domain->dev->of_node, "qcom,adreno");
 }
 
+static void qcom_adreno_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
+		struct arm_smmu_cb *cb)
+{
+	/*
+	 * On the GPU device we want to process subsequent transactions after a
+	 * fault to keep the GPU from hanging
+	 */
+
+	if (qcom_adreno_smmu_is_gpu_device(smmu_domain))
+		cb->sctlr |= ARM_SMMU_SCTLR_HUPCF;
+}
+
 static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
 		struct io_pgtable_cfg *pgtbl_cfg)
 {
@@ -92,6 +104,7 @@ static const struct arm_smmu_impl qcom_adreno_smmu_impl = {
 	.init_context = qcom_adreno_smmu_init_context,
 	.def_domain_type = qcom_smmu_def_domain_type,
 	.reset = qcom_smmu500_reset,
+	.init_context_bank = qcom_adreno_smmu_init_context_bank,
 };
 
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4bd247dfd703..b2564f93d685 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -86,14 +86,6 @@ struct arm_smmu_smr {
 	bool				valid;
 };
 
-struct arm_smmu_cb {
-	u64				ttbr[2];
-	u32				tcr[2];
-	u32				mair[2];
-	struct arm_smmu_cfg		*cfg;
-	atomic_t			aux;
-};
-
 struct arm_smmu_master_cfg {
 	struct arm_smmu_device		*smmu;
 	s16				smendx[];
@@ -580,6 +572,18 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 			cb->mair[1] = pgtbl_cfg->arm_lpae_s1_cfg.mair >> 32;
 		}
 	}
+
+	cb->sctlr = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE | ARM_SMMU_SCTLR_AFE |
+		ARM_SMMU_SCTLR_TRE | ARM_SMMU_SCTLR_M;
+
+	if (stage1)
+		cb->sctlr |= ARM_SMMU_SCTLR_S1_ASIDPNE;
+	if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
+		cb->sctlr |= ARM_SMMU_SCTLR_E;
+
+	/* Give the implementation a chance to adjust the configuration */
+	if (smmu_domain->smmu->impl && smmu_domain->smmu->impl->init_context_bank)
+		smmu_domain->smmu->impl->init_context_bank(smmu_domain, cb);
 }
 
 static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
@@ -658,14 +662,7 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	}
 
 	/* SCTLR */
-	reg = ARM_SMMU_SCTLR_CFIE | ARM_SMMU_SCTLR_CFRE | ARM_SMMU_SCTLR_AFE |
-	      ARM_SMMU_SCTLR_TRE | ARM_SMMU_SCTLR_M;
-	if (stage1)
-		reg |= ARM_SMMU_SCTLR_S1_ASIDPNE;
-	if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
-		reg |= ARM_SMMU_SCTLR_E;
-
-	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
+	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, cb->sctlr);
 }
 
 /*
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 79d441024043..4a335ef3d97a 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -142,6 +142,7 @@ enum arm_smmu_cbar_type {
 
 #define ARM_SMMU_CB_SCTLR		0x0
 #define ARM_SMMU_SCTLR_S1_ASIDPNE	BIT(12)
+#define ARM_SMMU_SCTLR_HUPCF		BIT(8)
 #define ARM_SMMU_SCTLR_CFCFG		BIT(7)
 #define ARM_SMMU_SCTLR_CFIE		BIT(6)
 #define ARM_SMMU_SCTLR_CFRE		BIT(5)
@@ -349,6 +350,15 @@ struct arm_smmu_domain {
 	bool				aux;
 };
 
+struct arm_smmu_cb {
+	u64			ttbr[2];
+	u32			tcr[2];
+	u32			mair[2];
+	u32			sctlr;
+	struct arm_smmu_cfg	*cfg;
+	atomic_t		aux;
+};
+
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
 {
 	u32 tcr = FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
@@ -403,6 +413,8 @@ struct arm_smmu_impl {
 	void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
 			 int status);
 	int (*def_domain_type)(struct device *dev);
+	void (*init_context_bank)(struct arm_smmu_domain *smmu_domain,
+			struct arm_smmu_cb *cb);
 };
 
 static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 2/7] iommu/io-pgtable-arm: Add support to use system cache
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 1/7] iommu/arm-smmu: Add a init_context_bank implementation hook Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 3/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
attributes set in TCR for the page table walker when
using system cache.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/io-pgtable-arm.c | 7 ++++++-
 include/linux/io-pgtable.h     | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 04fbd4bf0ff9..0a6cb82fd98a 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -792,7 +792,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
 			    IO_PGTABLE_QUIRK_NON_STRICT |
-			    IO_PGTABLE_QUIRK_ARM_TTBR1))
+			    IO_PGTABLE_QUIRK_ARM_TTBR1 |
+			    IO_PGTABLE_QUIRK_SYS_CACHE))
 		return NULL;
 
 	data = arm_lpae_alloc_pgtable(cfg);
@@ -804,6 +805,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 		tcr->sh = ARM_LPAE_TCR_SH_IS;
 		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
 		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
+	} else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
+		tcr->sh = ARM_LPAE_TCR_SH_OS;
+		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
+		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
 	} else {
 		tcr->sh = ARM_LPAE_TCR_SH_OS;
 		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index bbed1d3925ba..23114b3fe2a5 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -86,6 +86,9 @@ struct io_pgtable_cfg {
 	 *
 	 * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
 	 *	for use in the upper half of a split address space.
+	 *
+	 * IO_PGTABLE_QUIRK_SYS_CACHE: Override the attributes set in TCR for
+	 *	the page table walker when using system cache.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
@@ -93,6 +96,7 @@ struct io_pgtable_cfg {
 	#define IO_PGTABLE_QUIRK_ARM_MTK_EXT	BIT(3)
 	#define IO_PGTABLE_QUIRK_NON_STRICT	BIT(4)
 	#define IO_PGTABLE_QUIRK_ARM_TTBR1	BIT(5)
+	#define IO_PGTABLE_QUIRK_SYS_CACHE	BIT(6)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 3/7] iommu/arm-smmu: Add domain attribute for system cache
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 1/7] iommu/arm-smmu: Add a init_context_bank implementation hook Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 2/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 4/7] iommu: arm-smmu-impl: Remove unwanted extra blank lines Sai Prakash Ranjan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

Add iommu domain attribute for using system cache aka last level
cache by client drivers like GPU to set right attributes for caching
the hardware pagetables into the system cache.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 17 +++++++++++++++++
 drivers/iommu/arm-smmu.h |  1 +
 include/linux/iommu.h    |  1 +
 3 files changed, 19 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b2564f93d685..71b6f7038423 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -897,6 +897,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 			goto out_unlock;
 	}
 
+	if (smmu_domain->sys_cache)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
+
 	if (smmu_domain->non_strict)
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
@@ -1732,6 +1735,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
 			*(int *)data = smmu_domain->non_strict;
 			return 0;
+		case DOMAIN_ATTR_SYS_CACHE:
+			*((int *)data) = smmu_domain->sys_cache;
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1763,6 +1769,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			else
 				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 			break;
+		case DOMAIN_ATTR_SYS_CACHE:
+			if (smmu_domain->smmu) {
+				ret = -EPERM;
+				goto out_unlock;
+			}
+
+			if (*((int *)data))
+				smmu_domain->sys_cache = true;
+			else
+				smmu_domain->sys_cache = false;
+			break;
 		default:
 			ret = -ENODEV;
 		}
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 4a335ef3d97a..bbae48bdc022 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -348,6 +348,7 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 	struct device			*dev;	/* Device attached to this domain */
 	bool				aux;
+	bool				sys_cache;
 };
 
 struct arm_smmu_cb {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2388117641f1..a48edbebe3cb 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -125,6 +125,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
 	DOMAIN_ATTR_PGTABLE_CFG,
+	DOMAIN_ATTR_SYS_CACHE,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 4/7] iommu: arm-smmu-impl: Remove unwanted extra blank lines
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (2 preceding siblings ...)
  2020-06-29 15:52 ` [PATCHv3 3/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 5/7] iommu: arm-smmu-impl: Convert to use of_match_node() for qcom impl Sai Prakash Ranjan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

There are few places in arm-smmu-impl where there are
extra blank lines, remove them and while at it fix the
checkpatch warning for space required before the open
parenthesis.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/arm-smmu-impl.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 309675cf6699..8fbab9c38b61 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -9,10 +9,9 @@
 
 #include "arm-smmu.h"
 
-
 static int arm_smmu_gr0_ns(int offset)
 {
-	switch(offset) {
+	switch (offset) {
 	case ARM_SMMU_GR0_sCR0:
 	case ARM_SMMU_GR0_sACR:
 	case ARM_SMMU_GR0_sGFSR:
@@ -47,7 +46,6 @@ static const struct arm_smmu_impl calxeda_impl = {
 	.write_reg = arm_smmu_write_ns,
 };
 
-
 struct cavium_smmu {
 	struct arm_smmu_device smmu;
 	u32 id_base;
@@ -103,7 +101,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct arm_smmu_device *smm
 	return &cs->smmu;
 }
 
-
 #define ARM_MMU500_ACTLR_CPRE		(1 << 1)
 
 #define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
@@ -148,7 +145,6 @@ static const struct arm_smmu_impl arm_mmu500_impl = {
 	.reset = arm_mmu500_reset,
 };
 
-
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 {
 	const struct device_node *np = smmu->dev->of_node;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 5/7] iommu: arm-smmu-impl: Convert to use of_match_node() for qcom impl
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (3 preceding siblings ...)
  2020-06-29 15:52 ` [PATCHv3 4/7] iommu: arm-smmu-impl: Remove unwanted extra blank lines Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 6/7] drm/msm: rearrange the gpu_rmw() function Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
  6 siblings, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

Use of_match_node() to match qcom implementation instead
of multiple of_device_compatible() calls for each qcom
implementation.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/arm-smmu-impl.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 8fbab9c38b61..42020d50ce12 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -9,6 +9,12 @@
 
 #include "arm-smmu.h"
 
+static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
+	{ .compatible = "qcom,sc7180-smmu-500" },
+	{ .compatible = "qcom,sdm845-smmu-500" },
+	{ }
+};
+
 static int arm_smmu_gr0_ns(int offset)
 {
 	switch (offset) {
@@ -168,8 +174,7 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 	if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
 		smmu->impl = &calxeda_impl;
 
-	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
+	if (of_match_node(qcom_smmu_impl_of_match, np))
 		return qcom_smmu_impl_init(smmu);
 
 	if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 6/7] drm/msm: rearrange the gpu_rmw() function
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (4 preceding siblings ...)
  2020-06-29 15:52 ` [PATCHv3 5/7] iommu: arm-smmu-impl: Convert to use of_match_node() for qcom impl Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-06-29 15:52 ` [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
  6 siblings, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

From: Sharat Masetty <smasetty@codeaurora.org>

The register read-modify-write construct is generic enough
that it can be used by other subsystems as needed, create
a more generic rmw() function and have the gpu_rmw() use
this new function.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_drv.c | 8 ++++++++
 drivers/gpu/drm/msm/msm_drv.h | 1 +
 drivers/gpu/drm/msm/msm_gpu.h | 5 +----
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 0c219b954943..5aa070929220 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -166,6 +166,14 @@ u32 msm_readl(const void __iomem *addr)
 	return val;
 }
 
+void msm_rmw(void __iomem *addr, u32 mask, u32 or)
+{
+	u32 val = msm_readl(addr);
+
+	val &= ~mask;
+	msm_writel(val | or, addr);
+}
+
 struct msm_vblank_work {
 	struct work_struct work;
 	int crtc_id;
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 983a8b7e5a74..5bb02ccb863a 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -417,6 +417,7 @@ void __iomem *msm_ioremap(struct platform_device *pdev, const char *name,
 		const char *dbgname);
 void msm_writel(u32 data, void __iomem *addr);
 u32 msm_readl(const void __iomem *addr);
+void msm_rmw(void __iomem *addr, u32 mask, u32 or);
 
 struct msm_gpu_submitqueue;
 int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index f1762b77bea8..3519777c8cb2 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -232,10 +232,7 @@ static inline u32 gpu_read(struct msm_gpu *gpu, u32 reg)
 
 static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
 {
-	uint32_t val = gpu_read(gpu, reg);
-
-	val &= ~mask;
-	gpu_write(gpu, reg, val | or);
+	msm_rmw(gpu->mmio + (reg << 2), mask, or);
 }
 
 static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (5 preceding siblings ...)
  2020-06-29 15:52 ` [PATCHv3 6/7] drm/msm: rearrange the gpu_rmw() function Sai Prakash Ranjan
@ 2020-06-29 15:52 ` Sai Prakash Ranjan
  2020-07-03 13:37   ` Will Deacon
  6 siblings, 1 reply; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-06-29 15:52 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: freedreno, David Airlie, linux-arm-msm, Sharat Masetty,
	linux-kernel, dri-devel, Akhil P Oommen, iommu,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	Stephen Boyd, Sean Paul, linux-arm-kernel, Emil Velikov

From: Sharat Masetty <smasetty@codeaurora.org>

The last level system cache can be partitioned to 32 different
slices of which GPU has two slices preallocated. One slice is
used for caching GPU buffers and the other slice is used for
caching the GPU SMMU pagetables. This talks to the core system
cache driver to acquire the slice handles, configure the SCID's
to those slices and activates and deactivates the slices upon
GPU power collapse and restore.

Some support from the IOMMU driver is also needed to make use
of the system cache. IOMMU_SYS_CACHE_ONLY is a buffer protection
flag which enables caching GPU data buffers in the system cache
with memory attributes such as outer cacheable, read-allocate,
write-allocate for buffers. The GPU then has the ability to
override a few cacheability parameters which it does to override
write-allocate to write-no-allocate as the GPU hardware does not
benefit much from it.

Similarly DOMAIN_ATTR_SYS_CACHE is another domain level attribute
used by the IOMMU driver to set the right attributes to cache the
hardware pagetables into the system cache.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
(sai: fix to set attr before device attach to IOMMU and rebase)
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 82 +++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  3 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 23 ++++++-
 drivers/gpu/drm/msm/msm_iommu.c         |  3 +
 drivers/gpu/drm/msm/msm_mmu.h           |  4 ++
 5 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 6bee70853ea8..c33cd2a588e6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -9,6 +9,8 @@
 #include "a6xx_gmu.xml.h"
 
 #include <linux/devfreq.h>
+#include <linux/bitfield.h>
+#include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
 
@@ -808,6 +810,79 @@ static const u32 a6xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A6XX_CP_RB_CNTL),
 };
 
+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or)
+{
+	return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
+}
+
+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)
+{
+	return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
+}
+
+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
+{
+	llcc_slice_deactivate(a6xx_gpu->llc_slice);
+	llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
+{
+	u32 cntl1_regval = 0;
+
+	if (IS_ERR(a6xx_gpu->llc_mmio))
+		return;
+
+	if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
+		u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
+
+		gpu_scid &= 0x1f;
+		cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) |
+			       (gpu_scid << 15) | (gpu_scid << 20);
+	}
+
+	if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
+		u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
+
+		gpuhtw_scid &= 0x1f;
+		cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
+	}
+
+	if (cntl1_regval) {
+		/*
+		 * Program the slice IDs for the various GPU blocks and GPU MMU
+		 * pagetables
+		 */
+		a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval);
+
+		/*
+		 * Program cacheability overrides to not allocate cache lines on
+		 * a write miss
+		 */
+		a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03);
+	}
+}
+
+static void a6xx_llc_slices_destroy(struct a6xx_gpu *a6xx_gpu)
+{
+	llcc_slice_putd(a6xx_gpu->llc_slice);
+	llcc_slice_putd(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_slices_init(struct platform_device *pdev,
+		struct a6xx_gpu *a6xx_gpu)
+{
+	a6xx_gpu->llc_mmio = msm_ioremap(pdev, "cx_mem", "gpu_cx");
+	if (IS_ERR(a6xx_gpu->llc_mmio))
+		return;
+
+	a6xx_gpu->llc_slice = llcc_slice_getd(LLCC_GPU);
+	a6xx_gpu->htw_llc_slice = llcc_slice_getd(LLCC_GPUHTW);
+
+	if (IS_ERR(a6xx_gpu->llc_slice) && IS_ERR(a6xx_gpu->htw_llc_slice))
+		a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
+}
+
 static int a6xx_pm_resume(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -822,6 +897,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
 
 	msm_gpu_resume_devfreq(gpu);
 
+	a6xx_llc_activate(a6xx_gpu);
+
 	return 0;
 }
 
@@ -830,6 +907,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 
+	a6xx_llc_deactivate(a6xx_gpu);
+
 	devfreq_suspend_device(gpu->devfreq.devfreq);
 
 	return a6xx_gmu_stop(a6xx_gpu);
@@ -868,6 +947,7 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 		drm_gem_object_put_unlocked(a6xx_gpu->sqe_bo);
 	}
 
+	a6xx_llc_slices_destroy(a6xx_gpu);
 	a6xx_gmu_remove(a6xx_gpu);
 
 	adreno_gpu_cleanup(adreno_gpu);
@@ -962,6 +1042,8 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 	adreno_gpu->registers = NULL;
 	adreno_gpu->reg_offsets = a6xx_register_offsets;
 
+	a6xx_llc_slices_init(pdev, a6xx_gpu);
+
 	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
 	if (ret) {
 		a6xx_destroy(&(a6xx_gpu->base.base));
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 7239b8b60939..90043448fab1 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -21,6 +21,9 @@ struct a6xx_gpu {
 	struct msm_ringbuffer *cur_ring;
 
 	struct a6xx_gmu gmu;
+	void __iomem *llc_mmio;
+	void *llc_slice;
+	void *htw_llc_slice;
 };
 
 #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 3e717c1ebb7f..4666d2df8e65 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -190,10 +190,31 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
 		struct platform_device *pdev)
 {
 	struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
-	struct msm_mmu *mmu = msm_iommu_new(&pdev->dev, iommu);
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	struct msm_gem_address_space *aspace;
+	struct msm_mmu *mmu;
 	u64 start, size;
 
+	/*
+	 * This allows GPU to set the bus attributes required to use system
+	 * cache on behalf of the iommu page table walker.
+	 */
+	if (!IS_ERR(a6xx_gpu->htw_llc_slice)) {
+		int gpu_htw_llc = 1;
+
+		iommu_domain_set_attr(iommu, DOMAIN_ATTR_SYS_CACHE, &gpu_htw_llc);
+	}
+
+	mmu = msm_iommu_new(&pdev->dev, iommu);
+	if (IS_ERR(mmu)) {
+		iommu_domain_free(iommu);
+		return ERR_CAST(mmu);
+	}
+
+	if (!IS_ERR(a6xx_gpu->llc_slice))
+		mmu->features |= MMU_FEATURE_USE_SYSTEM_CACHE;
+
 	/*
 	 * Use the aperture start or SZ_16M, whichever is greater. This will
 	 * ensure that we align with the allocated pagetable range while still
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index f455c597f76d..bd1d58229cc2 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
 		iova |= GENMASK_ULL(63, 49);
 
 
+	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
+		prot |= IOMMU_SYS_CACHE_ONLY;
+
 	ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
 	WARN_ON(!ret);
 
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index 61ade89d9e48..90965241e567 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -23,12 +23,16 @@ enum msm_mmu_type {
 	MSM_MMU_IOMMU_PAGETABLE,
 };
 
+/* MMU features */
+#define MMU_FEATURE_USE_SYSTEM_CACHE	BIT(0)
+
 struct msm_mmu {
 	const struct msm_mmu_funcs *funcs;
 	struct device *dev;
 	int (*handler)(void *arg, unsigned long iova, int flags);
 	void *arg;
 	enum msm_mmu_type type;
+	u32 features;
 };
 
 static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-06-29 15:52 ` [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
@ 2020-07-03 13:37   ` Will Deacon
  2020-07-03 14:53     ` Sai Prakash Ranjan
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2020-07-03 13:37 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Sean Paul, freedreno, David Airlie, linux-arm-msm, iommu,
	linux-kernel, Matthias Kaehlcke, Akhil P Oommen, dri-devel,
	Daniel Vetter, Kristian H . Kristensen, Stephen Boyd,
	Robin Murphy, Sharat Masetty, linux-arm-kernel, Emil Velikov

On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index f455c597f76d..bd1d58229cc2 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
>  		iova |= GENMASK_ULL(63, 49);
>  
>  
> +	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> +		prot |= IOMMU_SYS_CACHE_ONLY;

Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
looks like it should actually be a property on the domain because we never
need to configure it on a per-mapping basis within a domain, and therefore
it shouldn't be exposed by the IOMMU API as a prot flag.

Do you agree?

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-07-03 13:37   ` Will Deacon
@ 2020-07-03 14:53     ` Sai Prakash Ranjan
  2020-07-03 15:40       ` Will Deacon
  2020-07-03 16:04       ` Rob Clark
  0 siblings, 2 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-07-03 14:53 UTC (permalink / raw)
  To: Will Deacon
  Cc: Sean Paul, freedreno, David Airlie, linux-arm-msm, iommu,
	linux-kernel, Matthias Kaehlcke, Akhil P Oommen, dri-devel,
	Daniel Vetter, Kristian H . Kristensen, Stephen Boyd,
	Robin Murphy, Sharat Masetty, linux-arm-kernel, Emil Velikov

Hi Will,

On 2020-07-03 19:07, Will Deacon wrote:
> On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
>> b/drivers/gpu/drm/msm/msm_iommu.c
>> index f455c597f76d..bd1d58229cc2 100644
>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, 
>> uint64_t iova,
>>  		iova |= GENMASK_ULL(63, 49);
>> 
>> 
>> +	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
>> +		prot |= IOMMU_SYS_CACHE_ONLY;
> 
> Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then 
> it
> looks like it should actually be a property on the domain because we 
> never
> need to configure it on a per-mapping basis within a domain, and 
> therefore
> it shouldn't be exposed by the IOMMU API as a prot flag.
> 
> Do you agree?
> 

GPU being the only user is for now, but there are other clients which 
can use this.
Plus how do we set the memory attributes if we do not expose this as 
prot flag?

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-07-03 14:53     ` Sai Prakash Ranjan
@ 2020-07-03 15:40       ` Will Deacon
  2020-07-03 16:04       ` Rob Clark
  1 sibling, 0 replies; 14+ messages in thread
From: Will Deacon @ 2020-07-03 15:40 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Sean Paul, freedreno, David Airlie, linux-arm-msm, iommu,
	linux-kernel, Matthias Kaehlcke, Akhil P Oommen, dri-devel,
	Daniel Vetter, Kristian H . Kristensen, Stephen Boyd,
	Robin Murphy, Sharat Masetty, linux-arm-kernel, Emil Velikov

On Fri, Jul 03, 2020 at 08:23:07PM +0530, Sai Prakash Ranjan wrote:
> On 2020-07-03 19:07, Will Deacon wrote:
> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> > > diff --git a/drivers/gpu/drm/msm/msm_iommu.c
> > > b/drivers/gpu/drm/msm/msm_iommu.c
> > > index f455c597f76d..bd1d58229cc2 100644
> > > --- a/drivers/gpu/drm/msm/msm_iommu.c
> > > +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > > @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu,
> > > uint64_t iova,
> > >  		iova |= GENMASK_ULL(63, 49);
> > > 
> > > 
> > > +	if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> > > +		prot |= IOMMU_SYS_CACHE_ONLY;
> > 
> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then
> > it
> > looks like it should actually be a property on the domain because we
> > never
> > need to configure it on a per-mapping basis within a domain, and
> > therefore
> > it shouldn't be exposed by the IOMMU API as a prot flag.
> > 
> > Do you agree?
> > 
> 
> GPU being the only user is for now, but there are other clients which can
> use this.
> Plus how do we set the memory attributes if we do not expose this as prot
> flag?

I just don't understand the need for it to be per-map operation. Put another
way, if we extended the domain attribute to apply to cacheable mappings
on the domain and not just the table walk, what would break?

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-07-03 14:53     ` Sai Prakash Ranjan
  2020-07-03 15:40       ` Will Deacon
@ 2020-07-03 16:04       ` Rob Clark
  2020-07-03 16:59         ` Sai Prakash Ranjan
  2020-07-09 16:13         ` Jordan Crouse
  1 sibling, 2 replies; 14+ messages in thread
From: Rob Clark @ 2020-07-03 16:04 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Sean Paul, freedreno, David Airlie, Will Deacon,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	dri-devel, Linux Kernel Mailing List, Matthias Kaehlcke,
	Akhil P Oommen, Daniel Vetter, linux-arm-msm,
	Kristian H . Kristensen, Stephen Boyd, Robin Murphy,
	Sharat Masetty,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Emil Velikov

On Fri, Jul 3, 2020 at 7:53 AM Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
>
> Hi Will,
>
> On 2020-07-03 19:07, Will Deacon wrote:
> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c
> >> b/drivers/gpu/drm/msm/msm_iommu.c
> >> index f455c597f76d..bd1d58229cc2 100644
> >> --- a/drivers/gpu/drm/msm/msm_iommu.c
> >> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> >> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu,
> >> uint64_t iova,
> >>              iova |= GENMASK_ULL(63, 49);
> >>
> >>
> >> +    if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> >> +            prot |= IOMMU_SYS_CACHE_ONLY;
> >
> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then
> > it
> > looks like it should actually be a property on the domain because we
> > never
> > need to configure it on a per-mapping basis within a domain, and
> > therefore
> > it shouldn't be exposed by the IOMMU API as a prot flag.
> >
> > Do you agree?
> >
>
> GPU being the only user is for now, but there are other clients which
> can use this.
> Plus how do we set the memory attributes if we do not expose this as
> prot flag?

It does appear that the downstream kgsl driver sets this for basically
all mappings.. well there is some conditional stuff around
DOMAIN_ATTR_USE_LLC_NWA but it seems based on the property of the
domain.  (Jordan may know more about what that is about.)  But looks
like there are a lot of different paths into iommu_map in kgsl so I
might have missed something.

Assuming there isn't some case where we specifically don't want to use
the system cache for some mapping, I think it could be a domain
attribute that sets an io_pgtable_cfg::quirks flag

BR,
-R
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-07-03 16:04       ` Rob Clark
@ 2020-07-03 16:59         ` Sai Prakash Ranjan
  2020-07-09 16:13         ` Jordan Crouse
  1 sibling, 0 replies; 14+ messages in thread
From: Sai Prakash Ranjan @ 2020-07-03 16:59 UTC (permalink / raw)
  To: Rob Clark
  Cc: Will Deacon, David Airlie, Sean Paul, Linux Kernel Mailing List,
	dri-devel, Akhil P Oommen, Emil Velikov,
	list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,
	, Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	linux-arm-msm, Stephen Boyd, freedreno, Sharat Masetty,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Robin Murphy

On 2020-07-03 21:34, Rob Clark wrote:
> On Fri, Jul 3, 2020 at 7:53 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
>> 
>> Hi Will,
>> 
>> On 2020-07-03 19:07, Will Deacon wrote:
>> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
>> >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c
>> >> b/drivers/gpu/drm/msm/msm_iommu.c
>> >> index f455c597f76d..bd1d58229cc2 100644
>> >> --- a/drivers/gpu/drm/msm/msm_iommu.c
>> >> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>> >> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu,
>> >> uint64_t iova,
>> >>              iova |= GENMASK_ULL(63, 49);
>> >>
>> >>
>> >> +    if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
>> >> +            prot |= IOMMU_SYS_CACHE_ONLY;
>> >
>> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then
>> > it
>> > looks like it should actually be a property on the domain because we
>> > never
>> > need to configure it on a per-mapping basis within a domain, and
>> > therefore
>> > it shouldn't be exposed by the IOMMU API as a prot flag.
>> >
>> > Do you agree?
>> >
>> 
>> GPU being the only user is for now, but there are other clients which
>> can use this.
>> Plus how do we set the memory attributes if we do not expose this as
>> prot flag?
> 
> It does appear that the downstream kgsl driver sets this for basically
> all mappings.. well there is some conditional stuff around
> DOMAIN_ATTR_USE_LLC_NWA but it seems based on the property of the
> domain.  (Jordan may know more about what that is about.)  But looks
> like there are a lot of different paths into iommu_map in kgsl so I
> might have missed something.
> 
> Assuming there isn't some case where we specifically don't want to use
> the system cache for some mapping, I think it could be a domain
> attribute that sets an io_pgtable_cfg::quirks flag
> 

Ok then we are good to remove unused sys cache prot flag which Will has 
posted.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-07-03 16:04       ` Rob Clark
  2020-07-03 16:59         ` Sai Prakash Ranjan
@ 2020-07-09 16:13         ` Jordan Crouse
  1 sibling, 0 replies; 14+ messages in thread
From: Jordan Crouse @ 2020-07-09 16:13 UTC (permalink / raw)
  To: Rob Clark
  Cc: Sean Paul, freedreno, David Airlie, Will Deacon,
	Linux Kernel Mailing List, dri-devel, Akhil P Oommen,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Matthias Kaehlcke, Kristian H . Kristensen, Daniel Vetter,
	linux-arm-msm, Stephen Boyd, Robin Murphy, Sharat Masetty,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Emil Velikov

On Fri, Jul 03, 2020 at 09:04:49AM -0700, Rob Clark wrote:
> On Fri, Jul 3, 2020 at 7:53 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
> >
> > Hi Will,
> >
> > On 2020-07-03 19:07, Will Deacon wrote:
> > > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> > >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c
> > >> b/drivers/gpu/drm/msm/msm_iommu.c
> > >> index f455c597f76d..bd1d58229cc2 100644
> > >> --- a/drivers/gpu/drm/msm/msm_iommu.c
> > >> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > >> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu,
> > >> uint64_t iova,
> > >>              iova |= GENMASK_ULL(63, 49);
> > >>
> > >>
> > >> +    if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> > >> +            prot |= IOMMU_SYS_CACHE_ONLY;
> > >
> > > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then
> > > it
> > > looks like it should actually be a property on the domain because we
> > > never
> > > need to configure it on a per-mapping basis within a domain, and
> > > therefore
> > > it shouldn't be exposed by the IOMMU API as a prot flag.
> > >
> > > Do you agree?
> > >
> >
> > GPU being the only user is for now, but there are other clients which
> > can use this.
> > Plus how do we set the memory attributes if we do not expose this as
> > prot flag?
> 
> It does appear that the downstream kgsl driver sets this for basically
> all mappings.. well there is some conditional stuff around
> DOMAIN_ATTR_USE_LLC_NWA but it seems based on the property of the
> domain.  (Jordan may know more about what that is about.)  But looks
> like there are a lot of different paths into iommu_map in kgsl so I
> might have missed something.

Downstream does set it universally. There are some theoretical use cases
where it might be beneficial to set it on a per-mapping basis with a bunch
of hinting from userspace and nobody has tried to characterize this on real
hardware so it is not clear to me if it is worth it.

I think a domain wide attribute works for now but if a compelling per-mapping
use case does comes down the pipeline we need to have a backup in mind -
possibly a prot flag to disable NWA?

Jordan

> Assuming there isn't some case where we specifically don't want to use
> the system cache for some mapping, I think it could be a domain
> attribute that sets an io_pgtable_cfg::quirks flag
> 
> BR,
> -R

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-07-09 16:14 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-29 15:52 [PATCHv3 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 1/7] iommu/arm-smmu: Add a init_context_bank implementation hook Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 2/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 3/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 4/7] iommu: arm-smmu-impl: Remove unwanted extra blank lines Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 5/7] iommu: arm-smmu-impl: Convert to use of_match_node() for qcom impl Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 6/7] drm/msm: rearrange the gpu_rmw() function Sai Prakash Ranjan
2020-06-29 15:52 ` [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
2020-07-03 13:37   ` Will Deacon
2020-07-03 14:53     ` Sai Prakash Ranjan
2020-07-03 15:40       ` Will Deacon
2020-07-03 16:04       ` Rob Clark
2020-07-03 16:59         ` Sai Prakash Ranjan
2020-07-09 16:13         ` Jordan Crouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).