* [PATCHv7 0/7] System Cache support for GPU and required SMMU support
@ 2020-10-30  9:23 Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sai Prakash Ranjan

Some hardware variants contain a system cache, also called the last
level cache (LLC). This cache is typically a large block which is shared
by multiple clients on the SoC. The GPU uses the system cache to cache
both GPU data buffers (such as textures) and the SMMU pagetables.
This improves render performance and lowers power consumption by
reducing bus traffic to system memory.

The system cache architecture allows the cache to be split into slices
which can then be used by multiple SoC clients. This patch series
enables and uses two of the slices preallocated for the GPU, one for
the GPU data buffers and another for the GPU SMMU hardware pagetables.

Patches 1-5 add system cache support to the SMMU and GPU drivers.
Patches 6 and 7 are minor cleanups for the arm-smmu impl.

Changes in v7:
 * Squash Jordan's patch to support MMU500 targets
 * Rebase on top of for-joerg/arm-smmu/updates and Jordan's short series for adreno-smmu impl

Changes in v6:
 * Move table to arm-smmu-qcom (Robin)

Changes in v5:
 * Drop cleanup of blank lines since it was intentional (Robin)
 * Rebase again on top of msm-next-pgtables as it moves pretty fast

Changes in v4:
 * Drop IOMMU_SYS_CACHE prot flag
 * Rebase on top of https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables

Changes in v3:
 * Fix domain attribute setting so it happens before iommu_attach_device()
 * Fix a few code style and checkpatch warnings
 * Rebase on top of Jordan's latest split pagetables and per-instance
   pagetables support

Changes in v2:
 * Addressed review comments and rebased on top of Jordan's split
   pagetables series

Jordan Crouse (1):
  drm/msm/a6xx: Add support for using system cache on MMU500 based
    targets

Sai Prakash Ranjan (4):
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu: Add domain attribute for system cache
  iommu: arm-smmu-impl: Use table to list QCOM implementations
  iommu: arm-smmu-impl: Add a space before open parenthesis

Sharat Masetty (2):
  drm/msm: rearrange the gpu_rmw() function
  drm/msm/a6xx: Add support for using system cache(LLC)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c      | 109 +++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h      |   5 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    |  17 ++++
 drivers/gpu/drm/msm/msm_drv.c              |   8 ++
 drivers/gpu/drm/msm/msm_drv.h              |   1 +
 drivers/gpu/drm/msm/msm_gpu.h              |   5 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  11 +--
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c |  21 +++-
 drivers/iommu/arm/arm-smmu/arm-smmu.c      |  17 ++++
 drivers/iommu/arm/arm-smmu/arm-smmu.h      |   2 +-
 drivers/iommu/io-pgtable-arm.c             |   7 +-
 include/linux/io-pgtable.h                 |   4 +
 include/linux/iommu.h                      |   1 +
 13 files changed, 188 insertions(+), 20 deletions(-)


base-commit: f9081b8ff5934b8d69c748d0200e844cadd2c667
prerequisite-patch-id: db09851f375ca5efde35f2e5c21b3959eed7d8a8
prerequisite-patch-id: 55c6af17808c2047b67cdbd04af5541156ef496e
prerequisite-patch-id: e82c1e678da701e112ac255ea966c6797d975692
prerequisite-patch-id: f7978f5f2fb06528b7a1f75fa4255e386a30b91a
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-11-10 12:18   ` Will Deacon
  2020-10-30  9:23 ` [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sai Prakash Ranjan

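Add a quirk, IO_PGTABLE_QUIRK_SYS_CACHE, to override the attributes
set in TCR for the page table walker when using the system cache.
With the quirk set, table walks are configured as outer shareable,
inner non-cacheable and outer write-back write-allocate.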

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
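Reviewer note: the attribute selection added by this hunk can be
modelled outside the kernel. The sketch below is a standalone
illustration, not the driver code. The enum, struct and function names
are hypothetical, the coherent-walk condition is assumed from the
surrounding function (it is not visible in the hunk), and the default
outer attribute is assumed to be non-cacheable.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ARM_LPAE_TCR_* values in the patch. */
enum tcr_sh  { TCR_SH_OS, TCR_SH_IS };
enum tcr_rgn { TCR_RGN_NC, TCR_RGN_WBWA };

struct tcr_attrs {
	enum tcr_sh sh;    /* shareability of table walks */
	enum tcr_rgn irgn; /* inner cacheability */
	enum tcr_rgn orgn; /* outer cacheability */
};

#define QUIRK_SYS_CACHE (1UL << 6) /* mirrors BIT(6) in the patch */

/* Pick the walker attributes in the same order as the patched branch. */
static struct tcr_attrs pick_walk_attrs(bool coherent_walk, unsigned long quirks)
{
	struct tcr_attrs tcr;

	if (coherent_walk) {
		/* Assumed first branch: fully cacheable, inner shareable. */
		tcr = (struct tcr_attrs){ TCR_SH_IS, TCR_RGN_WBWA, TCR_RGN_WBWA };
	} else if (quirks & QUIRK_SYS_CACHE) {
		/* New branch: only the outer (system) cache is used. */
		tcr = (struct tcr_attrs){ TCR_SH_OS, TCR_RGN_NC, TCR_RGN_WBWA };
	} else {
		/* Assumed default: non-cacheable walks. */
		tcr = (struct tcr_attrs){ TCR_SH_OS, TCR_RGN_NC, TCR_RGN_NC };
	}

	return tcr;
}
```

Because the quirk is tested after the coherent-walk case, setting it on
a coherent walker has no effect in this model.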
 drivers/iommu/io-pgtable-arm.c | 7 ++++++-
 include/linux/io-pgtable.h     | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index a7a9bc08dcd1..a356caf1683a 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 
 	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
 			    IO_PGTABLE_QUIRK_NON_STRICT |
-			    IO_PGTABLE_QUIRK_ARM_TTBR1))
+			    IO_PGTABLE_QUIRK_ARM_TTBR1 |
+			    IO_PGTABLE_QUIRK_SYS_CACHE))
 		return NULL;
 
 	data = arm_lpae_alloc_pgtable(cfg);
@@ -773,6 +774,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 		tcr->sh = ARM_LPAE_TCR_SH_IS;
 		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
 		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
+	} else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
+		tcr->sh = ARM_LPAE_TCR_SH_OS;
+		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
+		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
 	} else {
 		tcr->sh = ARM_LPAE_TCR_SH_OS;
 		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 4cde111e425b..86631f711e05 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -86,6 +86,9 @@ struct io_pgtable_cfg {
 	 *
 	 * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
 	 *	for use in the upper half of a split address space.
+	 *
+	 * IO_PGTABLE_QUIRK_SYS_CACHE: Override the attributes set in TCR for
+	 *	the page table walker when using system cache.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
@@ -93,6 +96,7 @@ struct io_pgtable_cfg {
 	#define IO_PGTABLE_QUIRK_ARM_MTK_EXT	BIT(3)
 	#define IO_PGTABLE_QUIRK_NON_STRICT	BIT(4)
 	#define IO_PGTABLE_QUIRK_ARM_TTBR1	BIT(5)
+	#define IO_PGTABLE_QUIRK_SYS_CACHE	BIT(6)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-11-10 12:18   ` Will Deacon
  2020-10-30  9:23 ` [PATCHv7 3/7] drm/msm: rearrange the gpu_rmw() function Sai Prakash Ranjan
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sai Prakash Ranjan

Add an IOMMU domain attribute, DOMAIN_ATTR_SYS_CACHE, that client
drivers such as the GPU can set so that the hardware pagetables are
cached in the system cache (also known as the last level cache) with
the right attributes.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
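Reviewer note: the ordering constraint in the set_attr hunk (the
attribute can only change while the domain has no SMMU bound) can be
sketched in plain C. This is a simplified model with hypothetical
names (model_domain, model_set_sys_cache); locking and the real
iommu_domain plumbing are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of struct arm_smmu_domain. */
struct model_domain {
	const void *smmu; /* non-NULL once the domain has been initialised */
	bool sys_cache;
};

/* Mirrors the DOMAIN_ATTR_SYS_CACHE case: refuse changes after attach. */
static int model_set_sys_cache(struct model_domain *d, int enable)
{
	if (d->smmu)
		return -EPERM;

	d->sys_cache = enable != 0;
	return 0;
}
```

A caller therefore has to set the attribute before attaching a device,
which is the order the GPU driver change in this series follows.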
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +++++++++++++++++
 drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
 include/linux/iommu.h                 |  1 +
 3 files changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index b1cf8f0abc29..070d13f80c7e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (smmu_domain->non_strict)
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
+	if (smmu_domain->sys_cache)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
+
 	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
 	if (!pgtbl_ops) {
 		ret = -ENOMEM;
@@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
 			*(int *)data = smmu_domain->non_strict;
 			return 0;
+		case DOMAIN_ATTR_SYS_CACHE:
+			*((int *)data) = smmu_domain->sys_cache;
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			else
 				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 			break;
+		case DOMAIN_ATTR_SYS_CACHE:
+			if (smmu_domain->smmu) {
+				ret = -EPERM;
+				goto out_unlock;
+			}
+
+			if (*((int *)data))
+				smmu_domain->sys_cache = true;
+			else
+				smmu_domain->sys_cache = false;
+			break;
 		default:
 			ret = -ENODEV;
 		}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 885840f3bec8..dfc44d806671 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -373,6 +373,7 @@ struct arm_smmu_domain {
 	struct mutex			init_mutex; /* Protects smmu pointer */
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	struct iommu_domain		domain;
+	bool				sys_cache;
 };
 
 struct arm_smmu_master_cfg {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b95a6f8db6ff..4f4bb9c6f8f6 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -118,6 +118,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+	DOMAIN_ATTR_SYS_CACHE,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 3/7] drm/msm: rearrange the gpu_rmw() function
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 4/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sharat Masetty, Sai Prakash Ranjan

From: Sharat Masetty <smasetty@codeaurora.org>

The register read-modify-write construct is generic enough that it
can be used by other subsystems as needed. Create a more generic
msm_rmw() function and have gpu_rmw() use this new function.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
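Reviewer note: the arithmetic performed by the new helper, with the
MMIO accessors removed, is just a mask-and-or. The sketch below
operates on a plain value; rmw_value is a hypothetical name used only
for illustration.

```c
#include <stdint.h>

/* Clear the bits in mask, then set the bits in or_bits. */
static uint32_t rmw_value(uint32_t val, uint32_t mask, uint32_t or_bits)
{
	val &= ~mask;
	return val | or_bits;
}
```

As in the kernel helper, or_bits is not masked, so a caller that
passes bits outside mask will set them as well.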
 drivers/gpu/drm/msm/msm_drv.c | 8 ++++++++
 drivers/gpu/drm/msm/msm_drv.h | 1 +
 drivers/gpu/drm/msm/msm_gpu.h | 5 +----
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 49685571dc0e..a1e22b974b77 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -180,6 +180,14 @@ u32 msm_readl(const void __iomem *addr)
 	return val;
 }
 
+void msm_rmw(void __iomem *addr, u32 mask, u32 or)
+{
+	u32 val = msm_readl(addr);
+
+	val &= ~mask;
+	msm_writel(val | or, addr);
+}
+
 struct msm_vblank_work {
 	struct work_struct work;
 	int crtc_id;
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index b9dd8f8f4887..655b3b0424a1 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -478,6 +478,7 @@ void __iomem *msm_ioremap_quiet(struct platform_device *pdev, const char *name,
 		const char *dbgname);
 void msm_writel(u32 data, void __iomem *addr);
 u32 msm_readl(const void __iomem *addr);
+void msm_rmw(void __iomem *addr, u32 mask, u32 or);
 
 struct msm_gpu_submitqueue;
 int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 6c9e1fdc1a76..b2b419277953 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -246,10 +246,7 @@ static inline u32 gpu_read(struct msm_gpu *gpu, u32 reg)
 
 static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
 {
-	uint32_t val = gpu_read(gpu, reg);
-
-	val &= ~mask;
-	gpu_write(gpu, reg, val | or);
+	msm_rmw(gpu->mmio + (reg << 2), mask, or);
 }
 
 static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 4/7] drm/msm/a6xx: Add support for using system cache(LLC)
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (2 preceding siblings ...)
  2020-10-30  9:23 ` [PATCHv7 3/7] drm/msm: rearrange the gpu_rmw() function Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 5/7] drm/msm/a6xx: Add support for using system cache on MMU500 based targets Sai Prakash Ranjan
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sharat Masetty, Sai Prakash Ranjan

From: Sharat Masetty <smasetty@codeaurora.org>

The last level system cache can be partitioned into 32 different
slices, of which the GPU has two preallocated. One slice is used
for caching GPU buffers and the other is used for caching the GPU
SMMU pagetables. This patch talks to the core system cache driver
to acquire the slice handles, configures the SCIDs for those slices,
and deactivates and activates the slices upon GPU power collapse
and restore.

Some support from the IOMMU driver is also needed to set the right
TCR attributes for using the system cache. The GPU can also override
a few cacheability parameters, and it uses this to change
write-allocate to write-no-allocate, as the GPU hardware does not
benefit much from write allocation.

DOMAIN_ATTR_SYS_CACHE is a domain level attribute used by the
IOMMU driver to set the right attributes to cache the hardware
pagetables in the system cache.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
[saiprakash.ranjan: fix to set attr before device attach to iommu and rebase]
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
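Reviewer note: the layout that a6xx_llc_activate() builds for
SYSTEM_CACHE_CNTL_1 can be checked in isolation. The sketch below
assumes both slices activated successfully (the driver only fills in a
field when its slice is active); pack_cntl1 and the two macros are
hypothetical names.

```c
#include <stdint.h>

#define SCID_MASK      0x1fu /* slice IDs are 5 bits wide */
#define HTW_SCID_SHIFT 25    /* low bit of GENMASK(29, 25) */

/* Replicate the GPU SCID into five 5-bit fields, then add the HTW SCID. */
static uint32_t pack_cntl1(uint32_t gpu_scid, uint32_t gpuhtw_scid)
{
	uint32_t regval;

	gpu_scid &= SCID_MASK;
	gpuhtw_scid &= SCID_MASK;

	regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) |
		 (gpu_scid << 15) | (gpu_scid << 20);
	regval |= gpuhtw_scid << HTW_SCID_SHIFT;

	return regval;
}
```

With gpu_scid = 1 and gpuhtw_scid = 2 this yields 0x04108421: bits 0,
5, 10, 15 and 20 carry the GPU slice ID and bit 26 carries the
pagetable walker slice ID.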
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 83 +++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  4 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 +++++
 3 files changed, 104 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 948f3656c20c..95c98c642876 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -8,7 +8,9 @@
 #include "a6xx_gpu.h"
 #include "a6xx_gmu.xml.h"
 
+#include <linux/bitfield.h>
 #include <linux/devfreq.h>
+#include <linux/soc/qcom/llcc-qcom.h>
 
 #define GPU_PAS_ID 13
 
@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
 	return IRQ_HANDLED;
 }
 
+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or)
+{
+	return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
+}
+
+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)
+{
+	return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
+}
+
+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
+{
+	llcc_slice_deactivate(a6xx_gpu->llc_slice);
+	llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
+{
+	u32 cntl1_regval = 0;
+
+	if (IS_ERR(a6xx_gpu->llc_mmio))
+		return;
+
+	if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
+		u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
+
+		gpu_scid &= 0x1f;
+		cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) |
+			       (gpu_scid << 15) | (gpu_scid << 20);
+	}
+
+	if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
+		u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
+
+		gpuhtw_scid &= 0x1f;
+		cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
+	}
+
+	if (cntl1_regval) {
+		/*
+		 * Program the slice IDs for the various GPU blocks and GPU MMU
+		 * pagetables
+		 */
+		a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval);
+
+		/*
+		 * Program cacheability overrides to not allocate cache lines on
+		 * a write miss
+		 */
+		a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03);
+	}
+}
+
+static void a6xx_llc_slices_destroy(struct a6xx_gpu *a6xx_gpu)
+{
+	llcc_slice_putd(a6xx_gpu->llc_slice);
+	llcc_slice_putd(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_slices_init(struct platform_device *pdev,
+		struct a6xx_gpu *a6xx_gpu)
+{
+	a6xx_gpu->llc_mmio = msm_ioremap(pdev, "cx_mem", "gpu_cx");
+	if (IS_ERR(a6xx_gpu->llc_mmio))
+		return;
+
+	a6xx_gpu->llc_slice = llcc_slice_getd(LLCC_GPU);
+	a6xx_gpu->htw_llc_slice = llcc_slice_getd(LLCC_GPUHTW);
+
+	if (IS_ERR(a6xx_gpu->llc_slice) && IS_ERR(a6xx_gpu->htw_llc_slice))
+		a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
+}
+
 static int a6xx_pm_resume(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -1038,6 +1113,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
 
 	msm_gpu_resume_devfreq(gpu);
 
+	a6xx_llc_activate(a6xx_gpu);
+
 	return 0;
 }
 
@@ -1048,6 +1125,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
 
 	trace_msm_gpu_suspend(0);
 
+	a6xx_llc_deactivate(a6xx_gpu);
+
 	devfreq_suspend_device(gpu->devfreq.devfreq);
 
 	return a6xx_gmu_stop(a6xx_gpu);
@@ -1091,6 +1170,8 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 		drm_gem_object_put(a6xx_gpu->shadow_bo);
 	}
 
+	a6xx_llc_slices_destroy(a6xx_gpu);
+
 	a6xx_gmu_remove(a6xx_gpu);
 
 	adreno_gpu_cleanup(adreno_gpu);
@@ -1209,6 +1290,8 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 	if (info && info->revn == 650)
 		adreno_gpu->base.hw_apriv = true;
 
+	a6xx_llc_slices_init(pdev, a6xx_gpu);
+
 	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
 	if (ret) {
 		a6xx_destroy(&(a6xx_gpu->base.base));
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 3eeebf6a754b..9e6079af679c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -28,6 +28,10 @@ struct a6xx_gpu {
 	uint32_t *shadow;
 
 	bool has_whereami;
+
+	void __iomem *llc_mmio;
+	void *llc_slice;
+	void *htw_llc_slice;
 };
 
 #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 458b5b26d3c2..7684a8e588cb 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -16,6 +16,7 @@
 #include <linux/soc/qcom/mdt_loader.h>
 #include <soc/qcom/ocmem.h>
 #include "adreno_gpu.h"
+#include "a6xx_gpu.h"
 #include "msm_gem.h"
 #include "msm_mmu.h"
 
@@ -189,6 +190,8 @@ struct msm_gem_address_space *
 adreno_iommu_create_address_space(struct msm_gpu *gpu,
 		struct platform_device *pdev)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	struct iommu_domain *iommu;
 	struct msm_mmu *mmu;
 	struct msm_gem_address_space *aspace;
@@ -198,7 +201,21 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
 	if (!iommu)
 		return NULL;
 
+	/*
+	 * This allows GPU to set the bus attributes required to use system
+	 * cache on behalf of the iommu page table walker.
+	 */
+	if (!IS_ERR(a6xx_gpu->htw_llc_slice)) {
+		int gpu_htw_llc = 1;
+
+		iommu_domain_set_attr(iommu, DOMAIN_ATTR_SYS_CACHE, &gpu_htw_llc);
+	}
+
 	mmu = msm_iommu_new(&pdev->dev, iommu);
+	if (IS_ERR(mmu)) {
+		iommu_domain_free(iommu);
+		return ERR_CAST(mmu);
+	}
 
 	/*
 	 * Use the aperture start or SZ_16M, whichever is greater. This will
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 5/7] drm/msm/a6xx: Add support for using system cache on MMU500 based targets
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (3 preceding siblings ...)
  2020-10-30  9:23 ` [PATCHv7 4/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-10-30  9:23 ` [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations Sai Prakash Ranjan
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sai Prakash Ranjan

From: Jordan Crouse <jcrouse@codeaurora.org>

This is an extension to the series [1] to enable the System Cache (LLC) for
Adreno a6xx targets.

GPU targets with an MMU-500 attached have a slightly different process for
enabling system cache. Use the compatible string on the IOMMU phandle
to see if an MMU-500 is attached and modify the programming sequence
accordingly.

[1] https://patchwork.freedesktop.org/series/83037/

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
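Reviewer note: on the MMU-500 path the driver updates only bits 24:0
of GBIF_SCACHE_CNTL1 and leaves the upper bits as they were. A
minimal model of that update on a plain value is sketched below;
gbif_scache_cntl1_update and GBIF_SCID_FIELDS are hypothetical names,
and the sketch assumes cntl1_regval carries only the GPU SCID fields,
as it does when the pagetable walker field is skipped.

```c
#include <stdint.h>

#define GBIF_SCID_FIELDS 0x01ffffffu /* GENMASK(24, 0) */

/* Replace the five GPU SCID fields and preserve bits 31:25. */
static uint32_t gbif_scache_cntl1_update(uint32_t old, uint32_t cntl1_regval)
{
	return (old & ~GBIF_SCID_FIELDS) | cntl1_regval;
}
```

Bits 31:25 pass through untouched in this model, whereas the
non-MMU-500 path overwrites the whole of SYSTEM_CACHE_CNTL_1.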
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 46 +++++++++++++++++++++------
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
 2 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 95c98c642876..3f8b92da8cba 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1042,6 +1042,8 @@ static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
 
 static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
 {
+	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+	struct msm_gpu *gpu = &adreno_gpu->base;
 	u32 cntl1_regval = 0;
 
 	if (IS_ERR(a6xx_gpu->llc_mmio))
@@ -1055,11 +1057,17 @@ static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
 			       (gpu_scid << 15) | (gpu_scid << 20);
 	}
 
+	/*
+	 * For targets with a MMU500, activate the slice but don't program the
+	 * register.  The XBL will take care of that.
+	 */
 	if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
-		u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
+		if (!a6xx_gpu->have_mmu500) {
+			u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
 
-		gpuhtw_scid &= 0x1f;
-		cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
+			gpuhtw_scid &= 0x1f;
+			cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
+		}
 	}
 
 	if (cntl1_regval) {
@@ -1067,13 +1075,20 @@ static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
 		 * Program the slice IDs for the various GPU blocks and GPU MMU
 		 * pagetables
 		 */
-		a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval);
-
-		/*
-		 * Program cacheability overrides to not allocate cache lines on
-		 * a write miss
-		 */
-		a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03);
+		if (a6xx_gpu->have_mmu500)
+			gpu_rmw(gpu, REG_A6XX_GBIF_SCACHE_CNTL1, GENMASK(24, 0),
+				cntl1_regval);
+		else {
+			a6xx_llc_write(a6xx_gpu,
+				REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval);
+
+			/*
+			 * Program cacheability overrides to not allocate cache
+			 * lines on a write miss
+			 */
+			a6xx_llc_rmw(a6xx_gpu,
+				REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03);
+		}
 	}
 }
 
@@ -1086,10 +1101,21 @@ static void a6xx_llc_slices_destroy(struct a6xx_gpu *a6xx_gpu)
 static void a6xx_llc_slices_init(struct platform_device *pdev,
 		struct a6xx_gpu *a6xx_gpu)
 {
+	struct device_node *phandle;
+
 	a6xx_gpu->llc_mmio = msm_ioremap(pdev, "cx_mem", "gpu_cx");
 	if (IS_ERR(a6xx_gpu->llc_mmio))
 		return;
 
+	/*
+	 * There is a different programming path for targets with an mmu500
+	 * attached, so detect if that is the case
+	 */
+	phandle = of_parse_phandle(pdev->dev.of_node, "iommus", 0);
+	a6xx_gpu->have_mmu500 = (phandle &&
+		of_device_is_compatible(phandle, "arm,mmu-500"));
+	of_node_put(phandle);
+
 	a6xx_gpu->llc_slice = llcc_slice_getd(LLCC_GPU);
 	a6xx_gpu->htw_llc_slice = llcc_slice_getd(LLCC_GPUHTW);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 9e6079af679c..e793d329e77b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -32,6 +32,7 @@ struct a6xx_gpu {
 	void __iomem *llc_mmio;
 	void *llc_slice;
 	void *htw_llc_slice;
+	bool have_mmu500;
 };
 
 #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (4 preceding siblings ...)
  2020-10-30  9:23 ` [PATCHv7 5/7] drm/msm/a6xx: Add support for using system cache on MMU500 based targets Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-11-10 12:11   ` Will Deacon
  2020-10-30  9:23 ` [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis Sai Prakash Ranjan
  2020-11-09  5:15 ` [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
  7 siblings, 1 reply; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sai Prakash Ranjan

Use a table and of_match_node() to match QCOM implementations
instead of multiple of_device_is_compatible() calls, one for each
QCOM SMMU implementation.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
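Reviewer note: the lookup order in the reworked qcom_smmu_impl_init()
(table first, then the adreno compatible, else leave the device
untouched) can be modelled with a string table. This is a userspace
sketch with hypothetical names; it takes a single compatible string,
whereas a real device node may list several.

```c
#include <stddef.h>
#include <string.h>

enum qcom_impl { IMPL_NONE, IMPL_QCOM_SMMU, IMPL_QCOM_ADRENO_SMMU };

/* Same entries as qcom_smmu_impl_of_match[], ending in a sentinel. */
static const char *const qcom_smmu_compatibles[] = {
	"qcom,sc7180-smmu-500",
	"qcom,sdm845-smmu-500",
	"qcom,sm8150-smmu-500",
	"qcom,sm8250-smmu-500",
	NULL,
};

/* Return which implementation would be chosen for one compatible string. */
static enum qcom_impl pick_qcom_impl(const char *compatible)
{
	size_t i;

	for (i = 0; qcom_smmu_compatibles[i]; i++) {
		if (!strcmp(compatible, qcom_smmu_compatibles[i]))
			return IMPL_QCOM_SMMU;
	}

	if (!strcmp(compatible, "qcom,adreno-smmu"))
		return IMPL_QCOM_ADRENO_SMMU;

	return IMPL_NONE;
}
```

Supporting a new SoC then means adding one table entry rather than
another condition in arm_smmu_impl_init().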
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  9 +--------
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 21 ++++++++++++++++-----
 drivers/iommu/arm/arm-smmu/arm-smmu.h      |  1 -
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index d199b4bff15d..ffaf3f91ba52 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -217,14 +217,7 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 	if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
 		return nvidia_smmu_impl_init(smmu);
 
-	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sc7180-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sm8150-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
-		return qcom_smmu_impl_init(smmu);
-
-	if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
-		return qcom_adreno_smmu_impl_init(smmu);
+	smmu = qcom_smmu_impl_init(smmu);
 
 	if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
 		smmu->impl = &mrvl_mmu500_impl;
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 0f763d555c92..221e2a6a3231 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -314,12 +314,23 @@ static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
 	return &qsmmu->smmu;
 }
 
+static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
+	{ .compatible = "qcom,sc7180-smmu-500" },
+	{ .compatible = "qcom,sdm845-smmu-500" },
+	{ .compatible = "qcom,sm8150-smmu-500" },
+	{ .compatible = "qcom,sm8250-smmu-500" },
+	{ }
+};
+
 struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
 {
-	return qcom_smmu_create(smmu, &qcom_smmu_impl);
-}
+	const struct device_node *np = smmu->dev->of_node;
 
-struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device *smmu)
-{
-	return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
+	if (of_match_node(qcom_smmu_impl_of_match, np))
+		return qcom_smmu_create(smmu, &qcom_smmu_impl);
+
+	if (of_device_is_compatible(np, "qcom,adreno-smmu"))
+		return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
+
+	return smmu;
 }
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index dfc44d806671..43b2411e65cc 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -525,7 +525,6 @@ static inline void arm_smmu_writeq(struct arm_smmu_device *smmu, int page,
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
 struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
 struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
-struct arm_smmu_device *qcom_adreno_smmu_impl_init(struct arm_smmu_device *smmu);
 
 void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx);
 int arm_mmu500_reset(struct arm_smmu_device *smmu);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



* [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (5 preceding siblings ...)
  2020-10-30  9:23 ` [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations Sai Prakash Ranjan
@ 2020-10-30  9:23 ` Sai Prakash Ranjan
  2020-11-10 12:12   ` Will Deacon
  2020-11-09  5:15 ` [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
  7 siblings, 1 reply; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-10-30  9:23 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel,
	Sai Prakash Ranjan

Fix the checkpatch warning for space required before the open
parenthesis.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index ffaf3f91ba52..f16da4a21270 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -12,7 +12,7 @@
 
 static int arm_smmu_gr0_ns(int offset)
 {
-	switch(offset) {
+	switch (offset) {
 	case ARM_SMMU_GR0_sCR0:
 	case ARM_SMMU_GR0_sACR:
 	case ARM_SMMU_GR0_sGFSR:
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


* Re: [PATCHv7 0/7] System Cache support for GPU and required SMMU support
  2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
                   ` (6 preceding siblings ...)
  2020-10-30  9:23 ` [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis Sai Prakash Ranjan
@ 2020-11-09  5:15 ` Sai Prakash Ranjan
  7 siblings, 0 replies; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-11-09  5:15 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark
  Cc: iommu, linux-arm-kernel, linux-kernel, linux-arm-msm,
	Akhil P Oommen, freedreno, Kristian H . Kristensen, dri-devel

On 2020-10-30 14:53, Sai Prakash Ranjan wrote:
> Some hardware variants contain a system cache or the last level
> cache(llc). This cache is typically a large block which is shared
> by multiple clients on the SOC. GPU uses the system cache to cache
> both the GPU data buffers (like textures) as well as the SMMU pagetables.
> This helps with improved render performance as well as lower power
> consumption by reducing the bus traffic to the system memory.
> 
> The system cache architecture allows the cache to be split into slices
> which can then be used by multiple SOC clients. This patch series is an
> effort to enable and use two of those slices preallocated for the GPU,
> one for the GPU data buffers and another for the GPU SMMU hardware
> pagetables.
> 
> Patches 1 to 5 add system cache support in SMMU and GPU driver.
> Patch 6 and 7 are minor cleanups for arm-smmu impl.
> 
> Changes in v7:
>  * Squash Jordan's patch to support MMU500 targets
>  * Rebase on top of for-joerg/arm-smmu/updates and Jordan's short
> series for adreno-smmu impl
> 
> Changes in v6:
>  * Move table to arm-smmu-qcom (Robin)
> 
> Changes in v5:
>  * Drop cleanup of blank lines since it was intentional (Robin)
>  * Rebase again on top of msm-next-pgtables as it moves pretty fast
> 
> Changes in v4:
>  * Drop IOMMU_SYS_CACHE prot flag
>  * Rebase on top of
> https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables
> 
> Changes in v3:
>  * Fix domain attribute setting to before iommu_attach_device()
>  * Fix few code style and checkpatch warnings
>  * Rebase on top of Jordan's latest split pagetables and per-instance
>    pagetables support
> 
> Changes in v2:
>  * Addressed review comments and rebased on top of Jordan's split
>    pagetables series
> 
> Jordan Crouse (1):
>   drm/msm/a6xx: Add support for using system cache on MMU500 based
>     targets
> 
> Sai Prakash Ranjan (4):
>   iommu/io-pgtable-arm: Add support to use system cache
>   iommu/arm-smmu: Add domain attribute for system cache
>   iommu: arm-smmu-impl: Use table to list QCOM implementations
>   iommu: arm-smmu-impl: Add a space before open parenthesis
> 
> Sharat Masetty (2):
>   drm/msm: rearrange the gpu_rmw() function
>   drm/msm/a6xx: Add support for using system cache(LLC)
> 

Hi,

Gentle Ping!

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

* Re: [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations
  2020-10-30  9:23 ` [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations Sai Prakash Ranjan
@ 2020-11-10 12:11   ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2020-11-10 12:11 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On Fri, Oct 30, 2020 at 02:53:13PM +0530, Sai Prakash Ranjan wrote:
> Use a table and of_match_node() to match the QCOM implementation
> instead of multiple of_device_is_compatible() calls, one for each
> QCOM SMMU implementation.
> 
> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  9 +--------
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 21 ++++++++++++++++-----
>  drivers/iommu/arm/arm-smmu/arm-smmu.h      |  1 -
>  3 files changed, 17 insertions(+), 14 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will
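
The table-based matching in this patch can be pictured with a small
userspace sketch. It is only an illustration under stated assumptions:
struct compat_entry and match_compat() are hypothetical stand-ins for
struct of_device_id and of_match_node(), the strings in the table are
examples rather than the list from the patch, and the real helper
matches against every compatible string on a device node, not just one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct of_device_id. */
struct compat_entry {
	const char *compatible;
};

/* Example table, terminated by an entry with a NULL string. */
static const struct compat_entry qcom_impl_table[] = {
	{ .compatible = "qcom,sc7180-smmu-500" },
	{ .compatible = "qcom,sdm845-smmu-500" },
	{ .compatible = "qcom,sm8150-smmu-500" },
	{ .compatible = NULL },
};

/* Return the first entry whose string equals node_compat, or NULL. */
static const struct compat_entry *
match_compat(const struct compat_entry *table, const char *node_compat)
{
	for (; table->compatible; table++) {
		if (strcmp(table->compatible, node_compat) == 0)
			return table;
	}
	return NULL;
}
```

Supporting another implementation then means adding one table entry
instead of another explicit comparison in the init path.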

* Re: [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis
  2020-10-30  9:23 ` [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis Sai Prakash Ranjan
@ 2020-11-10 12:12   ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2020-11-10 12:12 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On Fri, Oct 30, 2020 at 02:53:14PM +0530, Sai Prakash Ranjan wrote:
> Fix the checkpatch warning for space required before the open
> parenthesis.
> 
> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index ffaf3f91ba52..f16da4a21270 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -12,7 +12,7 @@
>  
>  static int arm_smmu_gr0_ns(int offset)
>  {
> -	switch(offset) {
> +	switch (offset) {
>  	case ARM_SMMU_GR0_sCR0:
>  	case ARM_SMMU_GR0_sACR:
>  	case ARM_SMMU_GR0_sGFSR:

Whatever...

Acked-by: Will Deacon <will@kernel.org>

Will

* Re: [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache
  2020-10-30  9:23 ` [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
@ 2020-11-10 12:18   ` Will Deacon
  2020-11-11  6:40     ` Sai Prakash Ranjan
  0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2020-11-10 12:18 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
> Add iommu domain attribute for using system cache aka last level
> cache by client drivers like GPU to set right attributes for caching
> the hardware pagetables into the system cache.
> 
> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +++++++++++++++++
>  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
>  include/linux/iommu.h                 |  1 +
>  3 files changed, 19 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index b1cf8f0abc29..070d13f80c7e 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  	if (smmu_domain->non_strict)
>  		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>  
> +	if (smmu_domain->sys_cache)
> +		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
> +
>  	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
>  	if (!pgtbl_ops) {
>  		ret = -ENOMEM;
> @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>  		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
>  			*(int *)data = smmu_domain->non_strict;
>  			return 0;
> +		case DOMAIN_ATTR_SYS_CACHE:
> +			*((int *)data) = smmu_domain->sys_cache;
> +			return 0;
>  		default:
>  			return -ENODEV;
>  		}
> @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>  			else
>  				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>  			break;
> +		case DOMAIN_ATTR_SYS_CACHE:
> +			if (smmu_domain->smmu) {
> +				ret = -EPERM;
> +				goto out_unlock;
> +			}
> +
> +			if (*((int *)data))
> +				smmu_domain->sys_cache = true;
> +			else
> +				smmu_domain->sys_cache = false;
> +			break;
>  		default:
>  			ret = -ENODEV;
>  		}
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 885840f3bec8..dfc44d806671 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -373,6 +373,7 @@ struct arm_smmu_domain {
>  	struct mutex			init_mutex; /* Protects smmu pointer */
>  	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
>  	struct iommu_domain		domain;
> +	bool				sys_cache;
>  };
>  
>  struct arm_smmu_master_cfg {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b95a6f8db6ff..4f4bb9c6f8f6 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -118,6 +118,7 @@ enum iommu_attr {
>  	DOMAIN_ATTR_FSL_PAMUV1,
>  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
>  	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> +	DOMAIN_ATTR_SYS_CACHE,

I think you're trying to make this look generic, but it's really not.
If we need to funnel io-pgtable quirks through domain attributes, then I
think we should be open about that and add something like
DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
configuration data for the domain (this could just be quirks initially,
but maybe it's worth extending to take ias, oas and page size)
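
A minimal userspace sketch of what such an attribute might carry,
assuming it starts with quirks only. The names struct domain_pgtbl_cfg,
struct model_domain and model_set_pgtbl_cfg() are hypothetical and are
not kernel API; the -EPERM check mirrors the posted patch, which
refuses the attribute once the domain has an SMMU attached.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical payload for a DOMAIN_ATTR_IO_PGTABLE_CFG-style attribute. */
struct domain_pgtbl_cfg {
	unsigned long quirks;	/* io-pgtable quirk bits requested by the client */
};

/* Minimal model of a domain; "attached" mirrors the smmu_domain->smmu check. */
struct model_domain {
	bool attached;
	struct domain_pgtbl_cfg pgtbl_cfg;
};

/*
 * Store the caller's page-table configuration. The configuration can only
 * be set before the domain is attached to a device.
 */
static int model_set_pgtbl_cfg(struct model_domain *domain,
			       const struct domain_pgtbl_cfg *cfg)
{
	if (!domain || !cfg)
		return -EINVAL;
	if (domain->attached)
		return -EPERM;

	domain->pgtbl_cfg = *cfg;
	return 0;
}
```

Carrying a struct rather than a single flag leaves room for fields such
as ias, oas and page size later without adding further attributes.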

Will

* Re: [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache
  2020-10-30  9:23 ` [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
@ 2020-11-10 12:18   ` Will Deacon
  2020-11-11  6:02     ` Sai Prakash Ranjan
  0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2020-11-10 12:18 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On Fri, Oct 30, 2020 at 02:53:08PM +0530, Sai Prakash Ranjan wrote:
> Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
> attributes set in TCR for the page table walker when
> using system cache.
> 
> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> ---
>  drivers/iommu/io-pgtable-arm.c | 7 ++++++-
>  include/linux/io-pgtable.h     | 4 ++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index a7a9bc08dcd1..a356caf1683a 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>  
>  	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
>  			    IO_PGTABLE_QUIRK_NON_STRICT |
> -			    IO_PGTABLE_QUIRK_ARM_TTBR1))
> +			    IO_PGTABLE_QUIRK_ARM_TTBR1 |
> +			    IO_PGTABLE_QUIRK_SYS_CACHE))
>  		return NULL;
>  
>  	data = arm_lpae_alloc_pgtable(cfg);
> @@ -773,6 +774,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>  		tcr->sh = ARM_LPAE_TCR_SH_IS;
>  		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
>  		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> +	} else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
> +		tcr->sh = ARM_LPAE_TCR_SH_OS;
> +		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> +		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;

Given that this only applies in the case where the page-table walker is
non-coherent, I think we'd be better off renaming the quirk to something
like IO_PGTABLE_QUIRK_ARM_OUTER_WBWA and then rejecting it in the
non-coherent case.

>  	} else {
>  		tcr->sh = ARM_LPAE_TCR_SH_OS;
>  		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 4cde111e425b..86631f711e05 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -86,6 +86,9 @@ struct io_pgtable_cfg {
>  	 *
>  	 * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
>  	 *	for use in the upper half of a split address space.
> +	 *
> +	 * IO_PGTABLE_QUIRK_SYS_CACHE: Override the attributes set in TCR for
> +	 *	the page table walker when using system cache.

and then update this accordingly.

Will

* Re: [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache
  2020-11-10 12:18   ` Will Deacon
@ 2020-11-11  6:02     ` Sai Prakash Ranjan
  2020-11-12  9:43       ` Will Deacon
  0 siblings, 1 reply; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-11-11  6:02 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On 2020-11-10 17:48, Will Deacon wrote:
> On Fri, Oct 30, 2020 at 02:53:08PM +0530, Sai Prakash Ranjan wrote:
>> Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
>> attributes set in TCR for the page table walker when
>> using system cache.
>> 
>> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
>> ---
>>  drivers/iommu/io-pgtable-arm.c | 7 ++++++-
>>  include/linux/io-pgtable.h     | 4 ++++
>>  2 files changed, 10 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index a7a9bc08dcd1..a356caf1683a 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>> 
>>  	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
>>  			    IO_PGTABLE_QUIRK_NON_STRICT |
>> -			    IO_PGTABLE_QUIRK_ARM_TTBR1))
>> +			    IO_PGTABLE_QUIRK_ARM_TTBR1 |
>> +			    IO_PGTABLE_QUIRK_SYS_CACHE))
>>  		return NULL;
>> 
>>  	data = arm_lpae_alloc_pgtable(cfg);
>> @@ -773,6 +774,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>>  		tcr->sh = ARM_LPAE_TCR_SH_IS;
>>  		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
>>  		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
>> +	} else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
>> +		tcr->sh = ARM_LPAE_TCR_SH_OS;
>> +		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
>> +		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> 
> Given that this only applies in the case where the page-table walker is
> non-coherent, I think we'd be better off renaming the quirk to something
> like IO_PGTABLE_QUIRK_ARM_OUTER_WBWA and then rejecting it in the
> non-coherent case.
> 

Do you mean like below?

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index a7a9bc08dcd1..94de1f71db42 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -776,7 +776,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
         } else {
                 tcr->sh = ARM_LPAE_TCR_SH_OS;
                 tcr->irgn = ARM_LPAE_TCR_RGN_NC;
-               tcr->orgn = ARM_LPAE_TCR_RGN_NC;
+               if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
+                       tcr->orgn = ARM_LPAE_TCR_RGN_NC;
+               else
+                       tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
         }

         tg1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1;


Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

* Re: [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache
  2020-11-10 12:18   ` Will Deacon
@ 2020-11-11  6:40     ` Sai Prakash Ranjan
  2020-11-12  9:35       ` Will Deacon
  0 siblings, 1 reply; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-11-11  6:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On 2020-11-10 17:48, Will Deacon wrote:
> On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
>> Add iommu domain attribute for using system cache aka last level
>> cache by client drivers like GPU to set right attributes for caching
>> the hardware pagetables into the system cache.
>> 
>> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
>> ---
>>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +++++++++++++++++
>>  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
>>  include/linux/iommu.h                 |  1 +
>>  3 files changed, 19 insertions(+)
>> 
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> index b1cf8f0abc29..070d13f80c7e 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>>  	if (smmu_domain->non_strict)
>>  		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>> 
>> +	if (smmu_domain->sys_cache)
>> +		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
>> +
>>  	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
>>  	if (!pgtbl_ops) {
>>  		ret = -ENOMEM;
>> @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>>  		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
>>  			*(int *)data = smmu_domain->non_strict;
>>  			return 0;
>> +		case DOMAIN_ATTR_SYS_CACHE:
>> +			*((int *)data) = smmu_domain->sys_cache;
>> +			return 0;
>>  		default:
>>  			return -ENODEV;
>>  		}
>> @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>>  			else
>>  				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>>  			break;
>> +		case DOMAIN_ATTR_SYS_CACHE:
>> +			if (smmu_domain->smmu) {
>> +				ret = -EPERM;
>> +				goto out_unlock;
>> +			}
>> +
>> +			if (*((int *)data))
>> +				smmu_domain->sys_cache = true;
>> +			else
>> +				smmu_domain->sys_cache = false;
>> +			break;
>>  		default:
>>  			ret = -ENODEV;
>>  		}
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> index 885840f3bec8..dfc44d806671 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> @@ -373,6 +373,7 @@ struct arm_smmu_domain {
>>  	struct mutex			init_mutex; /* Protects smmu pointer */
>>  	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
>>  	struct iommu_domain		domain;
>> +	bool				sys_cache;
>>  };
>> 
>>  struct arm_smmu_master_cfg {
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index b95a6f8db6ff..4f4bb9c6f8f6 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -118,6 +118,7 @@ enum iommu_attr {
>>  	DOMAIN_ATTR_FSL_PAMUV1,
>>  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
>>  	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
>> +	DOMAIN_ATTR_SYS_CACHE,
> 
> I think you're trying to make this look generic, but it's really not.
> If we need to funnel io-pgtable quirks through domain attributes, then I
> think we should be open about that and add something like
> DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
> configuration data for the domain (this could just be quirks initially,
> but maybe it's worth extending to take ias, oas and page size)
> 

Actually, the initial versions used DOMAIN_ATTR_QCOM_SYS_CACHE
to make it QCOM-specific rather than generic. I don't see anyone else
using this attribute; would that work?

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

* Re: [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache
  2020-11-11  6:40     ` Sai Prakash Ranjan
@ 2020-11-12  9:35       ` Will Deacon
  2020-11-14 11:47         ` Sai Prakash Ranjan
  0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2020-11-12  9:35 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On Wed, Nov 11, 2020 at 12:10:50PM +0530, Sai Prakash Ranjan wrote:
> On 2020-11-10 17:48, Will Deacon wrote:
> > On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
> > > Add iommu domain attribute for using system cache aka last level
> > > cache by client drivers like GPU to set right attributes for caching
> > > the hardware pagetables into the system cache.
> > > 
> > > Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> > > ---
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +++++++++++++++++
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
> > >  include/linux/iommu.h                 |  1 +
> > >  3 files changed, 19 insertions(+)
> > > 
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > index b1cf8f0abc29..070d13f80c7e 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > >  	if (smmu_domain->non_strict)
> > >  		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > > 
> > > +	if (smmu_domain->sys_cache)
> > > +		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
> > > +
> > >  	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> > >  	if (!pgtbl_ops) {
> > >  		ret = -ENOMEM;
> > > @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > >  		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> > >  			*(int *)data = smmu_domain->non_strict;
> > >  			return 0;
> > > +		case DOMAIN_ATTR_SYS_CACHE:
> > > +			*((int *)data) = smmu_domain->sys_cache;
> > > +			return 0;
> > >  		default:
> > >  			return -ENODEV;
> > >  		}
> > > @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
> > >  			else
> > >  				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
> > >  			break;
> > > +		case DOMAIN_ATTR_SYS_CACHE:
> > > +			if (smmu_domain->smmu) {
> > > +				ret = -EPERM;
> > > +				goto out_unlock;
> > > +			}
> > > +
> > > +			if (*((int *)data))
> > > +				smmu_domain->sys_cache = true;
> > > +			else
> > > +				smmu_domain->sys_cache = false;
> > > +			break;
> > >  		default:
> > >  			ret = -ENODEV;
> > >  		}
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > index 885840f3bec8..dfc44d806671 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > @@ -373,6 +373,7 @@ struct arm_smmu_domain {
> > >  	struct mutex			init_mutex; /* Protects smmu pointer */
> > >  	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
> > >  	struct iommu_domain		domain;
> > > +	bool				sys_cache;
> > >  };
> > > 
> > >  struct arm_smmu_master_cfg {
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index b95a6f8db6ff..4f4bb9c6f8f6 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -118,6 +118,7 @@ enum iommu_attr {
> > >  	DOMAIN_ATTR_FSL_PAMUV1,
> > >  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> > >  	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > +	DOMAIN_ATTR_SYS_CACHE,
> > 
> > I think you're trying to make this look generic, but it's really not.
> > If we need to funnel io-pgtable quirks through domain attributes, then I
> > think we should be open about that and add something like
> > DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
> > configuration data for the domain (this could just be quirks initially,
> > but maybe it's worth extending to take ias, oas and page size)
> > 
> 
> Actually the initial versions used DOMAIN_ATTR_QCOM_SYS_CACHE
> to make it QCOM specific and not generic, I don't see anyone else
> using this attribute, would that work?

No -- I'd prefer to have _one_ domain attribute for funneling all the
IO_PGTABLE_CFG data. Otherwise, we'll just end up with things like
DOMAIN_ATTR_QCOM_SYS_CACHE_EXT or DOMAIN_ATTR_QCOM_QUIRKS later on.

Will

* Re: [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache
  2020-11-11  6:02     ` Sai Prakash Ranjan
@ 2020-11-12  9:43       ` Will Deacon
  0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2020-11-12  9:43 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On Wed, Nov 11, 2020 at 11:32:42AM +0530, Sai Prakash Ranjan wrote:
> On 2020-11-10 17:48, Will Deacon wrote:
> > On Fri, Oct 30, 2020 at 02:53:08PM +0530, Sai Prakash Ranjan wrote:
> > > Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
> > > attributes set in TCR for the page table walker when
> > > using system cache.
> > > 
> > > Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> > > ---
> > >  drivers/iommu/io-pgtable-arm.c | 7 ++++++-
> > >  include/linux/io-pgtable.h     | 4 ++++
> > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > > index a7a9bc08dcd1..a356caf1683a 100644
> > > --- a/drivers/iommu/io-pgtable-arm.c
> > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > @@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
> > > 
> > >  	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
> > >  			    IO_PGTABLE_QUIRK_NON_STRICT |
> > > -			    IO_PGTABLE_QUIRK_ARM_TTBR1))
> > > +			    IO_PGTABLE_QUIRK_ARM_TTBR1 |
> > > +			    IO_PGTABLE_QUIRK_SYS_CACHE))
> > >  		return NULL;
> > > 
> > >  	data = arm_lpae_alloc_pgtable(cfg);
> > > @@ -773,6 +774,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
> > >  		tcr->sh = ARM_LPAE_TCR_SH_IS;
> > >  		tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
> > >  		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> > > +	} else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
> > > +		tcr->sh = ARM_LPAE_TCR_SH_OS;
> > > +		tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> > > +		tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> > 
> > > Given that this only applies in the case where the page-table walker is
> > non-coherent, I think we'd be better off renaming the quirk to something
> > like IO_PGTABLE_QUIRK_ARM_OUTER_WBWA and then rejecting it in the
> > non-coherent case.
> > 
> 
> Do you mean like below?
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index a7a9bc08dcd1..94de1f71db42 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -776,7 +776,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>         } else {
>                 tcr->sh = ARM_LPAE_TCR_SH_OS;
>                 tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> -               tcr->orgn = ARM_LPAE_TCR_RGN_NC;
> +               if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
> +                       tcr->orgn = ARM_LPAE_TCR_RGN_NC;
> +               else
> +                       tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;

Yes, but rejecting the quirk if the walker is coherent (I accidentally said
"non-coherent" earlier on).
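
The behaviour asked for here can be sketched in userspace C as follows.
This is a rough model, not the io-pgtable code: the enum values,
QUIRK_OUTER_WBWA and select_tcr_attrs() are hypothetical names, and the
sketch assumes the first branch of the original function is the one
taken for a coherent walker.

```c
#include <stdbool.h>

/* Illustrative values; the kernel's ARM_LPAE_TCR_* encodings are not reproduced. */
enum tcr_rgn { RGN_NC, RGN_WBWA };
enum tcr_sh { SH_OS, SH_IS };

#define QUIRK_OUTER_WBWA (1UL << 0)	/* hypothetical bit for the renamed quirk */

struct tcr_attrs {
	enum tcr_sh sh;
	enum tcr_rgn irgn;
	enum tcr_rgn orgn;
};

/*
 * Fill in the walker attributes. Returns false when the outer-WBWA quirk
 * is requested together with a coherent walker, which is the combination
 * the review asks to reject.
 */
static bool select_tcr_attrs(bool coherent_walk, unsigned long quirks,
			     struct tcr_attrs *tcr)
{
	if (coherent_walk) {
		if (quirks & QUIRK_OUTER_WBWA)
			return false;
		tcr->sh = SH_IS;
		tcr->irgn = RGN_WBWA;
		tcr->orgn = RGN_WBWA;
		return true;
	}

	/* Non-coherent walker: only the outer attribute depends on the quirk. */
	tcr->sh = SH_OS;
	tcr->irgn = RGN_NC;
	tcr->orgn = (quirks & QUIRK_OUTER_WBWA) ? RGN_WBWA : RGN_NC;
	return true;
}
```

In the kernel the rejection would presumably take the form of
returning NULL from the allocation function, as the existing quirk
mask check does.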

Will

* Re: [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache
  2020-11-12  9:35       ` Will Deacon
@ 2020-11-14 11:47         ` Sai Prakash Ranjan
  0 siblings, 0 replies; 18+ messages in thread
From: Sai Prakash Ranjan @ 2020-11-14 11:47 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, Joerg Roedel, Jordan Crouse, Rob Clark, iommu,
	linux-arm-kernel, linux-kernel, linux-arm-msm, Akhil P Oommen,
	freedreno, Kristian H . Kristensen, dri-devel

On 2020-11-12 15:05, Will Deacon wrote:
> On Wed, Nov 11, 2020 at 12:10:50PM +0530, Sai Prakash Ranjan wrote:
>> On 2020-11-10 17:48, Will Deacon wrote:
>> > On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
>> > > Add iommu domain attribute for using system cache aka last level
>> > > cache by client drivers like GPU to set right attributes for caching
>> > > the hardware pagetables into the system cache.
>> > >
>> > > Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
>> > > ---
>> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +++++++++++++++++
>> > >  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
>> > >  include/linux/iommu.h                 |  1 +
>> > >  3 files changed, 19 insertions(+)
>> > >
>> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> > > index b1cf8f0abc29..070d13f80c7e 100644
>> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> > > @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>> > >  	if (smmu_domain->non_strict)
>> > >  		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>> > >
>> > > +	if (smmu_domain->sys_cache)
>> > > +		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
>> > > +
>> > >  	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
>> > >  	if (!pgtbl_ops) {
>> > >  		ret = -ENOMEM;
>> > > @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>> > >  		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
>> > >  			*(int *)data = smmu_domain->non_strict;
>> > >  			return 0;
>> > > +		case DOMAIN_ATTR_SYS_CACHE:
>> > > +			*((int *)data) = smmu_domain->sys_cache;
>> > > +			return 0;
>> > >  		default:
>> > >  			return -ENODEV;
>> > >  		}
>> > > @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>> > >  			else
>> > >  				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>> > >  			break;
>> > > +		case DOMAIN_ATTR_SYS_CACHE:
>> > > +			if (smmu_domain->smmu) {
>> > > +				ret = -EPERM;
>> > > +				goto out_unlock;
>> > > +			}
>> > > +
>> > > +			if (*((int *)data))
>> > > +				smmu_domain->sys_cache = true;
>> > > +			else
>> > > +				smmu_domain->sys_cache = false;
>> > > +			break;
>> > >  		default:
>> > >  			ret = -ENODEV;
>> > >  		}
>> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> > > index 885840f3bec8..dfc44d806671 100644
>> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
>> > > @@ -373,6 +373,7 @@ struct arm_smmu_domain {
>> > >  	struct mutex			init_mutex; /* Protects smmu pointer */
>> > >  	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
>> > >  	struct iommu_domain		domain;
>> > > +	bool				sys_cache;
>> > >  };
>> > >
>> > >  struct arm_smmu_master_cfg {
>> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> > > index b95a6f8db6ff..4f4bb9c6f8f6 100644
>> > > --- a/include/linux/iommu.h
>> > > +++ b/include/linux/iommu.h
>> > > @@ -118,6 +118,7 @@ enum iommu_attr {
>> > >  	DOMAIN_ATTR_FSL_PAMUV1,
>> > >  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
>> > >  	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
>> > > +	DOMAIN_ATTR_SYS_CACHE,
>> >
>> > I think you're trying to make this look generic, but it's really not.
>> > If we need to funnel io-pgtable quirks through domain attributes, then I
>> > think we should be open about that and add something like
>> > DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
>> > configuration data for the domain (this could just be quirks initially,
>> > but maybe it's worth extending to take ias, oas and page size).
>> >
>> 
>> Actually, the initial versions used DOMAIN_ATTR_QCOM_SYS_CACHE
>> to make it QCOM-specific rather than generic. I don't see anyone else
>> using this attribute, so would that work?
> 
> No -- I'd prefer to have _one_ domain attribute for funneling all the
> IO_PGTABLE_CFG data. Otherwise, we'll just end up with things like
> DOMAIN_ATTR_QCOM_SYS_CACHE_EXT or DOMAIN_ATTR_QCOM_QUIRKS later on.
> 

Right, that makes sense. I will add this in the next version.
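
For reference, a minimal userspace sketch of the single-attribute approach
discussed above. All names here (mock_domain, mock_pgtable_domain_cfg,
MOCK_ATTR_IO_PGTABLE_CFG, MOCK_QUIRK_SYS_CACHE) are hypothetical stand-ins
and not kernel API. The struct carries only quirks for now, and the -EPERM
check mirrors the rule in the patch that the attribute has to be set before
the domain is attached.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical quirk bit, standing in for IO_PGTABLE_QUIRK_SYS_CACHE. */
#define MOCK_QUIRK_SYS_CACHE	(1UL << 0)

/*
 * Hypothetical payload for one generic page-table config attribute.
 * Only quirks for now; ias, oas and page size could be added later.
 */
struct mock_pgtable_domain_cfg {
	unsigned long quirks;
};

/* Hypothetical attribute IDs; MOCK_ATTR_MAX is deliberately unhandled. */
enum mock_attr {
	MOCK_ATTR_IO_PGTABLE_CFG,
	MOCK_ATTR_MAX,
};

struct mock_domain {
	bool attached;	/* stands in for smmu_domain->smmu != NULL */
	struct mock_pgtable_domain_cfg cfg;
};

/* Mark the domain as attached; the config is frozen from here on. */
void mock_domain_attach(struct mock_domain *d)
{
	d->attached = true;
}

/* Copy the caller's config into the domain, but only before attach. */
int mock_domain_set_attr(struct mock_domain *d, enum mock_attr attr,
			 const void *data)
{
	switch (attr) {
	case MOCK_ATTR_IO_PGTABLE_CFG:
		if (d->attached)
			return -EPERM;
		d->cfg = *(const struct mock_pgtable_domain_cfg *)data;
		return 0;
	default:
		return -ENODEV;
	}
}

/* Copy the domain's current config out to the caller. */
int mock_domain_get_attr(const struct mock_domain *d, enum mock_attr attr,
			 void *data)
{
	switch (attr) {
	case MOCK_ATTR_IO_PGTABLE_CFG:
		*(struct mock_pgtable_domain_cfg *)data = d->cfg;
		return 0;
	default:
		return -ENODEV;
	}
}
```

With this shape, a GPU driver would fill in one struct and set one
attribute, and new quirks would only need a new bit rather than a new
DOMAIN_ATTR_* entry.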

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

Thread overview: 18+ messages
2020-10-30  9:23 [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
2020-10-30  9:23 ` [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache Sai Prakash Ranjan
2020-11-10 12:18   ` Will Deacon
2020-11-11  6:02     ` Sai Prakash Ranjan
2020-11-12  9:43       ` Will Deacon
2020-10-30  9:23 ` [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for " Sai Prakash Ranjan
2020-11-10 12:18   ` Will Deacon
2020-11-11  6:40     ` Sai Prakash Ranjan
2020-11-12  9:35       ` Will Deacon
2020-11-14 11:47         ` Sai Prakash Ranjan
2020-10-30  9:23 ` [PATCHv7 3/7] drm/msm: rearrange the gpu_rmw() function Sai Prakash Ranjan
2020-10-30  9:23 ` [PATCHv7 4/7] drm/msm/a6xx: Add support for using system cache(LLC) Sai Prakash Ranjan
2020-10-30  9:23 ` [PATCHv7 5/7] drm/msm/a6xx: Add support for using system cache on MMU500 based targets Sai Prakash Ranjan
2020-10-30  9:23 ` [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations Sai Prakash Ranjan
2020-11-10 12:11   ` Will Deacon
2020-10-30  9:23 ` [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis Sai Prakash Ranjan
2020-11-10 12:12   ` Will Deacon
2020-11-09  5:15 ` [PATCHv7 0/7] System Cache support for GPU and required SMMU support Sai Prakash Ranjan
