intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
@ 2023-04-01  6:38 fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 1/7] " fei.yang
                   ` (9 more replies)
  0 siblings, 10 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Fei Yang <fei.yang@intel.com>

The series includes patches needed to enable MTL.
Also add new extension for GEM_CREATE uAPI to let
user space set cache policy for buffer objects.

Fei Yang (7):
  drm/i915/mtl: Define MOCS and PAT tables for MTL
  drm/i915/mtl: workaround coherency issue for Media
  drm/i915/mtl: end support for set caching ioctl
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: make sure correct pte encode is used
  drm/i915: Allow user to set cache at BO creation

 drivers/gpu/drm/i915/display/intel_dpt.c      | 14 ++--
 drivers/gpu/drm/i915/gem/i915_gem_create.c    | 33 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 30 +++----
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 48 ++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 19 +++--
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  5 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  9 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 81 +++++++++++++-----
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |  6 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 84 +++++++++++++------
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 23 ++++-
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 38 ++++++---
 drivers/gpu/drm/i915/gt/intel_migrate.c       | 47 ++++++-----
 drivers/gpu/drm/i915/gt/intel_migrate.h       | 13 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c          | 76 ++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c    | 47 ++++++-----
 drivers/gpu/drm/i915/gt/selftest_mocs.c       |  2 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c      |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c        |  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c     | 13 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |  7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 18 ++--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c           | 55 +++++++++---
 drivers/gpu/drm/i915/i915_gem.c               | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  8 +-
 drivers/gpu/drm/i915/i915_pci.c               | 76 +++++++++++++++--
 drivers/gpu/drm/i915/i915_vma.c               | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h               |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |  2 -
 drivers/gpu/drm/i915/intel_device_info.h      |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  6 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  8 +-
 include/uapi/drm/i915_drm.h                   | 36 ++++++++
 tools/include/uapi/drm/i915_drm.h             | 36 ++++++++
 51 files changed, 763 insertions(+), 233 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-03 12:50   ` Jani Nikula
  2023-04-06  8:28   ` Das, Nirmoy
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 2/7] drm/i915/mtl: workaround coherency issue for Media fei.yang
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi, Matt Roper, dri-devel

From: Fei Yang <fei.yang@intel.com>

On MTL, GT can no longer allocate on LLC - only the CPU can.
This, along with addition of support for ADM/L4 cache calls a
MOCS/PAT table update.
Also add PTE encode functions for MTL as it has different PAT
index definition than previous platforms.

BSpec: 44509, 45101, 44235

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
---
 drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
 drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
 drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
 drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
 drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
 drivers/gpu/drm/i915/i915_pci.c          |  1 +
 9 files changed, 189 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index b8027392144d..c5eacfdba1a5 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
 	vm->vma_ops.bind_vma    = dpt_bind_vma;
 	vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-	vm->pte_encode = gen8_ggtt_pte_encode;
+	vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
 	dpt->obj = dpt_obj;
 	dpt->obj->is_dpt = true;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4daaa6f55668..4197b43150cc 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
+static u64 mtl_pte_encode(dma_addr_t addr,
+			  enum i915_cache_level level,
+			  u32 flags)
+{
+	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+	if (unlikely(flags & PTE_READ_ONLY))
+		pte &= ~GEN8_PAGE_RW;
+
+	if (flags & PTE_LM)
+		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
+
+	switch (level) {
+	case I915_CACHE_NONE:
+		pte |= GEN12_PPGTT_PTE_PAT1;
+		break;
+	case I915_CACHE_LLC:
+	case I915_CACHE_L3_LLC:
+		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
+		break;
+	case I915_CACHE_WT:
+		pte |= GEN12_PPGTT_PTE_PAT0;
+		break;
+	}
+
+	return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
 	struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      u32 flags)
 {
 	struct i915_page_directory *pd;
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
 				   enum i915_cache_level cache_level,
 				   u32 flags)
 {
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma_res->start;
 
@@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
 	GEM_BUG_ON(pt->is_compact);
 
 	vaddr = px_vaddr(pt);
-	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
 	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
@@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
 	}
 
 	vaddr = px_vaddr(pt);
-	vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
@@ -820,7 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		pte_flags |= PTE_LM;
 
 	vm->scratch[0]->encode =
-		gen8_pte_encode(px_dma(vm->scratch[0]),
+		vm->pte_encode(px_dma(vm->scratch[0]),
 				I915_CACHE_NONE, pte_flags);
 
 	for (i = 1; i <= vm->top; i++) {
@@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 	 */
 	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-	ppgtt->vm.pte_encode = gen8_pte_encode;
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+		ppgtt->vm.pte_encode = mtl_pte_encode;
+	else
+		ppgtt->vm.pte_encode = gen8_pte_encode;
 
 	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
 	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
index f541d19264b4..6b8ce7f4d25a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
@@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 u64 gen8_ggtt_pte_encode(dma_addr_t addr,
 			 enum i915_cache_level level,
 			 u32 flags);
+u64 mtl_ggtt_pte_encode(dma_addr_t addr,
+			unsigned int pat_index,
+			u32 flags);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 3c7f1ed92f5b..ba3109338aee 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
 	}
 }
 
+u64 mtl_ggtt_pte_encode(dma_addr_t addr,
+			enum i915_cache_level level,
+			u32 flags)
+{
+	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
+
+	GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
+
+	if (flags & PTE_LM)
+		pte |= GEN12_GGTT_PTE_LM;
+
+	switch (level) {
+	case I915_CACHE_NONE:
+		pte |= MTL_GGTT_PTE_PAT1;
+		break;
+	case I915_CACHE_LLC:
+	case I915_CACHE_L3_LLC:
+		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
+		break;
+	case I915_CACHE_WT:
+		pte |= MTL_GGTT_PTE_PAT0;
+		break;
+	}
+
+	return pte;
+}
+
 u64 gen8_ggtt_pte_encode(dma_addr_t addr,
 			 enum i915_cache_level level,
 			 u32 flags)
@@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 	gen8_pte_t __iomem *pte =
 		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
+	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
 
 	ggtt->invalidate(ggtt);
 }
@@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
-	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
+	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
 	gen8_pte_t __iomem *gte;
 	gen8_pte_t __iomem *end;
 	struct sgt_iter iter;
@@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
 	ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
 
-	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+		ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
+	else
+		ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
 	return ggtt_probe_common(ggtt, size);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 4f436ba7a3c8..1e1b34e22cf5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
 	}
 }
 
+static void mtl_setup_private_ppat(struct intel_uncore *uncore)
+{
+	intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
+			   MTL_PPAT_L4_0_WB);
+	intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
+			   MTL_PPAT_L4_1_WT);
+	intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
+			   MTL_PPAT_L4_3_UC);
+	intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
+			   MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
+	intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
+			   MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
+
+	/*
+	 * Remaining PAT entries are left at the hardware-default
+	 * fully-cached setting
+	 */
+}
+
 static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 {
 	/* TGL doesn't support LLC or AGE settings */
@@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
 
 	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
-	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+	if (IS_METEORLAKE(i915))
+		mtl_setup_private_ppat(uncore);
+	else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
 		xehp_setup_private_ppat(gt);
 	else if (GRAPHICS_VER(i915) >= 12)
 		tgl_setup_private_ppat(uncore);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 69ce55f517f5..b632167eaf2e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
 #define BYT_PTE_WRITEABLE		REG_BIT(1)
 
+#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
 #define GEN12_PPGTT_PTE_LM	BIT_ULL(11)
+#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
+#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
+#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
+#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
 
-#define GEN12_GGTT_PTE_LM	BIT_ULL(1)
+#define GEN12_GGTT_PTE_LM		BIT_ULL(1)
+#define MTL_GGTT_PTE_PAT0		BIT_ULL(52)
+#define MTL_GGTT_PTE_PAT1		BIT_ULL(53)
+#define GEN12_GGTT_PTE_ADDR_MASK	GENMASK_ULL(45, 12)
+#define MTL_GGTT_PTE_PAT_MASK		GENMASK_ULL(53, 52)
 
 #define GEN12_PDE_64K BIT(6)
 #define GEN12_PTE_PS64 BIT(8)
@@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
 #define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
+#define MTL_PPAT_L4_CACHE_POLICY_MASK	REG_GENMASK(3, 2)
+#define MTL_PAT_INDEX_COH_MODE_MASK	REG_GENMASK(1, 0)
+#define MTL_PPAT_L4_3_UC	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
+#define MTL_PPAT_L4_1_WT	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
+#define MTL_PPAT_L4_0_WB	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
+#define MTL_3_COH_2W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
+#define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
+#define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
+
 enum i915_cache_level;
 
 struct drm_i915_gem_object;
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 69b489e8dfed..89570f137b2c 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
 #define LE_COS(value)		((value) << 15)
 #define LE_SSE(value)		((value) << 17)
 
+/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
+#define _L4_CACHEABILITY(value)	((value) << 2)
+#define IG_PAT(value)		((value) << 8)
+
 /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
 #define L3_ESC(value)		((value) << 0)
 #define L3_SCC(value)		((value) << 1)
@@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
 /* Helper defines */
 #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
 #define PVC_NUM_MOCS_ENTRIES	3
+#define MTL_NUM_MOCS_ENTRIES	16
 
 /* (e)LLC caching options */
 /*
@@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
 #define L3_2_RESERVED		_L3_CACHEABILITY(2)
 #define L3_3_WB			_L3_CACHEABILITY(3)
 
+/* L4 caching options */
+#define L4_0_WB			_L4_CACHEABILITY(0)
+#define L4_1_WT			_L4_CACHEABILITY(1)
+#define L4_2_RESERVED		_L4_CACHEABILITY(2)
+#define L4_3_UC			_L4_CACHEABILITY(3)
+
 #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
 	[__idx] = { \
 		.control_value = __control_value, \
@@ -416,6 +427,57 @@ static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
 	MOCS_ENTRY(2, 0, L3_3_WB),
 };
 
+static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
+	/* Error - Reserved for Non-Use */
+	MOCS_ENTRY(0,
+		   IG_PAT(0),
+		   L3_LKUP(1) | L3_3_WB),
+	/* Cached - L3 + L4 */
+	MOCS_ENTRY(1,
+		   IG_PAT(1),
+		   L3_LKUP(1) | L3_3_WB),
+	/* L4 - GO:L3 */
+	MOCS_ENTRY(2,
+		   IG_PAT(1),
+		   L3_LKUP(1) | L3_1_UC),
+	/* Uncached - GO:L3 */
+	MOCS_ENTRY(3,
+		   IG_PAT(1) | L4_3_UC,
+		   L3_LKUP(1) | L3_1_UC),
+	/* L4 - GO:Mem */
+	MOCS_ENTRY(4,
+		   IG_PAT(1),
+		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
+	/* Uncached - GO:Mem */
+	MOCS_ENTRY(5,
+		   IG_PAT(1) | L4_3_UC,
+		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
+	/* L4 - L3:NoLKUP; GO:L3 */
+	MOCS_ENTRY(6,
+		   IG_PAT(1),
+		   L3_1_UC),
+	/* Uncached - L3:NoLKUP; GO:L3 */
+	MOCS_ENTRY(7,
+		   IG_PAT(1) | L4_3_UC,
+		   L3_1_UC),
+	/* L4 - L3:NoLKUP; GO:Mem */
+	MOCS_ENTRY(8,
+		   IG_PAT(1),
+		   L3_GLBGO(1) | L3_1_UC),
+	/* Uncached - L3:NoLKUP; GO:Mem */
+	MOCS_ENTRY(9,
+		   IG_PAT(1) | L4_3_UC,
+		   L3_GLBGO(1) | L3_1_UC),
+	/* Display - L3; L4:WT */
+	MOCS_ENTRY(14,
+		   IG_PAT(1) | L4_1_WT,
+		   L3_LKUP(1) | L3_3_WB),
+	/* CCS - Non-Displayable */
+	MOCS_ENTRY(15,
+		   IG_PAT(1),
+		   L3_GLBGO(1) | L3_1_UC),
+};
+
 enum {
 	HAS_GLOBAL_MOCS = BIT(0),
 	HAS_ENGINE_MOCS = BIT(1),
@@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
 	memset(table, 0, sizeof(struct drm_i915_mocs_table));
 
 	table->unused_entries_index = I915_MOCS_PTE;
-	if (IS_PONTEVECCHIO(i915)) {
+	if (IS_METEORLAKE(i915)) {
+		table->size = ARRAY_SIZE(mtl_mocs_table);
+		table->table = mtl_mocs_table;
+		table->n_entries = MTL_NUM_MOCS_ENTRIES;
+		table->uc_index = 9;
+		table->unused_entries_index = 1;
+	} else if (IS_PONTEVECCHIO(i915)) {
 		table->size = ARRAY_SIZE(pvc_mocs_table);
 		table->table = pvc_mocs_table;
 		table->n_entries = PVC_NUM_MOCS_ENTRIES;
@@ -646,9 +714,9 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
 		init_l3cc_table(engine->gt, &table);
 }
 
-static u32 global_mocs_offset(void)
+static u32 global_mocs_offset(struct intel_gt *gt)
 {
-	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
+	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + gt->uncore->gsi_offset;
 }
 
 void intel_set_mocs_index(struct intel_gt *gt)
@@ -671,7 +739,7 @@ void intel_mocs_init(struct intel_gt *gt)
 	 */
 	flags = get_mocs_settings(gt->i915, &table);
 	if (flags & HAS_GLOBAL_MOCS)
-		__init_mocs_table(gt->uncore, &table, global_mocs_offset());
+		__init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
 
 	/*
 	 * Initialize the L3CC table as part of mocs initalization to make
diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
index ca009a6a13bd..730796346514 100644
--- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
@@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
 		return 0;
 
 	if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
-		addr = global_mocs_offset();
+		addr = global_mocs_offset(rq->engine->gt);
 	else
 		addr = mocs_offset(rq->engine);
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 621730b6551c..480b128499ae 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1149,6 +1149,7 @@ static const struct intel_device_info mtl_info = {
 	.has_flat_ccs = 0,
 	.has_gmd_id = 1,
 	.has_guc_deprivilege = 1,
+	.has_llc = 0,
 	.has_mslice_steering = 0,
 	.has_snoop = 1,
 	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 2/7] drm/i915/mtl: workaround coherency issue for Media
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 1/7] " fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 3/7] drm/i915/mtl: end support for set caching ioctl fei.yang
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Fei Yang <fei.yang@intel.com>

This patch implements Wa_22016122933.

In MTL, memory writes initiated by Media tile update the whole
cache line even for partial writes. This creates a coherency
problem for cacheable memory if both CPU and GPU are writing data
to different locations within a single cache line. CTB communication
is impacted by this issue because the head and tail pointers are
adjacent words within a cache line (see struct guc_ct_buffer_desc),
where one is written by GuC and the other by the host.
This patch circumvents the issue by making CPU/GPU shared memory
uncacheable (WC on CPU side, and PAT index 2 for GPU). Also for
CTB which is being updated by both CPU and GuC, mfence instruction
is added to make sure the CPU writes are visible to GPU right away
(flush the write combining buffer).

While fixing the CTB issue, we noticed some random GSC firmware
loading failure because the share buffers are cacheable (WB) on CPU
side but uncached on GPU side. To fix these issues we need to map
such shared buffers as WC on CPU side. Since such allocations are
not all done through GuC allocator, to avoid too many code changes,
the i915_coherent_map_type() is now hard coded to return WC for MTL.

BSpec: 45101

Signed-off-by: Fei Yang <fei.yang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 ++++-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c    |  7 +++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 18 ++++++++++++------
 4 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index ecd86130b74f..89fc8ea6bcfc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct drm_i915_private *i915,
 					  struct drm_i915_gem_object *obj,
 					  bool always_coherent)
 {
-	if (i915_gem_object_is_lmem(obj))
+	/*
+	 * Wa_22016122933: always return I915_MAP_WC for MTL
+	 */
+	if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
 		return I915_MAP_WC;
 	if (HAS_LLC(i915) || always_coherent)
 		return I915_MAP_WB;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 1d9fdfb11268..236673c02f9a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
 	if (obj->base.size < gsc->fw.size)
 		return -ENOSPC;
 
+	/*
+	 * Wa_22016122933: For MTL the shared memory needs to be mapped
+	 * as WC on CPU side and UC (PAT index 2) on GPU side
+	 */
+	if (IS_METEORLAKE(i915))
+		i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
 	dst = i915_gem_object_pin_map_unlocked(obj,
 					       i915_coherent_map_type(i915, obj, true));
 	if (IS_ERR(dst))
@@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
 	memset(dst, 0, obj->base.size);
 	memcpy(dst, src, gsc->fw.size);
 
+	/*
+	 * Wa_22016122933: Making sure the data in dst is
+	 * visible to GSC right away
+	 */
+	intel_guc_write_barrier(&gt->uc.guc);
+
 	i915_gem_object_unpin_map(gsc->fw.obj);
 	i915_gem_object_unpin_map(obj);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index d76508fa3af7..f9bddaa876d9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -743,6 +743,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
+	/*
+	 * Wa_22016122933: For MTL the shared memory needs to be mapped
+	 * as WC on CPU side and UC (PAT index 2) on GPU side
+	 */
+	if (IS_METEORLAKE(gt->i915))
+		i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
 	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
 	if (IS_ERR(vma))
 		goto err;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed64..98e682b7df07 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -415,12 +415,6 @@ static int ct_write(struct intel_guc_ct *ct,
 	}
 	GEM_BUG_ON(tail > size);
 
-	/*
-	 * make sure H2G buffer update and LRC tail update (if this triggering a
-	 * submission) are visible before updating the descriptor tail
-	 */
-	intel_guc_write_barrier(ct_to_guc(ct));
-
 	/* update local copies */
 	ctb->tail = tail;
 	GEM_BUG_ON(atomic_read(&ctb->space) < len + GUC_CTB_HDR_LEN);
@@ -429,6 +423,12 @@ static int ct_write(struct intel_guc_ct *ct,
 	/* now update descriptor */
 	WRITE_ONCE(desc->tail, tail);
 
+	/*
+	 * make sure H2G buffer update and LRC tail update (if this triggering a
+	 * submission) are visible before updating the descriptor tail
+	 */
+	intel_guc_write_barrier(ct_to_guc(ct));
+
 	return 0;
 
 corrupted:
@@ -902,6 +902,12 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	/* now update descriptor */
 	WRITE_ONCE(desc->head, head);
 
+	/*
+	 * Wa_22016122933: Making sure the head update is
+	 * visible to GuC right away
+	 */
+	intel_guc_write_barrier(ct_to_guc(ct));
+
 	return available - len;
 
 corrupted:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 3/7] drm/i915/mtl: end support for set caching ioctl
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 1/7] " fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 2/7] drm/i915/mtl: workaround coherency issue for Media fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 4/7] drm/i915: preparation for using PAT index fei.yang
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Fei Yang <fei.yang@intel.com>

The design is to keep Buffer Object's caching policy immutable through
out its life cycle. This patch ends the support for set caching ioctl
from MTL onward. While doing that we also set BO's to be 1-way coherent
at creation time because GPU is no longer automatically snooping CPU
cache.

Signed-off-by: Fei Yang <fei.yang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 9 ++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 497de40b8e68..33b73bea1e08 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -335,6 +335,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 	if (IS_DGFX(i915))
 		return -ENODEV;
 
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+		return -EOPNOTSUPP;
+
 	switch (args->caching) {
 	case I915_CACHING_NONE:
 		level = I915_CACHE_NONE;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..e602c323896b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -601,7 +601,14 @@ static int shmem_object_init(struct intel_memory_region *mem,
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-	if (HAS_LLC(i915))
+	/*
+	 * MTL doesn't snooping CPU cache by default for GPU access (namely
+	 * 1-way coherency). However some UMD's are currently depending on
+	 * that. Make 1-way coherent the default setting for MTL. A follow
+	 * up patch will extend the GEM_CREATE uAPI to allow UMD's specify
+	 * caching mode at BO creation time
+	 */
+	if (HAS_LLC(i915) || (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)))
 		/* On some devices, we can have the GPU use the LLC (the CPU
 		 * cache) for about a 10% performance improvement
 		 * compared to uncached.  Graphics requests other than
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 4/7] drm/i915: preparation for using PAT index
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (2 preceding siblings ...)
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 3/7] drm/i915/mtl: end support for set caching ioctl fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level fei.yang
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matt Roper, Chris Wilson, dri-devel

From: Fei Yang <fei.yang@intel.com>

This patch is a preparation for replacing enum i915_cache_level with PAT
index. Caching policy for buffer objects is set through the PAT index in
PTE, the old i915_cache_level is not sufficient to represent all caching
modes supported by the hardware.

Preparing the transition by adding some platform dependent data structures
and helper functions to translate the cache_level to pat_index.

cachelevel_to_pat: a platform dependent array mapping cache_level to
                   pat_index.

max_pat_index: the maximum PAT index supported by the hardware. Needed for
               validating the PAT index passed in from user space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
                      convenience of coding.

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  9 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  6 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  6 ++
 drivers/gpu/drm/i915/i915_pci.c               | 75 +++++++++++++++++--
 drivers/gpu/drm/i915/intel_device_info.h      |  5 ++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  6 ++
 9 files changed, 104 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index e6d4efde4fc5..1295bb812866 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -45,6 +45,15 @@ static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+				    enum i915_cache_level level)
+{
+	if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+		return 0;
+
+	return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
 	struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 885ccde9dc3c..4c92e17b4337 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -20,6 +20,8 @@
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
 	struct drm_i915_gem_object *obj;
@@ -30,6 +32,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 	return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+				    enum i915_cache_level level);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 5dcbbef31d44..890f3ad497c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -194,6 +194,7 @@ enum i915_cache_level {
 	 * engine.
 	 */
 	I915_CACHE_WT,
+	I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index b1672e054b21..214763942aa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,8 +460,6 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
 	fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will also
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4197b43150cc..3ae41a13d28d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -78,6 +78,12 @@ static u64 mtl_pte_encode(dma_addr_t addr,
 	case I915_CACHE_WT:
 		pte |= GEN12_PPGTT_PTE_PAT0;
 		break;
+	default:
+		/* This should never happen. Added to deal with the compile
+		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
+		 * be removed by the pat_index patch.
+		 */
+		break;
 	}
 
 	return pte;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index ba3109338aee..91056b9a60a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -242,6 +242,12 @@ u64 mtl_ggtt_pte_encode(dma_addr_t addr,
 	case I915_CACHE_WT:
 		pte |= MTL_GGTT_PTE_PAT0;
 		break;
+	default:
+		/* This should never happen. Added to deal with the compile
+		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
+		 * be removed by the pat_index patch.
+		 */
+		break;
 	}
 
 	return pte;
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 480b128499ae..7c50bcddaa5c 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -29,6 +29,7 @@
 #include "display/intel_display.h"
 #include "gt/intel_gt_regs.h"
 #include "gt/intel_sa_media.h"
+#include "gem/i915_gem_object_types.h"
 
 #include "i915_driver.h"
 #include "i915_drv.h"
@@ -163,6 +164,38 @@
 		.gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
 	}
 
+#define LEGACY_CACHELEVEL \
+	.cachelevel_to_pat = { \
+		[I915_CACHE_NONE]   = 0, \
+		[I915_CACHE_LLC]    = 1, \
+		[I915_CACHE_L3_LLC] = 2, \
+		[I915_CACHE_WT]     = 3, \
+	}
+
+#define TGL_CACHELEVEL \
+	.cachelevel_to_pat = { \
+		[I915_CACHE_NONE]   = 3, \
+		[I915_CACHE_LLC]    = 0, \
+		[I915_CACHE_L3_LLC] = 0, \
+		[I915_CACHE_WT]     = 2, \
+	}
+
+#define PVC_CACHELEVEL \
+	.cachelevel_to_pat = { \
+		[I915_CACHE_NONE]   = 0, \
+		[I915_CACHE_LLC]    = 3, \
+		[I915_CACHE_L3_LLC] = 3, \
+		[I915_CACHE_WT]     = 2, \
+	}
+
+#define MTL_CACHELEVEL \
+	.cachelevel_to_pat = { \
+		[I915_CACHE_NONE]   = 2, \
+		[I915_CACHE_LLC]    = 3, \
+		[I915_CACHE_L3_LLC] = 3, \
+		[I915_CACHE_WT]     = 1, \
+	}
+
 /* Keep in gen based order, and chronological order within a gen */
 
 #define GEN_DEFAULT_PAGE_SIZES \
@@ -188,11 +221,13 @@
 	.has_snoop = true, \
 	.has_coherent_ggtt = false, \
 	.dma_mask_size = 32, \
+	.max_pat_index = 3, \
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	I9XX_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 #define I845_FEATURES \
 	GEN(2), \
@@ -209,11 +244,13 @@
 	.has_snoop = true, \
 	.has_coherent_ggtt = false, \
 	.dma_mask_size = 32, \
+	.max_pat_index = 3, \
 	I845_PIPE_OFFSETS, \
 	I845_CURSOR_OFFSETS, \
 	I845_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 static const struct intel_device_info i830_info = {
 	I830_FEATURES,
@@ -248,11 +285,13 @@ static const struct intel_device_info i865g_info = {
 	.has_snoop = true, \
 	.has_coherent_ggtt = true, \
 	.dma_mask_size = 32, \
+	.max_pat_index = 3, \
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	I9XX_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 static const struct intel_device_info i915g_info = {
 	GEN3_FEATURES,
@@ -340,11 +379,13 @@ static const struct intel_device_info pnv_m_info = {
 	.has_snoop = true, \
 	.has_coherent_ggtt = true, \
 	.dma_mask_size = 36, \
+	.max_pat_index = 3, \
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	I9XX_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 static const struct intel_device_info i965g_info = {
 	GEN4_FEATURES,
@@ -394,11 +435,13 @@ static const struct intel_device_info gm45_info = {
 	/* ilk does support rc6, but we do not implement [power] contexts */ \
 	.has_rc6 = 0, \
 	.dma_mask_size = 36, \
+	.max_pat_index = 3, \
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	ILK_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 static const struct intel_device_info ilk_d_info = {
 	GEN5_FEATURES,
@@ -428,13 +471,15 @@ static const struct intel_device_info ilk_m_info = {
 	.has_rc6p = 0, \
 	.has_rps = true, \
 	.dma_mask_size = 40, \
+	.max_pat_index = 3, \
 	.__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, \
 	.__runtime.ppgtt_size = 31, \
 	I9XX_PIPE_OFFSETS, \
 	I9XX_CURSOR_OFFSETS, \
 	ILK_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 #define SNB_D_PLATFORM \
 	GEN6_FEATURES, \
@@ -481,13 +526,15 @@ static const struct intel_device_info snb_m_gt2_info = {
 	.has_reset_engine = true, \
 	.has_rps = true, \
 	.dma_mask_size = 40, \
+	.max_pat_index = 3, \
 	.__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, \
 	.__runtime.ppgtt_size = 31, \
 	IVB_PIPE_OFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	IVB_COLORS, \
 	GEN_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	LEGACY_CACHELEVEL
 
 #define IVB_D_PLATFORM \
 	GEN7_FEATURES, \
@@ -541,6 +588,7 @@ static const struct intel_device_info vlv_info = {
 	.display.has_gmch = 1,
 	.display.has_hotplug = 1,
 	.dma_mask_size = 40,
+	.max_pat_index = 3,
 	.__runtime.ppgtt_type = INTEL_PPGTT_ALIASING,
 	.__runtime.ppgtt_size = 31,
 	.has_snoop = true,
@@ -552,6 +600,7 @@ static const struct intel_device_info vlv_info = {
 	I9XX_COLORS,
 	GEN_DEFAULT_PAGE_SIZES,
 	GEN_DEFAULT_REGIONS,
+	LEGACY_CACHELEVEL,
 };
 
 #define G75_FEATURES  \
@@ -639,6 +688,7 @@ static const struct intel_device_info chv_info = {
 	.has_logical_ring_contexts = 1,
 	.display.has_gmch = 1,
 	.dma_mask_size = 39,
+	.max_pat_index = 3,
 	.__runtime.ppgtt_type = INTEL_PPGTT_FULL,
 	.__runtime.ppgtt_size = 32,
 	.has_reset_engine = 1,
@@ -650,6 +700,7 @@ static const struct intel_device_info chv_info = {
 	CHV_COLORS,
 	GEN_DEFAULT_PAGE_SIZES,
 	GEN_DEFAULT_REGIONS,
+	LEGACY_CACHELEVEL,
 };
 
 #define GEN9_DEFAULT_PAGE_SIZES \
@@ -889,9 +940,11 @@ static const struct intel_device_info jsl_info = {
 		[TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
 	}, \
 	TGL_CURSOR_OFFSETS, \
+	TGL_CACHELEVEL, \
 	.has_global_mocs = 1, \
 	.has_pxp = 1, \
-	.display.has_dsb = 1
+	.display.has_dsb = 1, \
+	.max_pat_index = 3
 
 static const struct intel_device_info tgl_info = {
 	GEN12_FEATURES,
@@ -1015,6 +1068,7 @@ static const struct intel_device_info adl_p_info = {
 	.__runtime.graphics.ip.ver = 12, \
 	.__runtime.graphics.ip.rel = 50, \
 	XE_HP_PAGE_SIZES, \
+	TGL_CACHELEVEL, \
 	.dma_mask_size = 46, \
 	.has_3d_pipeline = 1, \
 	.has_64bit_reloc = 1, \
@@ -1033,6 +1087,7 @@ static const struct intel_device_info adl_p_info = {
 	.has_reset_engine = 1, \
 	.has_rps = 1, \
 	.has_runtime_pm = 1, \
+	.max_pat_index = 3, \
 	.__runtime.ppgtt_size = 48, \
 	.__runtime.ppgtt_type = INTEL_PPGTT_FULL
 
@@ -1109,11 +1164,13 @@ static const struct intel_device_info pvc_info = {
 	PLATFORM(INTEL_PONTEVECCHIO),
 	NO_DISPLAY,
 	.has_flat_ccs = 0,
+	.max_pat_index = 7,
 	.__runtime.platform_engine_mask =
 		BIT(BCS0) |
 		BIT(VCS0) |
 		BIT(CCS0) | BIT(CCS1) | BIT(CCS2) | BIT(CCS3),
 	.require_force_probe = 1,
+	PVC_CACHELEVEL,
 };
 
 #define XE_LPDP_FEATURES	\
@@ -1152,9 +1209,11 @@ static const struct intel_device_info mtl_info = {
 	.has_llc = 0,
 	.has_mslice_steering = 0,
 	.has_snoop = 1,
+	.max_pat_index = 4,
 	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
 	.__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
 	.require_force_probe = 1,
+	MTL_CACHELEVEL,
 };
 
 #undef PLATFORM
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 0aad8e48d27d..9643febbb4ea 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -35,6 +35,8 @@
 #include "gt/intel_context_types.h"
 #include "gt/intel_sseu.h"
 
+#include "gem/i915_gem_object_types.h"
+
 struct drm_printer;
 struct drm_i915_private;
 struct intel_gt_definition;
@@ -309,6 +311,9 @@ struct intel_device_info {
 	 * Initial runtime info. Do not access outside of i915_driver_create().
 	 */
 	const struct intel_runtime_info __runtime;
+
+	u32 cachelevel_to_pat[I915_MAX_CACHE_LEVEL];
+	u32 max_pat_index;
 };
 
 struct intel_driver_caps {
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f6a7c0bd2955..bc28f7afa54a 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -124,6 +124,7 @@ struct drm_i915_private *mock_gem_device(void)
 #endif
 	struct drm_i915_private *i915;
 	struct pci_dev *pdev;
+	unsigned int i;
 	int ret;
 
 	pdev = kzalloc(sizeof(*pdev), GFP_KERNEL);
@@ -180,6 +181,11 @@ struct drm_i915_private *mock_gem_device(void)
 		I915_GTT_PAGE_SIZE_2M;
 
 	RUNTIME_INFO(i915)->memory_regions = REGION_SMEM;
+
+	/* simply use legacy cache level for mock device */
+	for (i = 0; i < I915_MAX_CACHE_LEVEL; i++)
+		mkwrite_device_info(i915)->cachelevel_to_pat[i] = i;
+
 	intel_memory_regions_hw_probe(i915);
 
 	spin_lock_init(&i915->gpu_error.lock);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (3 preceding siblings ...)
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 4/7] drm/i915: preparation for using PAT index fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-03 14:50   ` Ville Syrjälä
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 6/7] drm/i915: make sure correct pte encode is used fei.yang
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matt Roper, Chris Wilson, dri-devel

From: Fei Yang <fei.yang@intel.com>

Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.

From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Haivng the abstracted i915_cache_level sitting in between would only cause
more ambiguity.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For these simple cases, using cache_level
would help simplify the code.

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_dpt.c      | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 27 ++----
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 39 ++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 18 ++--
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 76 ++++++++---------
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 82 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 20 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c       | 47 ++++++-----
 drivers/gpu/drm/i915/gt/intel_migrate.h       | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c    | 47 ++++++-----
 drivers/gpu/drm/i915/gt/selftest_reset.c      |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c        |  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c           | 55 ++++++++++---
 drivers/gpu/drm/i915/i915_gem.c               | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  8 +-
 drivers/gpu/drm/i915/i915_vma.c               | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h               |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |  2 -
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  8 +-
 36 files changed, 361 insertions(+), 241 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index c5eacfdba1a5..7c5fddb203ba 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void dpt_insert_page(struct i915_address_space *vm,
 			    dma_addr_t addr,
 			    u64 offset,
-			    enum i915_cache_level level,
+			    unsigned int pat_index,
 			    u32 flags)
 {
 	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
 	gen8_pte_t __iomem *base = dpt->iomem;
 
 	gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-		     vm->pte_encode(addr, level, flags));
+		     vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
 			       struct i915_vma_resource *vma_res,
-			       enum i915_cache_level level,
+			       unsigned int pat_index,
 			       u32 flags)
 {
 	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
 	gen8_pte_t __iomem *base = dpt->iomem;
-	const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
 	struct sgt_iter sgt_iter;
 	dma_addr_t addr;
 	int i;
@@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
 static void dpt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
 			 struct i915_vma_resource *vma_res,
-			 enum i915_cache_level cache_level,
+			 unsigned int pat_index,
 			 u32 flags)
 {
 	u32 pte_flags;
@@ -98,7 +98,7 @@ static void dpt_bind_vma(struct i915_address_space *vm,
 	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 
 	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 33b73bea1e08..84e0a96f6c71 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -27,8 +27,8 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 	if (IS_DGFX(i915))
 		return false;
 
-	return !(obj->cache_level == I915_CACHE_NONE ||
-		 obj->cache_level == I915_CACHE_WT);
+	return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
+		 i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
 }
 
 bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
@@ -265,7 +265,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 {
 	int ret;
 
-	if (obj->cache_level == cache_level)
+	if (i915_gem_object_has_cache_level(obj, cache_level))
 		return 0;
 
 	ret = i915_gem_object_wait(obj,
@@ -276,10 +276,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		return ret;
 
 	/* Always invalidate stale cachelines */
-	if (obj->cache_level != cache_level) {
-		i915_gem_object_set_cache_coherency(obj, cache_level);
-		obj->cache_dirty = true;
-	}
+	i915_gem_object_set_cache_coherency(obj, cache_level);
+	obj->cache_dirty = true;
 
 	/* The cache-level will be applied when each vma is rebound. */
 	return i915_gem_object_unbind(obj,
@@ -304,20 +302,13 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
 		goto out;
 	}
 
-	switch (obj->cache_level) {
-	case I915_CACHE_LLC:
-	case I915_CACHE_L3_LLC:
+	if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC) ||
+	    i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
 		args->caching = I915_CACHING_CACHED;
-		break;
-
-	case I915_CACHE_WT:
+	else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
 		args->caching = I915_CACHING_DISPLAY;
-		break;
-
-	default:
+	else
 		args->caching = I915_CACHING_NONE;
-		break;
-	}
 out:
 	rcu_read_unlock();
 	return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 3aeede6aee4d..d42915516636 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -642,7 +642,7 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
 
 	return (cache->has_llc ||
 		obj->cache_dirty ||
-		obj->cache_level != I915_CACHE_NONE);
+		!i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));
 }
 
 static int eb_reserve_vma(struct i915_execbuffer *eb,
@@ -1323,8 +1323,10 @@ static void *reloc_iomap(struct i915_vma *batch,
 	offset = cache->node.start;
 	if (drm_mm_node_allocated(&cache->node)) {
 		ggtt->vm.insert_page(&ggtt->vm,
-				     i915_gem_object_get_dma_address(obj, page),
-				     offset, I915_CACHE_NONE, 0);
+			i915_gem_object_get_dma_address(obj, page),
+			offset,
+			i915_gem_get_pat_index(ggtt->vm.i915, I915_CACHE_NONE),
+			0);
 	} else {
 		offset += page << PAGE_SHIFT;
 	}
@@ -1464,7 +1466,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 			reloc_cache_unmap(&eb->reloc_cache);
 			mutex_lock(&vma->vm->mutex);
 			err = i915_vma_bind(target->vma,
-					    target->vma->obj->cache_level,
+					    target->vma->obj->pat_index,
 					    PIN_GLOBAL, NULL, NULL);
 			mutex_unlock(&vma->vm->mutex);
 			reloc_cache_remap(&eb->reloc_cache, ev->vma->obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index d3c1dee16af2..6c242f9ffc75 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -383,7 +383,8 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
 	}
 
 	/* Access to snoopable pages through the GTT is incoherent. */
-	if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(i915)) {
+	if (!(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
+	      HAS_LLC(i915))) {
 		ret = -EFAULT;
 		goto err_unpin;
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 1295bb812866..2894ed9156c7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -54,6 +54,12 @@ unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
 	return INTEL_INFO(i915)->cachelevel_to_pat[level];
 }
 
+bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
+				     enum i915_cache_level lvl)
+{
+	return obj->pat_index == i915_gem_get_pat_index(obj_to_i915(obj), lvl);
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
 	struct drm_i915_gem_object *obj;
@@ -133,7 +139,7 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
-	obj->cache_level = cache_level;
+	obj->pat_index = i915_gem_get_pat_index(i915, cache_level);
 
 	if (cache_level != I915_CACHE_NONE)
 		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
@@ -148,6 +154,37 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 		!IS_DGFX(i915);
 }
 
+/**
+ * i915_gem_object_set_pat_index - set PAT index to be used in PTE encode
+ * @obj: #drm_i915_gem_object
+ * @pat_index: PAT index
+ *
+ * This is a clone of i915_gem_object_set_cache_coherency taking pat index
+ * instead of cache_level as its second argument.
+ */
+void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
+				   unsigned int pat_index)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+	if (obj->pat_index == pat_index)
+		return;
+
+	obj->pat_index = pat_index;
+
+	if (pat_index != i915_gem_get_pat_index(i915, I915_CACHE_NONE))
+		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
+				       I915_BO_CACHE_COHERENT_FOR_WRITE);
+	else if (HAS_LLC(i915))
+		obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
+	else
+		obj->cache_coherent = 0;
+
+	obj->cache_dirty =
+		!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&
+		!IS_DGFX(i915);
+}
+
 bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 4c92e17b4337..6f00aab10015 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -34,6 +34,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 
 unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
 				    enum i915_cache_level level);
+bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
+				     enum i915_cache_level lvl);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
@@ -764,6 +766,8 @@ bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj);
 
 void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 					 unsigned int cache_level);
+void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
+				   unsigned int pat_index);
 bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 890f3ad497c5..9c70dedf25cc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -351,12 +351,20 @@ struct drm_i915_gem_object {
 #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
 #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */
 	/**
-	 * @cache_level: The desired GTT caching level.
-	 *
-	 * See enum i915_cache_level for possible values, along with what
-	 * each does.
+	 * @pat_index: The desired PAT index.
+	 *
+	 * See hardware specification for valid PAT indices for each platform.
+	 * This field used to contain a value of enum i915_cache_level. It's
+	 * changed to an unsigned int because PAT indices are being used by
+	 * both UMD and KMD for caching policy control after GEN12.
+	 * For backward compatibility, this field will continue to contain
+	 * value of i915_cache_level for pre-GEN12 platforms so that the PTE
+	 * encode functions for these legacy platforms can stay the same.
+	 * In the meantime platform specific tables are created to translate
+	 * i915_cache_level into pat index, for more details check the macros
+	 * defined i915/i915_pci.c, e.g. PVC_CACHELEVEL.
 	 */
-	unsigned int cache_level:3;
+	unsigned int pat_index:6;
 	/**
 	 * @cache_coherent:
 	 *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 8ac376c24aa2..9f379141f966 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -557,7 +557,9 @@ static void dbg_poison(struct i915_ggtt *ggtt,
 
 		ggtt->vm.insert_page(&ggtt->vm, addr,
 				     ggtt->error_capture.start,
-				     I915_CACHE_NONE, 0);
+				     i915_gem_get_pat_index(ggtt->vm.i915,
+							    I915_CACHE_NONE),
+				     0);
 		mb();
 
 		s = io_mapping_map_wc(&ggtt->iomap,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index d030182ca176..7eadb7d68d47 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -214,7 +214,8 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,
 
 		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
 		ret = intel_context_migrate_clear(to_gt(i915)->migrate.context, deps,
-						  dst_st->sgl, dst_level,
+						  dst_st->sgl,
+						  i915_gem_get_pat_index(i915, dst_level),
 						  i915_ttm_gtt_binds_lmem(dst_mem),
 						  0, &rq);
 	} else {
@@ -227,12 +228,13 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,
 		src_level = i915_ttm_cache_level(i915, bo->resource, src_ttm);
 		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
 		ret = intel_context_migrate_copy(to_gt(i915)->migrate.context,
-						 deps, src_rsgt->table.sgl,
-						 src_level,
-						 i915_ttm_gtt_binds_lmem(bo->resource),
-						 dst_st->sgl, dst_level,
-						 i915_ttm_gtt_binds_lmem(dst_mem),
-						 &rq);
+					deps, src_rsgt->table.sgl,
+					i915_gem_get_pat_index(i915, src_level),
+					i915_ttm_gtt_binds_lmem(bo->resource),
+					dst_st->sgl,
+					i915_gem_get_pat_index(i915, dst_level),
+					i915_ttm_gtt_binds_lmem(dst_mem),
+					&rq);
 
 		i915_refct_sgt_put(src_rsgt);
 	}
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index defece0bcb81..ebb68ac9cd5e 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -354,7 +354,7 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)
 
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = I915_CACHE_NONE;
+	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
 
 	return obj;
 }
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
index fe6c37fd7859..a93a90b15907 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
@@ -219,7 +219,7 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
 			continue;
 
 		err = intel_migrate_clear(&gt->migrate, &ww, deps,
-					  obj->mm.pages->sgl, obj->cache_level,
+					  obj->mm.pages->sgl, obj->pat_index,
 					  i915_gem_object_is_lmem(obj),
 					  0xdeadbeaf, &rq);
 		if (rq) {
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 56279908ed30..a93d8f9f8bc1 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -1222,7 +1222,7 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
 	}
 
 	err = intel_context_migrate_clear(to_gt(i915)->migrate.context, NULL,
-					  obj->mm.pages->sgl, obj->cache_level,
+					  obj->mm.pages->sgl, obj->pat_index,
 					  i915_gem_object_is_lmem(obj),
 					  expand32(POISON_INUSE), &rq);
 	i915_gem_object_unpin_pages(obj);
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 5aaacc53fa4c..c2bdc133c89a 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -109,7 +109,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct i915_vma_resource *vma_res,
-				      enum i915_cache_level cache_level,
+				      unsigned int pat_index,
 				      u32 flags)
 {
 	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
@@ -117,7 +117,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
 	unsigned int act_pt = first_entry / GEN6_PTES;
 	unsigned int act_pte = first_entry % GEN6_PTES;
-	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
+	const u32 pte_encode = vm->pte_encode(0, pat_index, flags);
 	struct sgt_dma iter = sgt_dma(vma_res);
 	gen6_pte_t *vaddr;
 
@@ -227,7 +227,9 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 
 	vm->scratch[0]->encode =
 		vm->pte_encode(px_dma(vm->scratch[0]),
-			       I915_CACHE_NONE, PTE_READ_ONLY);
+			       i915_gem_get_pat_index(vm->i915,
+						      I915_CACHE_NONE),
+			       PTE_READ_ONLY);
 
 	vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
 	if (IS_ERR(vm->scratch[1])) {
@@ -278,7 +280,7 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 static void pd_vma_bind(struct i915_address_space *vm,
 			struct i915_vm_pt_stash *stash,
 			struct i915_vma_resource *vma_res,
-			enum i915_cache_level cache_level,
+			unsigned int pat_index,
 			u32 unused)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 3ae41a13d28d..f76ec2cb29ef 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -15,6 +15,11 @@
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
+/**
+ * For pre-gen12 platforms pat_index is the same as enum i915_cache_level,
+ * so the code here is still valid. See translation table defined by
+ * LEGACY_CACHELEVEL
+ */
 static u64 gen8_pde_encode(const dma_addr_t addr,
 			   const enum i915_cache_level level)
 {
@@ -56,7 +61,7 @@ static u64 gen8_pte_encode(dma_addr_t addr,
 }
 
 static u64 mtl_pte_encode(dma_addr_t addr,
-			  enum i915_cache_level level,
+			  unsigned int pat_index,
 			  u32 flags)
 {
 	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
@@ -67,24 +72,17 @@ static u64 mtl_pte_encode(dma_addr_t addr,
 	if (flags & PTE_LM)
 		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
 
-	switch (level) {
-	case I915_CACHE_NONE:
-		pte |= GEN12_PPGTT_PTE_PAT1;
-		break;
-	case I915_CACHE_LLC:
-	case I915_CACHE_L3_LLC:
-		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
-		break;
-	case I915_CACHE_WT:
+	if (pat_index & BIT(0))
 		pte |= GEN12_PPGTT_PTE_PAT0;
-		break;
-	default:
-		/* This should never happen. Added to deal with the compile
-		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
-		 * be removed by the pat_index patch.
-		 */
-		break;
-	}
+
+	if (pat_index & BIT(1))
+		pte |= GEN12_PPGTT_PTE_PAT1;
+
+	if (pat_index & BIT(2))
+		pte |= GEN12_PPGTT_PTE_PAT2;
+
+	if (pat_index & BIT(3))
+		pte |= GEN12_PPGTT_PTE_PAT3;
 
 	return pte;
 }
@@ -457,11 +455,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      struct i915_page_directory *pdp,
 		      struct sgt_dma *iter,
 		      u64 idx,
-		      enum i915_cache_level cache_level,
+		      unsigned int pat_index,
 		      u32 flags)
 {
 	struct i915_page_directory *pd;
-	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, pat_index, flags);
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -504,10 +502,10 @@ static void
 xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
 			  struct i915_vma_resource *vma_res,
 			  struct sgt_dma *iter,
-			  enum i915_cache_level cache_level,
+			  unsigned int pat_index,
 			  u32 flags)
 {
-	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma_res->start;
 	u64 end = start + vma_res->vma_size;
@@ -611,10 +609,10 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
 static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
 				   struct i915_vma_resource *vma_res,
 				   struct sgt_dma *iter,
-				   enum i915_cache_level cache_level,
+				   unsigned int pat_index,
 				   u32 flags)
 {
-	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma_res->start;
 
@@ -734,7 +732,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
 
 static void gen8_ppgtt_insert(struct i915_address_space *vm,
 			      struct i915_vma_resource *vma_res,
-			      enum i915_cache_level cache_level,
+			      unsigned int pat_index,
 			      u32 flags)
 {
 	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
@@ -742,9 +740,9 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 
 	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
 		if (HAS_64K_PAGES(vm->i915))
-			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
+			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
 		else
-			gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
+			gen8_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
 	} else  {
 		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
 
@@ -753,7 +751,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 				gen8_pdp_for_page_index(vm, idx);
 
 			idx = gen8_ppgtt_insert_pte(ppgtt, pdp, &iter, idx,
-						    cache_level, flags);
+						    pat_index, flags);
 		} while (idx);
 
 		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
@@ -763,7 +761,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
 				    dma_addr_t addr,
 				    u64 offset,
-				    enum i915_cache_level level,
+				    unsigned int pat_index,
 				    u32 flags)
 {
 	u64 idx = offset >> GEN8_PTE_SHIFT;
@@ -777,14 +775,14 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
 	GEM_BUG_ON(pt->is_compact);
 
 	vaddr = px_vaddr(pt);
-	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
+	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, pat_index, flags);
 	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
 static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
 					    dma_addr_t addr,
 					    u64 offset,
-					    enum i915_cache_level level,
+					    unsigned int pat_index,
 					    u32 flags)
 {
 	u64 idx = offset >> GEN8_PTE_SHIFT;
@@ -807,20 +805,20 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
 	}
 
 	vaddr = px_vaddr(pt);
-	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
+	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, pat_index, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
 				       dma_addr_t addr,
 				       u64 offset,
-				       enum i915_cache_level level,
+				       unsigned int pat_index,
 				       u32 flags)
 {
 	if (flags & PTE_LM)
 		return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
-						       level, flags);
+						       pat_index, flags);
 
-	return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
+	return gen8_ppgtt_insert_entry(vm, addr, offset, pat_index, flags);
 }
 
 static int gen8_init_scratch(struct i915_address_space *vm)
@@ -855,7 +853,9 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 
 	vm->scratch[0]->encode =
 		vm->pte_encode(px_dma(vm->scratch[0]),
-				I915_CACHE_NONE, pte_flags);
+			       i915_gem_get_pat_index(vm->i915,
+						      I915_CACHE_NONE),
+			       pte_flags);
 
 	for (i = 1; i <= vm->top; i++) {
 		struct drm_i915_gem_object *obj;
@@ -873,7 +873,9 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		}
 
 		fill_px(obj, vm->scratch[i - 1]->encode);
-		obj->encode = gen8_pde_encode(px_dma(obj), I915_CACHE_NONE);
+		obj->encode = gen8_pde_encode(px_dma(obj),
+					      i915_gem_get_pat_index(vm->i915,
+							I915_CACHE_NONE));
 
 		vm->scratch[i] = obj;
 	}
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
index 6b8ce7f4d25a..98e260e1a081 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
@@ -10,13 +10,12 @@
 
 struct i915_address_space;
 struct intel_gt;
-enum i915_cache_level;
 
 struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 				     unsigned long lmem_pt_obj_flags);
 
 u64 gen8_ggtt_pte_encode(dma_addr_t addr,
-			 enum i915_cache_level level,
+			 unsigned int pat_index,
 			 u32 flags);
 u64 mtl_ggtt_pte_encode(dma_addr_t addr,
 			unsigned int pat_index,
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 91056b9a60a9..66a4955f19e4 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -221,7 +221,7 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
 }
 
 u64 mtl_ggtt_pte_encode(dma_addr_t addr,
-			enum i915_cache_level level,
+			unsigned int pat_index,
 			u32 flags)
 {
 	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
@@ -231,30 +231,17 @@ u64 mtl_ggtt_pte_encode(dma_addr_t addr,
 	if (flags & PTE_LM)
 		pte |= GEN12_GGTT_PTE_LM;
 
-	switch (level) {
-	case I915_CACHE_NONE:
-		pte |= MTL_GGTT_PTE_PAT1;
-		break;
-	case I915_CACHE_LLC:
-	case I915_CACHE_L3_LLC:
-		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
-		break;
-	case I915_CACHE_WT:
+	if (pat_index & BIT(0))
 		pte |= MTL_GGTT_PTE_PAT0;
-		break;
-	default:
-		/* This should never happen. Added to deal with the compile
-		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
-		 * be removed by the pat_index patch.
-		 */
-		break;
-	}
+
+	if (pat_index & BIT(1))
+		pte |= MTL_GGTT_PTE_PAT1;
 
 	return pte;
 }
 
 u64 gen8_ggtt_pte_encode(dma_addr_t addr,
-			 enum i915_cache_level level,
+			 unsigned int pat_index,
 			 u32 flags)
 {
 	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
@@ -273,25 +260,25 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 				  dma_addr_t addr,
 				  u64 offset,
-				  enum i915_cache_level level,
+				  unsigned int pat_index,
 				  u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen8_pte_t __iomem *pte =
 		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
+	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, pat_index, flags));
 
 	ggtt->invalidate(ggtt);
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct i915_vma_resource *vma_res,
-				     enum i915_cache_level level,
+				     unsigned int pat_index,
 				     u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
-	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
+	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, pat_index, flags);
 	gen8_pte_t __iomem *gte;
 	gen8_pte_t __iomem *end;
 	struct sgt_iter iter;
@@ -348,14 +335,14 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 static void gen6_ggtt_insert_page(struct i915_address_space *vm,
 				  dma_addr_t addr,
 				  u64 offset,
-				  enum i915_cache_level level,
+				  unsigned int pat_index,
 				  u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen6_pte_t __iomem *pte =
 		(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	iowrite32(vm->pte_encode(addr, level, flags), pte);
+	iowrite32(vm->pte_encode(addr, pat_index, flags), pte);
 
 	ggtt->invalidate(ggtt);
 }
@@ -368,7 +355,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct i915_vma_resource *vma_res,
-				     enum i915_cache_level level,
+				     unsigned int pat_index,
 				     u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
@@ -385,7 +372,7 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 		iowrite32(vm->scratch[0]->encode, gte++);
 	end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
 	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
-		iowrite32(vm->pte_encode(addr, level, flags), gte++);
+		iowrite32(vm->pte_encode(addr, pat_index, flags), gte++);
 	GEM_BUG_ON(gte > end);
 
 	/* Fill the allocated but "unused" space beyond the end of the buffer */
@@ -420,14 +407,15 @@ struct insert_page {
 	struct i915_address_space *vm;
 	dma_addr_t addr;
 	u64 offset;
-	enum i915_cache_level level;
+	unsigned int pat_index;
 };
 
 static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
 {
 	struct insert_page *arg = _arg;
 
-	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
+	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset,
+			      arg->pat_index, 0);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
@@ -436,10 +424,10 @@ static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
 static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 					  dma_addr_t addr,
 					  u64 offset,
-					  enum i915_cache_level level,
+					  unsigned int pat_index,
 					  u32 unused)
 {
-	struct insert_page arg = { vm, addr, offset, level };
+	struct insert_page arg = { vm, addr, offset, pat_index };
 
 	stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
 }
@@ -447,7 +435,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 struct insert_entries {
 	struct i915_address_space *vm;
 	struct i915_vma_resource *vma_res;
-	enum i915_cache_level level;
+	unsigned int pat_index;
 	u32 flags;
 };
 
@@ -455,7 +443,8 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
 {
 	struct insert_entries *arg = _arg;
 
-	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
+	gen8_ggtt_insert_entries(arg->vm, arg->vma_res,
+				 arg->pat_index, arg->flags);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
@@ -463,10 +452,10 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
 
 static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
 					     struct i915_vma_resource *vma_res,
-					     enum i915_cache_level level,
+					     unsigned int pat_index,
 					     u32 flags)
 {
-	struct insert_entries arg = { vm, vma_res, level, flags };
+	struct insert_entries arg = { vm, vma_res, pat_index, flags };
 
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }
@@ -495,7 +484,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 void intel_ggtt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
 			 struct i915_vma_resource *vma_res,
-			 enum i915_cache_level cache_level,
+			 unsigned int pat_index,
 			 u32 flags)
 {
 	u32 pte_flags;
@@ -512,7 +501,7 @@ void intel_ggtt_bind_vma(struct i915_address_space *vm,
 	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 }
 
@@ -661,7 +650,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 				  struct i915_vm_pt_stash *stash,
 				  struct i915_vma_resource *vma_res,
-				  enum i915_cache_level cache_level,
+				  unsigned int pat_index,
 				  u32 flags)
 {
 	u32 pte_flags;
@@ -673,10 +662,10 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 
 	if (flags & I915_VMA_LOCAL_BIND)
 		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
-			       stash, vma_res, cache_level, flags);
+			       stash, vma_res, pat_index, flags);
 
 	if (flags & I915_VMA_GLOBAL_BIND)
-		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+		vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 
 	vma_res->bound_flags |= flags;
 }
@@ -933,7 +922,9 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 
 	ggtt->vm.scratch[0]->encode =
 		ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
-				    I915_CACHE_NONE, pte_flags);
+				    i915_gem_get_pat_index(i915,
+							   I915_CACHE_NONE),
+				    pte_flags);
 
 	return 0;
 }
@@ -1022,6 +1013,11 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	return ggtt_probe_common(ggtt, size);
 }
 
+/*
+ * For pre-gen8 platforms pat_index is the same as enum i915_cache_level,
+ * so these PTE encode functions are left with using cache_level.
+ * See translation table LEGACY_CACHELEVEL.
+ */
 static u64 snb_pte_encode(dma_addr_t addr,
 			  enum i915_cache_level level,
 			  u32 flags)
@@ -1302,7 +1298,9 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
 		 */
 		vma->resource->bound_flags = 0;
 		vma->ops->bind_vma(vm, NULL, vma->resource,
-				   obj ? obj->cache_level : 0,
+				   obj ? obj->pat_index :
+					 i915_gem_get_pat_index(vm->i915,
+							I915_CACHE_NONE),
 				   was_bound);
 
 		if (obj) { /* only used during resume => exclusive access */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index b632167eaf2e..12bd4398ad38 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -165,8 +165,6 @@ typedef u64 gen8_pte_t;
 #define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
 #define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
 
-enum i915_cache_level;
-
 struct drm_i915_gem_object;
 struct i915_fence_reg;
 struct i915_vma;
@@ -234,7 +232,7 @@ struct i915_vma_ops {
 	void (*bind_vma)(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
 			 struct i915_vma_resource *vma_res,
-			 enum i915_cache_level cache_level,
+			 unsigned int pat_index,
 			 u32 flags);
 	/*
 	 * Unmap an object from an address space. This usually consists of
@@ -306,7 +304,7 @@ struct i915_address_space {
 		(*alloc_scratch_dma)(struct i915_address_space *vm, int sz);
 
 	u64 (*pte_encode)(dma_addr_t addr,
-			  enum i915_cache_level level,
+			  unsigned int pat_index,
 			  u32 flags); /* Create a valid PTE */
 #define PTE_READ_ONLY	BIT(0)
 #define PTE_LM		BIT(1)
@@ -321,20 +319,20 @@ struct i915_address_space {
 	void (*insert_page)(struct i915_address_space *vm,
 			    dma_addr_t addr,
 			    u64 offset,
-			    enum i915_cache_level cache_level,
+			    unsigned int pat_index,
 			    u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct i915_vma_resource *vma_res,
-			       enum i915_cache_level cache_level,
+			       unsigned int pat_index,
 			       u32 flags);
 	void (*raw_insert_page)(struct i915_address_space *vm,
 				dma_addr_t addr,
 				u64 offset,
-				enum i915_cache_level cache_level,
+				unsigned int pat_index,
 				u32 flags);
 	void (*raw_insert_entries)(struct i915_address_space *vm,
 				   struct i915_vma_resource *vma_res,
-				   enum i915_cache_level cache_level,
+				   unsigned int pat_index,
 				   u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
 
@@ -581,7 +579,7 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
 void intel_ggtt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
 			 struct i915_vma_resource *vma_res,
-			 enum i915_cache_level cache_level,
+			 unsigned int pat_index,
 			 u32 flags);
 void intel_ggtt_unbind_vma(struct i915_address_space *vm,
 			   struct i915_vma_resource *vma_res);
@@ -639,7 +637,7 @@ void
 __set_pd_entry(struct i915_page_directory * const pd,
 	       const unsigned short idx,
 	       struct i915_page_table *pt,
-	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level));
+	       u64 (*encode)(const dma_addr_t, const unsigned int pat_index));
 
 #define set_pd_entry(pd, idx, to) \
 	__set_pd_entry((pd), (idx), px_pt(to), gen8_pde_encode)
@@ -659,7 +657,7 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
 		    struct i915_vma_resource *vma_res,
-		    enum i915_cache_level cache_level,
+		    unsigned int pat_index,
 		    u32 flags);
 void ppgtt_unbind_vma(struct i915_address_space *vm,
 		      struct i915_vma_resource *vma_res);
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 3f638f198796..117c3d05af3e 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -45,7 +45,9 @@ static void xehpsdv_toggle_pdes(struct i915_address_space *vm,
 	 * Insert a dummy PTE into every PT that will map to LMEM to ensure
 	 * we have a correctly setup PDE structure for later use.
 	 */
-	vm->insert_page(vm, 0, d->offset, I915_CACHE_NONE, PTE_LM);
+	vm->insert_page(vm, 0, d->offset,
+			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
+			PTE_LM);
 	GEM_BUG_ON(!pt->is_compact);
 	d->offset += SZ_2M;
 }
@@ -63,7 +65,9 @@ static void xehpsdv_insert_pte(struct i915_address_space *vm,
 	 * alignment is 64K underneath for the pt, and we are careful
 	 * not to access the space in the void.
 	 */
-	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, PTE_LM);
+	vm->insert_page(vm, px_dma(pt), d->offset,
+			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
+			PTE_LM);
 	d->offset += SZ_64K;
 }
 
@@ -73,7 +77,8 @@ static void insert_pte(struct i915_address_space *vm,
 {
 	struct insert_pte_data *d = data;
 
-	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
+	vm->insert_page(vm, px_dma(pt), d->offset,
+			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
 			i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);
 	d->offset += PAGE_SIZE;
 }
@@ -356,13 +361,13 @@ static int max_pte_pkt_size(struct i915_request *rq, int pkt)
 
 static int emit_pte(struct i915_request *rq,
 		    struct sgt_dma *it,
-		    enum i915_cache_level cache_level,
+		    unsigned int pat_index,
 		    bool is_lmem,
 		    u64 offset,
 		    int length)
 {
 	bool has_64K_pages = HAS_64K_PAGES(rq->engine->i915);
-	const u64 encode = rq->context->vm->pte_encode(0, cache_level,
+	const u64 encode = rq->context->vm->pte_encode(0, pat_index,
 						       is_lmem ? PTE_LM : 0);
 	struct intel_ring *ring = rq->ring;
 	int pkt, dword_length;
@@ -673,17 +678,17 @@ int
 intel_context_migrate_copy(struct intel_context *ce,
 			   const struct i915_deps *deps,
 			   struct scatterlist *src,
-			   enum i915_cache_level src_cache_level,
+			   unsigned int src_pat_index,
 			   bool src_is_lmem,
 			   struct scatterlist *dst,
-			   enum i915_cache_level dst_cache_level,
+			   unsigned int dst_pat_index,
 			   bool dst_is_lmem,
 			   struct i915_request **out)
 {
 	struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs;
 	struct drm_i915_private *i915 = ce->engine->i915;
 	u64 ccs_bytes_to_cpy = 0, bytes_to_cpy;
-	enum i915_cache_level ccs_cache_level;
+	unsigned int ccs_pat_index;
 	u32 src_offset, dst_offset;
 	u8 src_access, dst_access;
 	struct i915_request *rq;
@@ -707,12 +712,12 @@ intel_context_migrate_copy(struct intel_context *ce,
 		dst_sz = scatter_list_length(dst);
 		if (src_is_lmem) {
 			it_ccs = it_dst;
-			ccs_cache_level = dst_cache_level;
+			ccs_pat_index = dst_pat_index;
 			ccs_is_src = false;
 		} else if (dst_is_lmem) {
 			bytes_to_cpy = dst_sz;
 			it_ccs = it_src;
-			ccs_cache_level = src_cache_level;
+			ccs_pat_index = src_pat_index;
 			ccs_is_src = true;
 		}
 
@@ -773,7 +778,7 @@ intel_context_migrate_copy(struct intel_context *ce,
 		src_sz = calculate_chunk_sz(i915, src_is_lmem,
 					    bytes_to_cpy, ccs_bytes_to_cpy);
 
-		len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem,
+		len = emit_pte(rq, &it_src, src_pat_index, src_is_lmem,
 			       src_offset, src_sz);
 		if (!len) {
 			err = -EINVAL;
@@ -784,7 +789,7 @@ intel_context_migrate_copy(struct intel_context *ce,
 			goto out_rq;
 		}
 
-		err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem,
+		err = emit_pte(rq, &it_dst, dst_pat_index, dst_is_lmem,
 			       dst_offset, len);
 		if (err < 0)
 			goto out_rq;
@@ -811,7 +816,7 @@ intel_context_migrate_copy(struct intel_context *ce,
 				goto out_rq;
 
 			ccs_sz = GET_CCS_BYTES(i915, len);
-			err = emit_pte(rq, &it_ccs, ccs_cache_level, false,
+			err = emit_pte(rq, &it_ccs, ccs_pat_index, false,
 				       ccs_is_src ? src_offset : dst_offset,
 				       ccs_sz);
 			if (err < 0)
@@ -979,7 +984,7 @@ int
 intel_context_migrate_clear(struct intel_context *ce,
 			    const struct i915_deps *deps,
 			    struct scatterlist *sg,
-			    enum i915_cache_level cache_level,
+			    unsigned int pat_index,
 			    bool is_lmem,
 			    u32 value,
 			    struct i915_request **out)
@@ -1027,7 +1032,7 @@ intel_context_migrate_clear(struct intel_context *ce,
 		if (err)
 			goto out_rq;
 
-		len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ);
+		len = emit_pte(rq, &it, pat_index, is_lmem, offset, CHUNK_SZ);
 		if (len <= 0) {
 			err = len;
 			goto out_rq;
@@ -1074,10 +1079,10 @@ int intel_migrate_copy(struct intel_migrate *m,
 		       struct i915_gem_ww_ctx *ww,
 		       const struct i915_deps *deps,
 		       struct scatterlist *src,
-		       enum i915_cache_level src_cache_level,
+		       unsigned int src_pat_index,
 		       bool src_is_lmem,
 		       struct scatterlist *dst,
-		       enum i915_cache_level dst_cache_level,
+		       unsigned int dst_pat_index,
 		       bool dst_is_lmem,
 		       struct i915_request **out)
 {
@@ -1098,8 +1103,8 @@ int intel_migrate_copy(struct intel_migrate *m,
 		goto out;
 
 	err = intel_context_migrate_copy(ce, deps,
-					 src, src_cache_level, src_is_lmem,
-					 dst, dst_cache_level, dst_is_lmem,
+					 src, src_pat_index, src_is_lmem,
+					 dst, dst_pat_index, dst_is_lmem,
 					 out);
 
 	intel_context_unpin(ce);
@@ -1113,7 +1118,7 @@ intel_migrate_clear(struct intel_migrate *m,
 		    struct i915_gem_ww_ctx *ww,
 		    const struct i915_deps *deps,
 		    struct scatterlist *sg,
-		    enum i915_cache_level cache_level,
+		    unsigned int pat_index,
 		    bool is_lmem,
 		    u32 value,
 		    struct i915_request **out)
@@ -1134,7 +1139,7 @@ intel_migrate_clear(struct intel_migrate *m,
 	if (err)
 		goto out;
 
-	err = intel_context_migrate_clear(ce, deps, sg, cache_level,
+	err = intel_context_migrate_clear(ce, deps, sg, pat_index,
 					  is_lmem, value, out);
 
 	intel_context_unpin(ce);
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h
index ccc677ec4aa3..11fc09a00c4b 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.h
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.h
@@ -16,7 +16,6 @@ struct i915_request;
 struct i915_gem_ww_ctx;
 struct intel_gt;
 struct scatterlist;
-enum i915_cache_level;
 
 int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
 
@@ -26,20 +25,20 @@ int intel_migrate_copy(struct intel_migrate *m,
 		       struct i915_gem_ww_ctx *ww,
 		       const struct i915_deps *deps,
 		       struct scatterlist *src,
-		       enum i915_cache_level src_cache_level,
+		       unsigned int src_pat_index,
 		       bool src_is_lmem,
 		       struct scatterlist *dst,
-		       enum i915_cache_level dst_cache_level,
+		       unsigned int dst_pat_index,
 		       bool dst_is_lmem,
 		       struct i915_request **out);
 
 int intel_context_migrate_copy(struct intel_context *ce,
 			       const struct i915_deps *deps,
 			       struct scatterlist *src,
-			       enum i915_cache_level src_cache_level,
+			       unsigned int src_pat_index,
 			       bool src_is_lmem,
 			       struct scatterlist *dst,
-			       enum i915_cache_level dst_cache_level,
+			       unsigned int dst_pat_index,
 			       bool dst_is_lmem,
 			       struct i915_request **out);
 
@@ -48,7 +47,7 @@ intel_migrate_clear(struct intel_migrate *m,
 		    struct i915_gem_ww_ctx *ww,
 		    const struct i915_deps *deps,
 		    struct scatterlist *sg,
-		    enum i915_cache_level cache_level,
+		    unsigned int pat_index,
 		    bool is_lmem,
 		    u32 value,
 		    struct i915_request **out);
@@ -56,7 +55,7 @@ int
 intel_context_migrate_clear(struct intel_context *ce,
 			    const struct i915_deps *deps,
 			    struct scatterlist *sg,
-			    enum i915_cache_level cache_level,
+			    unsigned int pat_index,
 			    bool is_lmem,
 			    u32 value,
 			    struct i915_request **out);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 7ecfa672f738..f0da3555c6db 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -98,7 +98,7 @@ void
 __set_pd_entry(struct i915_page_directory * const pd,
 	       const unsigned short idx,
 	       struct i915_page_table * const to,
-	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level))
+	       u64 (*encode)(const dma_addr_t, const unsigned int))
 {
 	/* Each thread pre-pins the pd, and we may have a thread per pde. */
 	GEM_BUG_ON(atomic_read(px_used(pd)) > NALLOC * I915_PDES);
@@ -181,7 +181,7 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
 		    struct i915_vma_resource *vma_res,
-		    enum i915_cache_level cache_level,
+		    unsigned int pat_index,
 		    u32 flags)
 {
 	u32 pte_flags;
@@ -199,7 +199,7 @@ void ppgtt_bind_vma(struct i915_address_space *vm,
 	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 	wmb();
 }
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
index e677f2da093d..3def5ca72dec 100644
--- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -137,7 +137,7 @@ static int copy(struct intel_migrate *migrate,
 static int intel_context_copy_ccs(struct intel_context *ce,
 				  const struct i915_deps *deps,
 				  struct scatterlist *sg,
-				  enum i915_cache_level cache_level,
+				  unsigned int pat_index,
 				  bool write_to_ccs,
 				  struct i915_request **out)
 {
@@ -185,7 +185,7 @@ static int intel_context_copy_ccs(struct intel_context *ce,
 		if (err)
 			goto out_rq;
 
-		len = emit_pte(rq, &it, cache_level, true, offset, CHUNK_SZ);
+		len = emit_pte(rq, &it, pat_index, true, offset, CHUNK_SZ);
 		if (len <= 0) {
 			err = len;
 			goto out_rq;
@@ -223,7 +223,7 @@ intel_migrate_ccs_copy(struct intel_migrate *m,
 		       struct i915_gem_ww_ctx *ww,
 		       const struct i915_deps *deps,
 		       struct scatterlist *sg,
-		       enum i915_cache_level cache_level,
+		       unsigned int pat_index,
 		       bool write_to_ccs,
 		       struct i915_request **out)
 {
@@ -243,7 +243,7 @@ intel_migrate_ccs_copy(struct intel_migrate *m,
 	if (err)
 		goto out;
 
-	err = intel_context_copy_ccs(ce, deps, sg, cache_level,
+	err = intel_context_copy_ccs(ce, deps, sg, pat_index,
 				     write_to_ccs, out);
 
 	intel_context_unpin(ce);
@@ -300,7 +300,7 @@ static int clear(struct intel_migrate *migrate,
 			/* Write the obj data into ccs surface */
 			err = intel_migrate_ccs_copy(migrate, &ww, NULL,
 						     obj->mm.pages->sgl,
-						     obj->cache_level,
+						     obj->pat_index,
 						     true, &rq);
 			if (rq && !err) {
 				if (i915_request_wait(rq, 0, HZ) < 0) {
@@ -351,7 +351,7 @@ static int clear(struct intel_migrate *migrate,
 
 			err = intel_migrate_ccs_copy(migrate, &ww, NULL,
 						     obj->mm.pages->sgl,
-						     obj->cache_level,
+						     obj->pat_index,
 						     false, &rq);
 			if (rq && !err) {
 				if (i915_request_wait(rq, 0, HZ) < 0) {
@@ -414,9 +414,9 @@ static int __migrate_copy(struct intel_migrate *migrate,
 			  struct i915_request **out)
 {
 	return intel_migrate_copy(migrate, ww, NULL,
-				  src->mm.pages->sgl, src->cache_level,
+				  src->mm.pages->sgl, src->pat_index,
 				  i915_gem_object_is_lmem(src),
-				  dst->mm.pages->sgl, dst->cache_level,
+				  dst->mm.pages->sgl, dst->pat_index,
 				  i915_gem_object_is_lmem(dst),
 				  out);
 }
@@ -428,9 +428,9 @@ static int __global_copy(struct intel_migrate *migrate,
 			 struct i915_request **out)
 {
 	return intel_context_migrate_copy(migrate->context, NULL,
-					  src->mm.pages->sgl, src->cache_level,
+					  src->mm.pages->sgl, src->pat_index,
 					  i915_gem_object_is_lmem(src),
-					  dst->mm.pages->sgl, dst->cache_level,
+					  dst->mm.pages->sgl, dst->pat_index,
 					  i915_gem_object_is_lmem(dst),
 					  out);
 }
@@ -455,7 +455,7 @@ static int __migrate_clear(struct intel_migrate *migrate,
 {
 	return intel_migrate_clear(migrate, ww, NULL,
 				   obj->mm.pages->sgl,
-				   obj->cache_level,
+				   obj->pat_index,
 				   i915_gem_object_is_lmem(obj),
 				   value, out);
 }
@@ -468,7 +468,7 @@ static int __global_clear(struct intel_migrate *migrate,
 {
 	return intel_context_migrate_clear(migrate->context, NULL,
 					   obj->mm.pages->sgl,
-					   obj->cache_level,
+					   obj->pat_index,
 					   i915_gem_object_is_lmem(obj),
 					   value, out);
 }
@@ -648,7 +648,7 @@ static int live_emit_pte_full_ring(void *arg)
 	 */
 	pr_info("%s emite_pte ring space=%u\n", __func__, rq->ring->space);
 	it = sg_sgt(obj->mm.pages->sgl);
-	len = emit_pte(rq, &it, obj->cache_level, false, 0, CHUNK_SZ);
+	len = emit_pte(rq, &it, obj->pat_index, false, 0, CHUNK_SZ);
 	if (!len) {
 		err = -EINVAL;
 		goto out_rq;
@@ -844,7 +844,7 @@ static int wrap_ktime_compare(const void *A, const void *B)
 
 static int __perf_clear_blt(struct intel_context *ce,
 			    struct scatterlist *sg,
-			    enum i915_cache_level cache_level,
+			    unsigned int pat_index,
 			    bool is_lmem,
 			    size_t sz)
 {
@@ -858,7 +858,7 @@ static int __perf_clear_blt(struct intel_context *ce,
 
 		t0 = ktime_get();
 
-		err = intel_context_migrate_clear(ce, NULL, sg, cache_level,
+		err = intel_context_migrate_clear(ce, NULL, sg, pat_index,
 						  is_lmem, 0, &rq);
 		if (rq) {
 			if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
@@ -904,7 +904,8 @@ static int perf_clear_blt(void *arg)
 
 		err = __perf_clear_blt(gt->migrate.context,
 				       dst->mm.pages->sgl,
-				       I915_CACHE_NONE,
+				       i915_gem_get_pat_index(gt->i915,
+							      I915_CACHE_NONE),
 				       i915_gem_object_is_lmem(dst),
 				       sizes[i]);
 
@@ -919,10 +920,10 @@ static int perf_clear_blt(void *arg)
 
 static int __perf_copy_blt(struct intel_context *ce,
 			   struct scatterlist *src,
-			   enum i915_cache_level src_cache_level,
+			   unsigned int src_pat_index,
 			   bool src_is_lmem,
 			   struct scatterlist *dst,
-			   enum i915_cache_level dst_cache_level,
+			   unsigned int dst_pat_index,
 			   bool dst_is_lmem,
 			   size_t sz)
 {
@@ -937,9 +938,9 @@ static int __perf_copy_blt(struct intel_context *ce,
 		t0 = ktime_get();
 
 		err = intel_context_migrate_copy(ce, NULL,
-						 src, src_cache_level,
+						 src, src_pat_index,
 						 src_is_lmem,
-						 dst, dst_cache_level,
+						 dst, dst_pat_index,
 						 dst_is_lmem,
 						 &rq);
 		if (rq) {
@@ -994,10 +995,12 @@ static int perf_copy_blt(void *arg)
 
 		err = __perf_copy_blt(gt->migrate.context,
 				      src->mm.pages->sgl,
-				      I915_CACHE_NONE,
+				      i915_gem_get_pat_index(gt->i915,
+							     I915_CACHE_NONE),
 				      i915_gem_object_is_lmem(src),
 				      dst->mm.pages->sgl,
-				      I915_CACHE_NONE,
+				      i915_gem_get_pat_index(gt->i915,
+							     I915_CACHE_NONE),
 				      i915_gem_object_is_lmem(dst),
 				      sz);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c
index a9e0a91bc0e0..79aa6ac66ad2 100644
--- a/drivers/gpu/drm/i915/gt/selftest_reset.c
+++ b/drivers/gpu/drm/i915/gt/selftest_reset.c
@@ -86,7 +86,9 @@ __igt_reset_stolen(struct intel_gt *gt,
 
 		ggtt->vm.insert_page(&ggtt->vm, dma,
 				     ggtt->error_capture.start,
-				     I915_CACHE_NONE, 0);
+				     i915_gem_get_pat_index(gt->i915,
+							    I915_CACHE_NONE),
+				     0);
 		mb();
 
 		s = io_mapping_map_wc(&ggtt->iomap,
@@ -127,7 +129,9 @@ __igt_reset_stolen(struct intel_gt *gt,
 
 		ggtt->vm.insert_page(&ggtt->vm, dma,
 				     ggtt->error_capture.start,
-				     I915_CACHE_NONE, 0);
+				     i915_gem_get_pat_index(gt->i915,
+							    I915_CACHE_NONE),
+				     0);
 		mb();
 
 		s = io_mapping_map_wc(&ggtt->iomap,
diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index 9f536c251179..39c3ec12df1a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -836,7 +836,7 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt,
 		return PTR_ERR(obj);
 
 	/* keep the same cache settings as timeline */
-	i915_gem_object_set_cache_coherency(obj, tl->hwsp_ggtt->obj->cache_level);
+	i915_gem_object_set_pat_index(obj, tl->hwsp_ggtt->obj->pat_index);
 	w->map = i915_gem_object_pin_map_unlocked(obj,
 						  page_unmask_bits(tl->hwsp_ggtt->obj->mm.mapping));
 	if (IS_ERR(w->map)) {
diff --git a/drivers/gpu/drm/i915/gt/selftest_tlb.c b/drivers/gpu/drm/i915/gt/selftest_tlb.c
index e6cac1f15d6e..4493c8518e91 100644
--- a/drivers/gpu/drm/i915/gt/selftest_tlb.c
+++ b/drivers/gpu/drm/i915/gt/selftest_tlb.c
@@ -36,6 +36,8 @@ pte_tlbinv(struct intel_context *ce,
 	   u64 length,
 	   struct rnd_state *prng)
 {
+	const unsigned int pat_index =
+		i915_gem_get_pat_index(ce->vm->i915, I915_CACHE_NONE);
 	struct drm_i915_gem_object *batch;
 	struct drm_mm_node vb_node;
 	struct i915_request *rq;
@@ -155,7 +157,7 @@ pte_tlbinv(struct intel_context *ce,
 		/* Flip the PTE between A and B */
 		if (i915_gem_object_is_lmem(vb->obj))
 			pte_flags |= PTE_LM;
-		ce->vm->insert_entries(ce->vm, &vb_res, 0, pte_flags);
+		ce->vm->insert_entries(ce->vm, &vb_res, pat_index, pte_flags);
 
 		/* Flush the PTE update to concurrent HW */
 		tlbinv(ce->vm, addr & -length, length);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 264c952f777b..31182915f3d2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -876,9 +876,15 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
 		pte_flags |= PTE_LM;
 
 	if (ggtt->vm.raw_insert_entries)
-		ggtt->vm.raw_insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
+		ggtt->vm.raw_insert_entries(&ggtt->vm, dummy,
+					    i915_gem_get_pat_index(ggtt->vm.i915,
+								   I915_CACHE_NONE),
+					    pte_flags);
 	else
-		ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
+		ggtt->vm.insert_entries(&ggtt->vm, dummy,
+					i915_gem_get_pat_index(ggtt->vm.i915,
+							       I915_CACHE_NONE),
+					pte_flags);
 }
 
 static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 80c2bf98e341..1c407d59ff3d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -138,21 +138,56 @@ static const char *stringify_vma_type(const struct i915_vma *vma)
 	return "ppgtt";
 }
 
-static const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
-{
-	switch (type) {
-	case I915_CACHE_NONE: return " uncached";
-	case I915_CACHE_LLC: return HAS_LLC(i915) ? " LLC" : " snooped";
-	case I915_CACHE_L3_LLC: return " L3+LLC";
-	case I915_CACHE_WT: return " WT";
-	default: return "";
+static const char *i915_cache_level_str(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = obj_to_i915(obj);
+
+	if (IS_METEORLAKE(i915)) {
+		switch (obj->pat_index) {
+		case 0: return " WB";
+		case 1: return " WT";
+		case 2: return " UC";
+		case 3: return " WB (1-Way Coh)";
+		case 4: return " WB (2-Way Coh)";
+		default: return " not defined";
+		}
+	} else if (IS_PONTEVECCHIO(i915)) {
+		switch (obj->pat_index) {
+		case 0: return " UC";
+		case 1: return " WC";
+		case 2: return " WT";
+		case 3: return " WB";
+		case 4: return " WT (CLOS1)";
+		case 5: return " WB (CLOS1)";
+		case 6: return " WT (CLOS2)";
+		case 7: return " WT (CLOS2)";
+		default: return " not defined";
+		}
+	} else if (GRAPHICS_VER(i915) >= 12) {
+		switch (obj->pat_index) {
+		case 0: return " WB";
+		case 1: return " WC";
+		case 2: return " WT";
+		case 3: return " UC";
+		default: return " not defined";
+		}
+	} else {
+		if (i915_gem_object_has_cache_level(obj, I915_CACHE_NONE))
+			return " uncached";
+		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC))
+			return HAS_LLC(i915) ? " LLC" : " snooped";
+		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
+			return " L3+LLC";
+		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
+			return " WT";
+		else
+			return " not defined";
 	}
 }
 
 void
 i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
-	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 	struct i915_vma *vma;
 	int pin_count = 0;
 
@@ -164,7 +199,7 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		   obj->base.size / 1024,
 		   obj->read_domains,
 		   obj->write_domain,
-		   i915_cache_level_str(dev_priv, obj->cache_level),
+		   i915_cache_level_str(obj),
 		   obj->mm.dirty ? " dirty" : "",
 		   obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");
 	if (obj->base.name)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2ba922fbbd5f..fbeddf81e729 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -420,8 +420,12 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
 		page_length = remain < page_length ? remain : page_length;
 		if (drm_mm_node_allocated(&node)) {
 			ggtt->vm.insert_page(&ggtt->vm,
-					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
-					     node.start, I915_CACHE_NONE, 0);
+					i915_gem_object_get_dma_address(obj,
+							offset >> PAGE_SHIFT),
+					node.start,
+					i915_gem_get_pat_index(i915,
+							       I915_CACHE_NONE),
+					0);
 		} else {
 			page_base += offset & PAGE_MASK;
 		}
@@ -598,8 +602,12 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 			/* flush the write before we modify the GGTT */
 			intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 			ggtt->vm.insert_page(&ggtt->vm,
-					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
-					     node.start, I915_CACHE_NONE, 0);
+					i915_gem_object_get_dma_address(obj,
+							offset >> PAGE_SHIFT),
+					node.start,
+					i915_gem_get_pat_index(i915,
+							       I915_CACHE_NONE),
+					0);
 			wmb(); /* flush modifications to the GGTT (insert_page) */
 		} else {
 			page_base += offset & PAGE_MASK;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index f020c0086fbc..54f17ba3b03c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1117,10 +1117,14 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 			mutex_lock(&ggtt->error_mutex);
 			if (ggtt->vm.raw_insert_page)
 				ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot,
-							 I915_CACHE_NONE, 0);
+						i915_gem_get_pat_index(gt->i915,
+							I915_CACHE_NONE),
+						0);
 			else
 				ggtt->vm.insert_page(&ggtt->vm, dma, slot,
-						     I915_CACHE_NONE, 0);
+						i915_gem_get_pat_index(gt->i915,
+							I915_CACHE_NONE),
+						0);
 			mb();
 
 			s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f51fd9fd4c89..e5f5368b175f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -315,7 +315,7 @@ struct i915_vma_work {
 	struct i915_vma_resource *vma_res;
 	struct drm_i915_gem_object *obj;
 	struct i915_sw_dma_fence_cb cb;
-	enum i915_cache_level cache_level;
+	unsigned int pat_index;
 	unsigned int flags;
 };
 
@@ -334,7 +334,7 @@ static void __vma_bind(struct dma_fence_work *work)
 		return;
 
 	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
-			       vma_res, vw->cache_level, vw->flags);
+			       vma_res, vw->pat_index, vw->flags);
 }
 
 static void __vma_release(struct dma_fence_work *work)
@@ -426,7 +426,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
 /**
  * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
  * @vma: VMA to map
- * @cache_level: mapping cache level
+ * @pat_index: PAT index to set in PTE
  * @flags: flags like global or local mapping
  * @work: preallocated worker for allocating and binding the PTE
  * @vma_res: pointer to a preallocated vma resource. The resource is either
@@ -437,7 +437,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
  * Note that DMA addresses are also the only part of the SG table we care about.
  */
 int i915_vma_bind(struct i915_vma *vma,
-		  enum i915_cache_level cache_level,
+		  unsigned int pat_index,
 		  u32 flags,
 		  struct i915_vma_work *work,
 		  struct i915_vma_resource *vma_res)
@@ -507,7 +507,7 @@ int i915_vma_bind(struct i915_vma *vma,
 		struct dma_fence *prev;
 
 		work->vma_res = i915_vma_resource_get(vma->resource);
-		work->cache_level = cache_level;
+		work->pat_index = pat_index;
 		work->flags = bind_flags;
 
 		/*
@@ -537,7 +537,7 @@ int i915_vma_bind(struct i915_vma *vma,
 
 			return ret;
 		}
-		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
+		vma->ops->bind_vma(vma->vm, NULL, vma->resource, pat_index,
 				   bind_flags);
 	}
 
@@ -813,7 +813,7 @@ i915_vma_insert(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	color = 0;
 
 	if (i915_vm_has_cache_coloring(vma->vm))
-		color = vma->obj->cache_level;
+		color = vma->obj->pat_index;
 
 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
@@ -1517,7 +1517,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 
 	GEM_BUG_ON(!vma->pages);
 	err = i915_vma_bind(vma,
-			    vma->obj->cache_level,
+			    vma->obj->pat_index,
 			    flags, work, vma_res);
 	vma_res = NULL;
 	if (err)
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index ed5c9d682a1b..31a8f8aa5558 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -250,7 +250,7 @@ i915_vma_compare(struct i915_vma *vma,
 
 struct i915_vma_work *i915_vma_work(void);
 int i915_vma_bind(struct i915_vma *vma,
-		  enum i915_cache_level cache_level,
+		  unsigned int pat_index,
 		  u32 flags,
 		  struct i915_vma_work *work,
 		  struct i915_vma_resource *vma_res);
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 77fda2244d16..64472b7f0e77 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -32,8 +32,6 @@
 
 #include "gem/i915_gem_object_types.h"
 
-enum i915_cache_level;
-
 /**
  * DOC: Global GTT views
  *
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index d91d0ade8abd..bde981a8f23f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -57,7 +57,10 @@ static void trash_stolen(struct drm_i915_private *i915)
 		u32 __iomem *s;
 		int x;
 
-		ggtt->vm.insert_page(&ggtt->vm, dma, slot, I915_CACHE_NONE, 0);
+		ggtt->vm.insert_page(&ggtt->vm, dma, slot,
+				     i915_gem_get_pat_index(i915,
+							I915_CACHE_NONE),
+				     0);
 
 		s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);
 		for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 37068542aafe..f13a4d265814 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -245,7 +245,7 @@ static int igt_evict_for_cache_color(void *arg)
 	struct drm_mm_node target = {
 		.start = I915_GTT_PAGE_SIZE * 2,
 		.size = I915_GTT_PAGE_SIZE,
-		.color = I915_CACHE_LLC,
+		.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_LLC),
 	};
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
@@ -308,7 +308,7 @@ static int igt_evict_for_cache_color(void *arg)
 	/* Attempt to remove the first *pinned* vma, by removing the (empty)
 	 * neighbour -- this should fail.
 	 */
-	target.color = I915_CACHE_L3_LLC;
+	target.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_L3_LLC);
 
 	mutex_lock(&ggtt->vm.mutex);
 	err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 5361ce70d3f2..0b6350eb4dad 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -133,7 +133,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)
 
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = I915_CACHE_NONE;
+	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
 
 	/* Preallocate the "backing storage" */
 	if (i915_gem_object_pin_pages_unlocked(obj))
@@ -357,7 +357,9 @@ static int lowlevel_hole(struct i915_address_space *vm,
 
 			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
 			  vm->insert_entries(vm, mock_vma_res,
-						   I915_CACHE_NONE, 0);
+					     i915_gem_get_pat_index(vm->i915,
+						     I915_CACHE_NONE),
+					     0);
 		}
 		count = n;
 
@@ -1375,7 +1377,10 @@ static int igt_ggtt_page(void *arg)
 
 		ggtt->vm.insert_page(&ggtt->vm,
 				     i915_gem_object_get_dma_address(obj, 0),
-				     offset, I915_CACHE_NONE, 0);
+				     offset,
+				     i915_gem_get_pat_index(i915,
+					                    I915_CACHE_NONE),
+				     0);
 	}
 
 	order = i915_random_order(count, &prng);
@@ -1508,7 +1513,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
 	mutex_lock(&vm->mutex);
 	err = i915_gem_gtt_reserve(vm, NULL, &vma->node, obj->base.size,
 				   offset,
-				   obj->cache_level,
+				   obj->pat_index,
 				   0);
 	if (!err) {
 		i915_vma_resource_init_from_vma(vma_res, vma);
@@ -1688,7 +1693,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
 
 	mutex_lock(&vm->mutex);
 	err = i915_gem_gtt_insert(vm, NULL, &vma->node, obj->base.size, 0,
-				  obj->cache_level, 0, vm->total, 0);
+				  obj->pat_index, 0, vm->total, 0);
 	if (!err) {
 		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 3b18e5905c86..cce180114d0c 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -1070,7 +1070,9 @@ static int igt_lmem_write_cpu(void *arg)
 	/* Put the pages into a known state -- from the gpu for added fun */
 	intel_engine_pm_get(engine);
 	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
-					  obj->mm.pages->sgl, I915_CACHE_NONE,
+					  obj->mm.pages->sgl,
+					  i915_gem_get_pat_index(i915,
+							I915_CACHE_NONE),
 					  true, 0xdeadbeaf, &rq);
 	if (rq) {
 		dma_resv_add_fence(obj->base.resv, &rq->fence,
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index ece97e4faacb..a516c0aa88fd 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -27,21 +27,21 @@
 static void mock_insert_page(struct i915_address_space *vm,
 			     dma_addr_t addr,
 			     u64 offset,
-			     enum i915_cache_level level,
+			     unsigned int pat_index,
 			     u32 flags)
 {
 }
 
 static void mock_insert_entries(struct i915_address_space *vm,
 				struct i915_vma_resource *vma_res,
-				enum i915_cache_level level, u32 flags)
+				unsigned int pat_index, u32 flags)
 {
 }
 
 static void mock_bind_ppgtt(struct i915_address_space *vm,
 			    struct i915_vm_pt_stash *stash,
 			    struct i915_vma_resource *vma_res,
-			    enum i915_cache_level cache_level,
+			    unsigned int pat_index,
 			    u32 flags)
 {
 	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
@@ -94,7 +94,7 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 static void mock_bind_ggtt(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
 			   struct i915_vma_resource *vma_res,
-			   enum i915_cache_level cache_level,
+			   unsigned int pat_index,
 			   u32 flags)
 {
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 6/7] drm/i915: make sure correct pte encode is used
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (4 preceding siblings ...)
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation fei.yang
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matt Roper, Chris Wilson, dri-devel

From: Fei Yang <fei.yang@intel.com>

PTE encode is platform dependent. After replacing cache_level with
pat_index, the newly introduced mtl_pte_encode is actually generic
for all gen12 platforms, thus rename it to gen12_pte_encode and
apply it to all gen12 platforms.

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f76ec2cb29ef..e393e20b5894 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -60,7 +60,7 @@ static u64 gen8_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
-static u64 mtl_pte_encode(dma_addr_t addr,
+static u64 gen12_pte_encode(dma_addr_t addr,
 			  unsigned int pat_index,
 			  u32 flags)
 {
@@ -999,8 +999,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 	 */
 	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
-		ppgtt->vm.pte_encode = mtl_pte_encode;
+	if (GRAPHICS_VER(gt->i915) >= 12)
+		ppgtt->vm.pte_encode = gen12_pte_encode;
 	else
 		ppgtt->vm.pte_encode = gen8_pte_encode;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (5 preceding siblings ...)
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 6/7] drm/i915: make sure correct pte encode is used fei.yang
@ 2023-04-01  6:38 ` fei.yang
  2023-04-03 16:02   ` Ville Syrjälä
                     ` (2 more replies)
  2023-04-01  7:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/mtl: Define MOCS and PAT tables for MTL Patchwork
                   ` (2 subsequent siblings)
  9 siblings, 3 replies; 35+ messages in thread
From: fei.yang @ 2023-04-01  6:38 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matt Roper, Chris Wilson, dri-devel

From: Fei Yang <fei.yang@intel.com>

To comply with the design that buffer objects shall have immutable
cache setting through out its life cycle, {set, get}_caching ioctl's
are no longer supported from MTL onward. With that change caching
policy can only be set at object creation time. The current code
applies a default (platform dependent) cache setting for all objects.
However this is not optimal for performance tuning. The patch extends
the existing gem_create uAPI to let user set PAT index for the object
at creation time.
The new extension is platform independent, so UMD's can switch to using
this extension for older platforms as well, while {set, get}_caching are
still supported on these legacy paltforms for compatibility reason.

Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
 include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
 tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
 3 files changed, 105 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index e76c9703680e..1c6e2034d28e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -244,6 +244,7 @@ struct create_ext {
 	unsigned int n_placements;
 	unsigned int placement_mask;
 	unsigned long flags;
+	unsigned int pat_index;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -393,11 +394,39 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
 	return 0;
 }
 
+static int ext_set_pat(struct i915_user_extension __user *base, void *data)
+{
+	struct create_ext *ext_data = data;
+	struct drm_i915_private *i915 = ext_data->i915;
+	struct drm_i915_gem_create_ext_set_pat ext;
+	unsigned int max_pat_index;
+
+	BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
+		     offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
+
+	if (copy_from_user(&ext, base, sizeof(ext)))
+		return -EFAULT;
+
+	max_pat_index = INTEL_INFO(i915)->max_pat_index;
+
+	if (ext.pat_index > max_pat_index) {
+		drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
+			ext.pat_index);
+		return -EINVAL;
+	}
+
+	ext_data->pat_index = ext.pat_index;
+
+	return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
 	[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+	[I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
 };
 
+#define PAT_INDEX_NOT_SET	0xffff
 /**
  * Creates a new mm object and returns a handle to it.
  * @dev: drm device pointer
@@ -417,6 +446,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 	if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
 		return -EINVAL;
 
+	ext_data.pat_index = PAT_INDEX_NOT_SET;
 	ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
 				   create_extensions,
 				   ARRAY_SIZE(create_extensions),
@@ -453,5 +483,8 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
+	if (ext_data.pat_index != PAT_INDEX_NOT_SET)
+		i915_gem_object_set_pat_index(obj, ext_data.pat_index);
+
 	return i915_gem_publish(obj, file, &args->size, &args->handle);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index dba7c5a5b25e..03c5c314846e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
 	 *
 	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 	 * struct drm_i915_gem_create_ext_protected_content.
+	 *
+	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
+	 * struct drm_i915_gem_create_ext_set_pat.
 	 */
 #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
 #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
+#define I915_GEM_CREATE_EXT_SET_PAT 2
 	__u64 extensions;
 };
 
@@ -3747,6 +3751,38 @@ struct drm_i915_gem_create_ext_protected_content {
 	__u32 flags;
 };
 
+/**
+ * struct drm_i915_gem_create_ext_set_pat - The
+ * I915_GEM_CREATE_EXT_SET_PAT extension.
+ *
+ * If this extension is provided, the specified caching policy (PAT index) is
+ * applied to the buffer object.
+ *
+ * Below is an example on how to create an object with specific caching policy:
+ *
+ * .. code-block:: C
+ *
+ *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
+ *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
+ *              .pat_index = 0,
+ *      };
+ *      struct drm_i915_gem_create_ext create_ext = {
+ *              .size = PAGE_SIZE,
+ *              .extensions = (uintptr_t)&set_pat_ext,
+ *      };
+ *
+ *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
+ *      if (err) ...
+ */
+struct drm_i915_gem_create_ext_set_pat {
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+	/** @pat_index: PAT index to be set */
+	__u32 pat_index;
+	/** @rsvd: reserved for future use */
+	__u32 rsvd;
+};
+
 /* ID of the protected content session managed by i915 when PXP is active */
 #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
 
diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
index 8df261c5ab9b..8cdcdb5fac26 100644
--- a/tools/include/uapi/drm/i915_drm.h
+++ b/tools/include/uapi/drm/i915_drm.h
@@ -3607,9 +3607,13 @@ struct drm_i915_gem_create_ext {
 	 *
 	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 	 * struct drm_i915_gem_create_ext_protected_content.
+	 *
+	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
+	 * struct drm_i915_gem_create_ext_set_pat.
 	 */
 #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
 #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
+#define I915_GEM_CREATE_EXT_SET_PAT 2
 	__u64 extensions;
 };
 
@@ -3724,6 +3728,38 @@ struct drm_i915_gem_create_ext_protected_content {
 	__u32 flags;
 };
 
+/**
+ * struct drm_i915_gem_create_ext_set_pat - The
+ * I915_GEM_CREATE_EXT_SET_PAT extension.
+ *
+ * If this extension is provided, the specified caching policy (PAT index) is
+ * applied to the buffer object.
+ *
+ * Below is an example on how to create an object with specific caching policy:
+ *
+ * .. code-block:: C
+ *
+ *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
+ *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
+ *              .pat_index = 0,
+ *      };
+ *      struct drm_i915_gem_create_ext create_ext = {
+ *              .size = PAGE_SIZE,
+ *              .extensions = (uintptr_t)&set_pat_ext,
+ *      };
+ *
+ *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
+ *      if (err) ...
+ */
+struct drm_i915_gem_create_ext_set_pat {
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+	/** @pat_index: PAT index to be set */
+	__u32 pat_index;
+	/** @rsvd: reserved for future use */
+	__u32 rsvd;
+};
+
 /* ID of the protected content session managed by i915 when PXP is active */
 #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (6 preceding siblings ...)
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation fei.yang
@ 2023-04-01  7:03 ` Patchwork
  2023-04-01  7:03 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
  2023-04-01  7:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2023-04-01  7:03 UTC (permalink / raw)
  To: fei.yang; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/mtl: Define MOCS and PAT tables for MTL
URL   : https://patchwork.freedesktop.org/series/115980/
State : warning

== Summary ==

Error: dim checkpatch failed
4e6dff890525 drm/i915/mtl: Define MOCS and PAT tables for MTL
-:156: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#156: FILE: drivers/gpu/drm/i915/gt/intel_ggtt.c:229:
+	GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);

total: 0 errors, 1 warnings, 0 checks, 365 lines checked
fcee17736586 drm/i915/mtl: workaround coherency issue for Media
e18ea467eb36 drm/i915/mtl: end support for set caching ioctl
f424e91e447c drm/i915: preparation for using PAT index
1ec5d5d39c8f drm/i915: use pat_index instead of cache_level
-:22: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#22: 
cached, uncached, or writethrough. For these simple cases, using cache_level

-:637: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#637: FILE: drivers/gpu/drm/i915/gt/gen8_ppgtt.c:878:
+					      i915_gem_get_pat_index(vm->i915,
+							I915_CACHE_NONE));

-:907: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#907: FILE: drivers/gpu/drm/i915/gt/intel_ggtt.c:1303:
+					 i915_gem_get_pat_index(vm->i915,
+							I915_CACHE_NONE),

-:1605: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1605: FILE: drivers/gpu/drm/i915/i915_gem.c:424:
+					i915_gem_object_get_dma_address(obj,
+							offset >> PAGE_SHIFT),

-:1620: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1620: FILE: drivers/gpu/drm/i915/i915_gem.c:606:
+					i915_gem_object_get_dma_address(obj,
+							offset >> PAGE_SHIFT),

-:1638: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1638: FILE: drivers/gpu/drm/i915/i915_gpu_error.c:1121:
+						i915_gem_get_pat_index(gt->i915,
+							I915_CACHE_NONE),

-:1644: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1644: FILE: drivers/gpu/drm/i915/i915_gpu_error.c:1126:
+						i915_gem_get_pat_index(gt->i915,
+							I915_CACHE_NONE),

-:1762: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1762: FILE: drivers/gpu/drm/i915/selftests/i915_gem.c:62:
+				     i915_gem_get_pat_index(i915,
+							I915_CACHE_NONE),

-:1808: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1808: FILE: drivers/gpu/drm/i915/selftests/i915_gem_gtt.c:361:
+					     i915_gem_get_pat_index(vm->i915,
+						     I915_CACHE_NONE),

-:1820: ERROR:CODE_INDENT: code indent should use tabs where possible
#1820: FILE: drivers/gpu/drm/i915/selftests/i915_gem_gtt.c:1382:
+^I^I^I^I^I                    I915_CACHE_NONE),$

-:1820: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1820: FILE: drivers/gpu/drm/i915/selftests/i915_gem_gtt.c:1382:
+				     i915_gem_get_pat_index(i915,
+					                    I915_CACHE_NONE),

-:1854: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#1854: FILE: drivers/gpu/drm/i915/selftests/intel_memory_region.c:1075:
+					  i915_gem_get_pat_index(i915,
+							I915_CACHE_NONE),

total: 1 errors, 1 warnings, 10 checks, 1584 lines checked
9b8791e38e3b drm/i915: make sure correct pte encode is used
-:26: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#26: FILE: drivers/gpu/drm/i915/gt/gen8_ppgtt.c:64:
+static u64 gen12_pte_encode(dma_addr_t addr,
 			  unsigned int pat_index,

total: 0 errors, 0 warnings, 1 checks, 18 lines checked
01905d6d5429 drm/i915: Allow user to set cache at BO creation



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (7 preceding siblings ...)
  2023-04-01  7:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/mtl: Define MOCS and PAT tables for MTL Patchwork
@ 2023-04-01  7:03 ` Patchwork
  2023-04-01  7:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2023-04-01  7:03 UTC (permalink / raw)
  To: fei.yang; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/mtl: Define MOCS and PAT tables for MTL
URL   : https://patchwork.freedesktop.org/series/115980/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
                   ` (8 preceding siblings ...)
  2023-04-01  7:03 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2023-04-01  7:20 ` Patchwork
  9 siblings, 0 replies; 35+ messages in thread
From: Patchwork @ 2023-04-01  7:20 UTC (permalink / raw)
  To: fei.yang; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 7559 bytes --]

== Series Details ==

Series: drm/i915/mtl: Define MOCS and PAT tables for MTL
URL   : https://patchwork.freedesktop.org/series/115980/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_12952 -> Patchwork_115980v1
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_115980v1 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_115980v1, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/index.html

Participating hosts (38 -> 37)
------------------------------

  Missing    (1): fi-snb-2520m 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_115980v1:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live@execlists:
    - fi-apl-guc:         [PASS][1] -> [DMESG-FAIL][2] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/fi-apl-guc/igt@i915_selftest@live@execlists.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/fi-apl-guc/igt@i915_selftest@live@execlists.html
    - fi-glk-j4005:       [PASS][3] -> [ABORT][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/fi-glk-j4005/igt@i915_selftest@live@execlists.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/fi-glk-j4005/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@gt_engines:
    - fi-glk-j4005:       [PASS][5] -> [FAIL][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/fi-glk-j4005/igt@i915_selftest@live@gt_engines.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/fi-glk-j4005/igt@i915_selftest@live@gt_engines.html

  * igt@i915_selftest@live@gt_mocs:
    - fi-glk-j4005:       [PASS][7] -> [DMESG-FAIL][8] +3 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/fi-glk-j4005/igt@i915_selftest@live@gt_mocs.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/fi-glk-j4005/igt@i915_selftest@live@gt_mocs.html

  
Known issues
------------

  Here are the changes found in Patchwork_115980v1 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - bat-rpls-2:         [PASS][9] -> [ABORT][10] ([i915#7978])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-rpls-2/igt@gem_exec_suspend@basic-s3@smem.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-rpls-2/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_selftest@live@requests:
    - bat-rpls-1:         [PASS][11] -> [ABORT][12] ([i915#7911])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-rpls-1/igt@i915_selftest@live@requests.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-rpls-1/igt@i915_selftest@live@requests.html

  * igt@i915_selftest@live@slpc:
    - bat-adln-1:         NOTRUN -> [DMESG-FAIL][13] ([i915#6997])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-adln-1/igt@i915_selftest@live@slpc.html

  * igt@i915_selftest@live@workarounds:
    - bat-adlp-6:         [PASS][14] -> [INCOMPLETE][15] ([i915#4983] / [i915#7913])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-adlp-6/igt@i915_selftest@live@workarounds.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-adlp-6/igt@i915_selftest@live@workarounds.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-adln-1:         NOTRUN -> [SKIP][16] ([i915#7828])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-adln-1/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@nonblocking-crc@pipe-d-dp-1:
    - bat-dg2-8:          [PASS][17] -> [FAIL][18] ([i915#7932])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc@pipe-d-dp-1.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc@pipe-d-dp-1.html

  
#### Possible fixes ####

  * igt@i915_pm_rps@basic-api:
    - bat-dg2-11:         [FAIL][19] ([i915#8308]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-dg2-11/igt@i915_pm_rps@basic-api.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-dg2-11/igt@i915_pm_rps@basic-api.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-apl-guc:         [DMESG-FAIL][21] ([i915#5334]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/fi-apl-guc/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@gt_lrc:
    - bat-adln-1:         [INCOMPLETE][23] ([i915#4983] / [i915#7609]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-adln-1/igt@i915_selftest@live@gt_lrc.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-adln-1/igt@i915_selftest@live@gt_lrc.html

  
#### Warnings ####

  * igt@i915_selftest@live@slpc:
    - bat-rpls-2:         [DMESG-FAIL][25] ([i915#6367] / [i915#7913] / [i915#7996]) -> [DMESG-FAIL][26] ([i915#6997] / [i915#7913])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12952/bat-rpls-2/igt@i915_selftest@live@slpc.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/bat-rpls-2/igt@i915_selftest@live@slpc.html

  
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6997]: https://gitlab.freedesktop.org/drm/intel/issues/6997
  [i915#7609]: https://gitlab.freedesktop.org/drm/intel/issues/7609
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7911]: https://gitlab.freedesktop.org/drm/intel/issues/7911
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932
  [i915#7978]: https://gitlab.freedesktop.org/drm/intel/issues/7978
  [i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996
  [i915#8308]: https://gitlab.freedesktop.org/drm/intel/issues/8308


Build changes
-------------

  * Linux: CI_DRM_12952 -> Patchwork_115980v1

  CI-20190529: 20190529
  CI_DRM_12952: 51cf6fb5e846c1adbe92debb7282d0dcc3934ecb @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7231: 94188a1dc91b6ef1cf3e9df1440ff00b6ff25935 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_115980v1: 51cf6fb5e846c1adbe92debb7282d0dcc3934ecb @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

dcef13f7a9a2 drm/i915: Allow user to set cache at BO creation
fa93984db8c9 drm/i915: make sure correct pte encode is used
92eba10b204e drm/i915: use pat_index instead of cache_level
6ba4580e481f drm/i915: preparation for using PAT index
f77e121d433b drm/i915/mtl: end support for set caching ioctl
50fa01b885e6 drm/i915/mtl: workaround coherency issue for Media
a19e7012ce6c drm/i915/mtl: Define MOCS and PAT tables for MTL

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_115980v1/index.html

[-- Attachment #2: Type: text/html, Size: 8821 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 1/7] " fei.yang
@ 2023-04-03 12:50   ` Jani Nikula
  2023-04-06  8:16     ` Andi Shyti
  2023-04-06  8:28   ` Das, Nirmoy
  1 sibling, 1 reply; 35+ messages in thread
From: Jani Nikula @ 2023-04-03 12:50 UTC (permalink / raw)
  To: fei.yang, intel-gfx; +Cc: Matt Roper, Lucas De Marchi, dri-devel

On Fri, 31 Mar 2023, fei.yang@intel.com wrote:
> From: Fei Yang <fei.yang@intel.com>
>
> On MTL, GT can no longer allocate on LLC - only the CPU can.
> This, along with addition of support for ADM/L4 cache calls a
> MOCS/PAT table update.
> Also add PTE encode functions for MTL as it has different PAT
> index definition than previous platforms.

As a general observation, turning something into a function pointer and
extending it to more platforms should be two separate changes.

BR,
Jani.

>
> BSpec: 44509, 45101, 44235
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> Signed-off-by: Fei Yang <fei.yang@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
>  drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
>  drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
>  drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
>  drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
>  drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
>  drivers/gpu/drm/i915/i915_pci.c          |  1 +
>  9 files changed, 189 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
> index b8027392144d..c5eacfdba1a5 100644
> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
>  	vm->vma_ops.bind_vma    = dpt_bind_vma;
>  	vm->vma_ops.unbind_vma  = dpt_unbind_vma;
>  
> -	vm->pte_encode = gen8_ggtt_pte_encode;
> +	vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
>  
>  	dpt->obj = dpt_obj;
>  	dpt->obj->is_dpt = true;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 4daaa6f55668..4197b43150cc 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>  	return pte;
>  }
>  
> +static u64 mtl_pte_encode(dma_addr_t addr,
> +			  enum i915_cache_level level,
> +			  u32 flags)
> +{
> +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
> +
> +	if (unlikely(flags & PTE_READ_ONLY))
> +		pte &= ~GEN8_PAGE_RW;
> +
> +	if (flags & PTE_LM)
> +		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
> +
> +	switch (level) {
> +	case I915_CACHE_NONE:
> +		pte |= GEN12_PPGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_LLC:
> +	case I915_CACHE_L3_LLC:
> +		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_WT:
> +		pte |= GEN12_PPGTT_PTE_PAT0;
> +		break;
> +	}
> +
> +	return pte;
> +}
> +
>  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
>  {
>  	struct drm_i915_private *i915 = ppgtt->vm.i915;
> @@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  		      u32 flags)
>  {
>  	struct i915_page_directory *pd;
> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
>  	gen8_pte_t *vaddr;
>  
>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> @@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>  				   enum i915_cache_level cache_level,
>  				   u32 flags)
>  {
> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
>  	unsigned int rem = sg_dma_len(iter->sg);
>  	u64 start = vma_res->start;
>  
> @@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>  	GEM_BUG_ON(pt->is_compact);
>  
>  	vaddr = px_vaddr(pt);
> -	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
> +	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
>  	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
>  }
>  
> @@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>  	}
>  
>  	vaddr = px_vaddr(pt);
> -	vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
> +	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
>  }
>  
>  static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
> @@ -820,7 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  		pte_flags |= PTE_LM;
>  
>  	vm->scratch[0]->encode =
> -		gen8_pte_encode(px_dma(vm->scratch[0]),
> +		vm->pte_encode(px_dma(vm->scratch[0]),
>  				I915_CACHE_NONE, pte_flags);
>  
>  	for (i = 1; i <= vm->top; i++) {
> @@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>  	 */
>  	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
>  
> -	ppgtt->vm.pte_encode = gen8_pte_encode;
> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
> +		ppgtt->vm.pte_encode = mtl_pte_encode;
> +	else
> +		ppgtt->vm.pte_encode = gen8_pte_encode;
>  
>  	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
>  	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> index f541d19264b4..6b8ce7f4d25a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>  			 enum i915_cache_level level,
>  			 u32 flags);
> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> +			unsigned int pat_index,
> +			u32 flags);
>  
>  #endif
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 3c7f1ed92f5b..ba3109338aee 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>  	}
>  }
>  
> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> +			enum i915_cache_level level,
> +			u32 flags)
> +{
> +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> +
> +	GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
> +
> +	if (flags & PTE_LM)
> +		pte |= GEN12_GGTT_PTE_LM;
> +
> +	switch (level) {
> +	case I915_CACHE_NONE:
> +		pte |= MTL_GGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_LLC:
> +	case I915_CACHE_L3_LLC:
> +		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_WT:
> +		pte |= MTL_GGTT_PTE_PAT0;
> +		break;
> +	}
> +
> +	return pte;
> +}
> +
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>  			 enum i915_cache_level level,
>  			 u32 flags)
> @@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>  	gen8_pte_t __iomem *pte =
>  		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>  
> -	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
> +	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
>  
>  	ggtt->invalidate(ggtt);
>  }
> @@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>  				     enum i915_cache_level level,
>  				     u32 flags)
>  {
> -	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> +	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
>  	gen8_pte_t __iomem *gte;
>  	gen8_pte_t __iomem *end;
>  	struct sgt_iter iter;
> @@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>  	ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
>  	ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
>  
> -	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
> +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> +		ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
> +	else
> +		ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>  
>  	return ggtt_probe_common(ggtt, size);
>  }
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 4f436ba7a3c8..1e1b34e22cf5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
>  	}
>  }
>  
> +static void mtl_setup_private_ppat(struct intel_uncore *uncore)
> +{
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
> +			   MTL_PPAT_L4_0_WB);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
> +			   MTL_PPAT_L4_1_WT);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
> +			   MTL_PPAT_L4_3_UC);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
> +			   MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
> +			   MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
> +
> +	/*
> +	 * Remaining PAT entries are left at the hardware-default
> +	 * fully-cached setting
> +	 */
> +}
> +
>  static void tgl_setup_private_ppat(struct intel_uncore *uncore)
>  {
>  	/* TGL doesn't support LLC or AGE settings */
> @@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
>  
>  	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
>  
> -	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
> +	if (IS_METEORLAKE(i915))
> +		mtl_setup_private_ppat(uncore);
> +	else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
>  		xehp_setup_private_ppat(gt);
>  	else if (GRAPHICS_VER(i915) >= 12)
>  		tgl_setup_private_ppat(uncore);
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 69ce55f517f5..b632167eaf2e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
>  #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
>  #define BYT_PTE_WRITEABLE		REG_BIT(1)
>  
> +#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
>  #define GEN12_PPGTT_PTE_LM	BIT_ULL(11)
> +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
> +#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
> +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
> +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
>  
> -#define GEN12_GGTT_PTE_LM	BIT_ULL(1)
> +#define GEN12_GGTT_PTE_LM		BIT_ULL(1)
> +#define MTL_GGTT_PTE_PAT0		BIT_ULL(52)
> +#define MTL_GGTT_PTE_PAT1		BIT_ULL(53)
> +#define GEN12_GGTT_PTE_ADDR_MASK	GENMASK_ULL(45, 12)
> +#define MTL_GGTT_PTE_PAT_MASK		GENMASK_ULL(53, 52)
>  
>  #define GEN12_PDE_64K BIT(6)
>  #define GEN12_PTE_PS64 BIT(8)
> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
>  #define GEN8_PDE_IPS_64K BIT(11)
>  #define GEN8_PDE_PS_2M   BIT(7)
>  
> +#define MTL_PPAT_L4_CACHE_POLICY_MASK	REG_GENMASK(3, 2)
> +#define MTL_PAT_INDEX_COH_MODE_MASK	REG_GENMASK(1, 0)
> +#define MTL_PPAT_L4_3_UC	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
> +#define MTL_PPAT_L4_1_WT	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
> +#define MTL_PPAT_L4_0_WB	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
> +#define MTL_3_COH_2W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
> +#define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
> +#define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
> +
>  enum i915_cache_level;
>  
>  struct drm_i915_gem_object;
> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
> index 69b489e8dfed..89570f137b2c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> @@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
>  #define LE_COS(value)		((value) << 15)
>  #define LE_SSE(value)		((value) << 17)
>  
> +/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
> +#define _L4_CACHEABILITY(value)	((value) << 2)
> +#define IG_PAT(value)		((value) << 8)
> +
>  /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
>  #define L3_ESC(value)		((value) << 0)
>  #define L3_SCC(value)		((value) << 1)
> @@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
>  /* Helper defines */
>  #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
>  #define PVC_NUM_MOCS_ENTRIES	3
> +#define MTL_NUM_MOCS_ENTRIES	16
>  
>  /* (e)LLC caching options */
>  /*
> @@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
>  #define L3_2_RESERVED		_L3_CACHEABILITY(2)
>  #define L3_3_WB			_L3_CACHEABILITY(3)
>  
> +/* L4 caching options */
> +#define L4_0_WB			_L4_CACHEABILITY(0)
> +#define L4_1_WT			_L4_CACHEABILITY(1)
> +#define L4_2_RESERVED		_L4_CACHEABILITY(2)
> +#define L4_3_UC			_L4_CACHEABILITY(3)
> +
>  #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
>  	[__idx] = { \
>  		.control_value = __control_value, \
> @@ -416,6 +427,57 @@ static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
>  	MOCS_ENTRY(2, 0, L3_3_WB),
>  };
>  
> +static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
> +	/* Error - Reserved for Non-Use */
> +	MOCS_ENTRY(0,
> +		   IG_PAT(0),
> +		   L3_LKUP(1) | L3_3_WB),
> +	/* Cached - L3 + L4 */
> +	MOCS_ENTRY(1,
> +		   IG_PAT(1),
> +		   L3_LKUP(1) | L3_3_WB),
> +	/* L4 - GO:L3 */
> +	MOCS_ENTRY(2,
> +		   IG_PAT(1),
> +		   L3_LKUP(1) | L3_1_UC),
> +	/* Uncached - GO:L3 */
> +	MOCS_ENTRY(3,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_LKUP(1) | L3_1_UC),
> +	/* L4 - GO:Mem */
> +	MOCS_ENTRY(4,
> +		   IG_PAT(1),
> +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> +	/* Uncached - GO:Mem */
> +	MOCS_ENTRY(5,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> +	/* L4 - L3:NoLKUP; GO:L3 */
> +	MOCS_ENTRY(6,
> +		   IG_PAT(1),
> +		   L3_1_UC),
> +	/* Uncached - L3:NoLKUP; GO:L3 */
> +	MOCS_ENTRY(7,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_1_UC),
> +	/* L4 - L3:NoLKUP; GO:Mem */
> +	MOCS_ENTRY(8,
> +		   IG_PAT(1),
> +		   L3_GLBGO(1) | L3_1_UC),
> +	/* Uncached - L3:NoLKUP; GO:Mem */
> +	MOCS_ENTRY(9,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_GLBGO(1) | L3_1_UC),
> +	/* Display - L3; L4:WT */
> +	MOCS_ENTRY(14,
> +		   IG_PAT(1) | L4_1_WT,
> +		   L3_LKUP(1) | L3_3_WB),
> +	/* CCS - Non-Displayable */
> +	MOCS_ENTRY(15,
> +		   IG_PAT(1),
> +		   L3_GLBGO(1) | L3_1_UC),
> +};
> +
>  enum {
>  	HAS_GLOBAL_MOCS = BIT(0),
>  	HAS_ENGINE_MOCS = BIT(1),
> @@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
>  	memset(table, 0, sizeof(struct drm_i915_mocs_table));
>  
>  	table->unused_entries_index = I915_MOCS_PTE;
> -	if (IS_PONTEVECCHIO(i915)) {
> +	if (IS_METEORLAKE(i915)) {
> +		table->size = ARRAY_SIZE(mtl_mocs_table);
> +		table->table = mtl_mocs_table;
> +		table->n_entries = MTL_NUM_MOCS_ENTRIES;
> +		table->uc_index = 9;
> +		table->unused_entries_index = 1;
> +	} else if (IS_PONTEVECCHIO(i915)) {
>  		table->size = ARRAY_SIZE(pvc_mocs_table);
>  		table->table = pvc_mocs_table;
>  		table->n_entries = PVC_NUM_MOCS_ENTRIES;
> @@ -646,9 +714,9 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
>  		init_l3cc_table(engine->gt, &table);
>  }
>  
> -static u32 global_mocs_offset(void)
> +static u32 global_mocs_offset(struct intel_gt *gt)
>  {
> -	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
> +	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + gt->uncore->gsi_offset;
>  }
>  
>  void intel_set_mocs_index(struct intel_gt *gt)
> @@ -671,7 +739,7 @@ void intel_mocs_init(struct intel_gt *gt)
>  	 */
>  	flags = get_mocs_settings(gt->i915, &table);
>  	if (flags & HAS_GLOBAL_MOCS)
> -		__init_mocs_table(gt->uncore, &table, global_mocs_offset());
> +		__init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
>  
>  	/*
>  	 * Initialize the L3CC table as part of mocs initalization to make
> diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> index ca009a6a13bd..730796346514 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> @@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
>  		return 0;
>  
>  	if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
> -		addr = global_mocs_offset();
> +		addr = global_mocs_offset(rq->engine->gt);
>  	else
>  		addr = mocs_offset(rq->engine);
>  
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 621730b6551c..480b128499ae 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1149,6 +1149,7 @@ static const struct intel_device_info mtl_info = {
>  	.has_flat_ccs = 0,
>  	.has_gmd_id = 1,
>  	.has_guc_deprivilege = 1,
> +	.has_llc = 0,
>  	.has_mslice_steering = 0,
>  	.has_snoop = 1,
>  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,

-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level fei.yang
@ 2023-04-03 14:50   ` Ville Syrjälä
  2023-04-03 16:57     ` Yang, Fei
  0 siblings, 1 reply; 35+ messages in thread
From: Ville Syrjälä @ 2023-04-03 14:50 UTC (permalink / raw)
  To: fei.yang; +Cc: Chris Wilson, intel-gfx, Matt Roper, dri-devel

On Fri, Mar 31, 2023 at 11:38:28PM -0700, fei.yang@intel.com wrote:
> From: Fei Yang <fei.yang@intel.com>
> 
> Currently the KMD is using enum i915_cache_level to set caching policy for
> buffer objects. This is flaky because the PAT index which really controls
> the caching behavior in PTE has far more levels than what's defined in the
> enum.

Then just add more enum values.

'pat_index' is absolutely meaningless to the reader, it's just an
arbitrary number. Whereas 'cache_level' conveys how the thing is
actually going to get used and thus how the caches should behave.

> In addition, the PAT index is platform dependent, having to translate
> between i915_cache_level and PAT index is not reliable,

If it's not realiable then the code is clearly broken.

> and makes the code
> more complicated.

You have to translate somewhere anyway. Looks like you're now adding
translations the other way (pat_index->cache_level). How is that better?

> 
> >From UMD's perspective there is also a necessity to set caching policy for
> performance fine tuning. It's much easier for the UMD to directly use PAT
> index because the behavior of each PAT index is clearly defined in Bspec.
> Haivng the abstracted i915_cache_level sitting in between would only cause
> more ambiguity.
> 
> For these reasons this patch replaces i915_cache_level with PAT index. Also
> note, the cache_level is not completely removed yet, because the KMD still
> has the need of creating buffer objects with simple cache settings such as
> cached, uncached, or writethrough. For these simple cases, using cache_level
> would help simplify the code.
> 
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Signed-off-by: Fei Yang <fei.yang@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_dpt.c      | 12 +--
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 27 ++----
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
>  drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  3 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    | 39 ++++++++-
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  4 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 18 ++--
>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  4 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
>  .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
>  .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 10 ++-
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 76 ++++++++---------
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |  3 +-
>  drivers/gpu/drm/i915/gt/intel_ggtt.c          | 82 +++++++++----------
>  drivers/gpu/drm/i915/gt/intel_gtt.h           | 20 ++---
>  drivers/gpu/drm/i915/gt/intel_migrate.c       | 47 ++++++-----
>  drivers/gpu/drm/i915/gt/intel_migrate.h       | 13 ++-
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +-
>  drivers/gpu/drm/i915/gt/selftest_migrate.c    | 47 ++++++-----
>  drivers/gpu/drm/i915/gt/selftest_reset.c      |  8 +-
>  drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
>  drivers/gpu/drm/i915/gt/selftest_tlb.c        |  4 +-
>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 10 ++-
>  drivers/gpu/drm/i915/i915_debugfs.c           | 55 ++++++++++---
>  drivers/gpu/drm/i915/i915_gem.c               | 16 +++-
>  drivers/gpu/drm/i915/i915_gpu_error.c         |  8 +-
>  drivers/gpu/drm/i915/i915_vma.c               | 16 ++--
>  drivers/gpu/drm/i915/i915_vma.h               |  2 +-
>  drivers/gpu/drm/i915/i915_vma_types.h         |  2 -
>  drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
>  .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
>  .../drm/i915/selftests/intel_memory_region.c  |  4 +-
>  drivers/gpu/drm/i915/selftests/mock_gtt.c     |  8 +-
>  36 files changed, 361 insertions(+), 241 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
> index c5eacfdba1a5..7c5fddb203ba 100644
> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> @@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>  static void dpt_insert_page(struct i915_address_space *vm,
>  			    dma_addr_t addr,
>  			    u64 offset,
> -			    enum i915_cache_level level,
> +			    unsigned int pat_index,
>  			    u32 flags)
>  {
>  	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
>  	gen8_pte_t __iomem *base = dpt->iomem;
>  
>  	gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
> -		     vm->pte_encode(addr, level, flags));
> +		     vm->pte_encode(addr, pat_index, flags));
>  }
>  
>  static void dpt_insert_entries(struct i915_address_space *vm,
>  			       struct i915_vma_resource *vma_res,
> -			       enum i915_cache_level level,
> +			       unsigned int pat_index,
>  			       u32 flags)
>  {
>  	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
>  	gen8_pte_t __iomem *base = dpt->iomem;
> -	const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
> +	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
>  	struct sgt_iter sgt_iter;
>  	dma_addr_t addr;
>  	int i;
> @@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
>  static void dpt_bind_vma(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash,
>  			 struct i915_vma_resource *vma_res,
> -			 enum i915_cache_level cache_level,
> +			 unsigned int pat_index,
>  			 u32 flags)
>  {
>  	u32 pte_flags;
> @@ -98,7 +98,7 @@ static void dpt_bind_vma(struct i915_address_space *vm,
>  	if (vma_res->bi.lmem)
>  		pte_flags |= PTE_LM;
>  
> -	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>  
>  	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 33b73bea1e08..84e0a96f6c71 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -27,8 +27,8 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>  	if (IS_DGFX(i915))
>  		return false;
>  
> -	return !(obj->cache_level == I915_CACHE_NONE ||
> -		 obj->cache_level == I915_CACHE_WT);
> +	return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
> +		 i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
>  }
>  
>  bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
> @@ -265,7 +265,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  {
>  	int ret;
>  
> -	if (obj->cache_level == cache_level)
> +	if (i915_gem_object_has_cache_level(obj, cache_level))
>  		return 0;
>  
>  	ret = i915_gem_object_wait(obj,
> @@ -276,10 +276,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  		return ret;
>  
>  	/* Always invalidate stale cachelines */
> -	if (obj->cache_level != cache_level) {
> -		i915_gem_object_set_cache_coherency(obj, cache_level);
> -		obj->cache_dirty = true;
> -	}
> +	i915_gem_object_set_cache_coherency(obj, cache_level);
> +	obj->cache_dirty = true;
>  
>  	/* The cache-level will be applied when each vma is rebound. */
>  	return i915_gem_object_unbind(obj,
> @@ -304,20 +302,13 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
>  		goto out;
>  	}
>  
> -	switch (obj->cache_level) {
> -	case I915_CACHE_LLC:
> -	case I915_CACHE_L3_LLC:
> +	if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC) ||
> +	    i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
>  		args->caching = I915_CACHING_CACHED;
> -		break;
> -
> -	case I915_CACHE_WT:
> +	else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
>  		args->caching = I915_CACHING_DISPLAY;
> -		break;
> -
> -	default:
> +	else
>  		args->caching = I915_CACHING_NONE;
> -		break;
> -	}
>  out:
>  	rcu_read_unlock();
>  	return err;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 3aeede6aee4d..d42915516636 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -642,7 +642,7 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
>  
>  	return (cache->has_llc ||
>  		obj->cache_dirty ||
> -		obj->cache_level != I915_CACHE_NONE);
> +		!i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));
>  }
>  
>  static int eb_reserve_vma(struct i915_execbuffer *eb,
> @@ -1323,8 +1323,10 @@ static void *reloc_iomap(struct i915_vma *batch,
>  	offset = cache->node.start;
>  	if (drm_mm_node_allocated(&cache->node)) {
>  		ggtt->vm.insert_page(&ggtt->vm,
> -				     i915_gem_object_get_dma_address(obj, page),
> -				     offset, I915_CACHE_NONE, 0);
> +			i915_gem_object_get_dma_address(obj, page),
> +			offset,
> +			i915_gem_get_pat_index(ggtt->vm.i915, I915_CACHE_NONE),
> +			0);
>  	} else {
>  		offset += page << PAGE_SHIFT;
>  	}
> @@ -1464,7 +1466,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
>  			reloc_cache_unmap(&eb->reloc_cache);
>  			mutex_lock(&vma->vm->mutex);
>  			err = i915_vma_bind(target->vma,
> -					    target->vma->obj->cache_level,
> +					    target->vma->obj->pat_index,
>  					    PIN_GLOBAL, NULL, NULL);
>  			mutex_unlock(&vma->vm->mutex);
>  			reloc_cache_remap(&eb->reloc_cache, ev->vma->obj);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index d3c1dee16af2..6c242f9ffc75 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -383,7 +383,8 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
>  	}
>  
>  	/* Access to snoopable pages through the GTT is incoherent. */
> -	if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(i915)) {
> +	if (!(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
> +	      HAS_LLC(i915))) {
>  		ret = -EFAULT;
>  		goto err_unpin;
>  	}
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 1295bb812866..2894ed9156c7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -54,6 +54,12 @@ unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
>  	return INTEL_INFO(i915)->cachelevel_to_pat[level];
>  }
>  
> +bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
> +				     enum i915_cache_level lvl)
> +{
> +	return obj->pat_index == i915_gem_get_pat_index(obj_to_i915(obj), lvl);
> +}
> +
>  struct drm_i915_gem_object *i915_gem_object_alloc(void)
>  {
>  	struct drm_i915_gem_object *obj;
> @@ -133,7 +139,7 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  
> -	obj->cache_level = cache_level;
> +	obj->pat_index = i915_gem_get_pat_index(i915, cache_level);
>  
>  	if (cache_level != I915_CACHE_NONE)
>  		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
> @@ -148,6 +154,37 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>  		!IS_DGFX(i915);
>  }
>  
> +/**
> + * i915_gem_object_set_pat_index - set PAT index to be used in PTE encode
> + * @obj: #drm_i915_gem_object
> + * @pat_index: PAT index
> + *
> + * This is a clone of i915_gem_object_set_cache_coherency taking pat index
> + * instead of cache_level as its second argument.
> + */
> +void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
> +				   unsigned int pat_index)
> +{
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +
> +	if (obj->pat_index == pat_index)
> +		return;
> +
> +	obj->pat_index = pat_index;
> +
> +	if (pat_index != i915_gem_get_pat_index(i915, I915_CACHE_NONE))
> +		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
> +				       I915_BO_CACHE_COHERENT_FOR_WRITE);
> +	else if (HAS_LLC(i915))
> +		obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
> +	else
> +		obj->cache_coherent = 0;
> +
> +	obj->cache_dirty =
> +		!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&
> +		!IS_DGFX(i915);
> +}
> +
>  bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 4c92e17b4337..6f00aab10015 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -34,6 +34,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
>  
>  unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
>  				    enum i915_cache_level level);
> +bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
> +				     enum i915_cache_level lvl);
>  void i915_gem_init__objects(struct drm_i915_private *i915);
>  
>  void i915_objects_module_exit(void);
> @@ -764,6 +766,8 @@ bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj);
>  
>  void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>  					 unsigned int cache_level);
> +void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
> +				   unsigned int pat_index);
>  bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj);
>  void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
>  void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 890f3ad497c5..9c70dedf25cc 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -351,12 +351,20 @@ struct drm_i915_gem_object {
>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>  #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */
>  	/**
> -	 * @cache_level: The desired GTT caching level.
> -	 *
> -	 * See enum i915_cache_level for possible values, along with what
> -	 * each does.
> +	 * @pat_index: The desired PAT index.
> +	 *
> +	 * See hardware specification for valid PAT indices for each platform.
> +	 * This field used to contain a value of enum i915_cache_level. It's
> +	 * changed to an unsigned int because PAT indices are being used by
> +	 * both UMD and KMD for caching policy control after GEN12.
> +	 * For backward compatibility, this field will continue to contain
> +	 * value of i915_cache_level for pre-GEN12 platforms so that the PTE
> +	 * encode functions for these legacy platforms can stay the same.
> +	 * In the meantime platform specific tables are created to translate
> +	 * i915_cache_level into pat index, for more details check the macros
> +	 * defined i915/i915_pci.c, e.g. PVC_CACHELEVEL.
>  	 */
> -	unsigned int cache_level:3;
> +	unsigned int pat_index:6;
>  	/**
>  	 * @cache_coherent:
>  	 *
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index 8ac376c24aa2..9f379141f966 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -557,7 +557,9 @@ static void dbg_poison(struct i915_ggtt *ggtt,
>  
>  		ggtt->vm.insert_page(&ggtt->vm, addr,
>  				     ggtt->error_capture.start,
> -				     I915_CACHE_NONE, 0);
> +				     i915_gem_get_pat_index(ggtt->vm.i915,
> +							    I915_CACHE_NONE),
> +				     0);
>  		mb();
>  
>  		s = io_mapping_map_wc(&ggtt->iomap,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> index d030182ca176..7eadb7d68d47 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> @@ -214,7 +214,8 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,
>  
>  		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
>  		ret = intel_context_migrate_clear(to_gt(i915)->migrate.context, deps,
> -						  dst_st->sgl, dst_level,
> +						  dst_st->sgl,
> +						  i915_gem_get_pat_index(i915, dst_level),
>  						  i915_ttm_gtt_binds_lmem(dst_mem),
>  						  0, &rq);
>  	} else {
> @@ -227,12 +228,13 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,
>  		src_level = i915_ttm_cache_level(i915, bo->resource, src_ttm);
>  		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
>  		ret = intel_context_migrate_copy(to_gt(i915)->migrate.context,
> -						 deps, src_rsgt->table.sgl,
> -						 src_level,
> -						 i915_ttm_gtt_binds_lmem(bo->resource),
> -						 dst_st->sgl, dst_level,
> -						 i915_ttm_gtt_binds_lmem(dst_mem),
> -						 &rq);
> +					deps, src_rsgt->table.sgl,
> +					i915_gem_get_pat_index(i915, src_level),
> +					i915_ttm_gtt_binds_lmem(bo->resource),
> +					dst_st->sgl,
> +					i915_gem_get_pat_index(i915, dst_level),
> +					i915_ttm_gtt_binds_lmem(dst_mem),
> +					&rq);
>  
>  		i915_refct_sgt_put(src_rsgt);
>  	}
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index defece0bcb81..ebb68ac9cd5e 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -354,7 +354,7 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)
>  
>  	obj->write_domain = I915_GEM_DOMAIN_CPU;
>  	obj->read_domains = I915_GEM_DOMAIN_CPU;
> -	obj->cache_level = I915_CACHE_NONE;
> +	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
>  
>  	return obj;
>  }
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> index fe6c37fd7859..a93a90b15907 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> @@ -219,7 +219,7 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
>  			continue;
>  
>  		err = intel_migrate_clear(&gt->migrate, &ww, deps,
> -					  obj->mm.pages->sgl, obj->cache_level,
> +					  obj->mm.pages->sgl, obj->pat_index,
>  					  i915_gem_object_is_lmem(obj),
>  					  0xdeadbeaf, &rq);
>  		if (rq) {
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> index 56279908ed30..a93d8f9f8bc1 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -1222,7 +1222,7 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
>  	}
>  
>  	err = intel_context_migrate_clear(to_gt(i915)->migrate.context, NULL,
> -					  obj->mm.pages->sgl, obj->cache_level,
> +					  obj->mm.pages->sgl, obj->pat_index,
>  					  i915_gem_object_is_lmem(obj),
>  					  expand32(POISON_INUSE), &rq);
>  	i915_gem_object_unpin_pages(obj);
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index 5aaacc53fa4c..c2bdc133c89a 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -109,7 +109,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  
>  static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  				      struct i915_vma_resource *vma_res,
> -				      enum i915_cache_level cache_level,
> +				      unsigned int pat_index,
>  				      u32 flags)
>  {
>  	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
> @@ -117,7 +117,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
>  	unsigned int act_pt = first_entry / GEN6_PTES;
>  	unsigned int act_pte = first_entry % GEN6_PTES;
> -	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
> +	const u32 pte_encode = vm->pte_encode(0, pat_index, flags);
>  	struct sgt_dma iter = sgt_dma(vma_res);
>  	gen6_pte_t *vaddr;
>  
> @@ -227,7 +227,9 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
>  
>  	vm->scratch[0]->encode =
>  		vm->pte_encode(px_dma(vm->scratch[0]),
> -			       I915_CACHE_NONE, PTE_READ_ONLY);
> +			       i915_gem_get_pat_index(vm->i915,
> +						      I915_CACHE_NONE),
> +			       PTE_READ_ONLY);
>  
>  	vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
>  	if (IS_ERR(vm->scratch[1])) {
> @@ -278,7 +280,7 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
>  static void pd_vma_bind(struct i915_address_space *vm,
>  			struct i915_vm_pt_stash *stash,
>  			struct i915_vma_resource *vma_res,
> -			enum i915_cache_level cache_level,
> +			unsigned int pat_index,
>  			u32 unused)
>  {
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 3ae41a13d28d..f76ec2cb29ef 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -15,6 +15,11 @@
>  #include "intel_gt.h"
>  #include "intel_gtt.h"
>  
> +/**
> + * For pre-gen12 platforms pat_index is the same as enum i915_cache_level,
> + * so the code here is still valid. See translation table defined by
> + * LEGACY_CACHELEVEL
> + */
>  static u64 gen8_pde_encode(const dma_addr_t addr,
>  			   const enum i915_cache_level level)
>  {
> @@ -56,7 +61,7 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>  }
>  
>  static u64 mtl_pte_encode(dma_addr_t addr,
> -			  enum i915_cache_level level,
> +			  unsigned int pat_index,
>  			  u32 flags)
>  {
>  	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
> @@ -67,24 +72,17 @@ static u64 mtl_pte_encode(dma_addr_t addr,
>  	if (flags & PTE_LM)
>  		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
>  
> -	switch (level) {
> -	case I915_CACHE_NONE:
> -		pte |= GEN12_PPGTT_PTE_PAT1;
> -		break;
> -	case I915_CACHE_LLC:
> -	case I915_CACHE_L3_LLC:
> -		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
> -		break;
> -	case I915_CACHE_WT:
> +	if (pat_index & BIT(0))
>  		pte |= GEN12_PPGTT_PTE_PAT0;
> -		break;
> -	default:
> -		/* This should never happen. Added to deal with the compile
> -		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
> -		 * be removed by the pat_index patch.
> -		 */
> -		break;
> -	}
> +
> +	if (pat_index & BIT(1))
> +		pte |= GEN12_PPGTT_PTE_PAT1;
> +
> +	if (pat_index & BIT(2))
> +		pte |= GEN12_PPGTT_PTE_PAT2;
> +
> +	if (pat_index & BIT(3))
> +		pte |= GEN12_PPGTT_PTE_PAT3;
>  
>  	return pte;
>  }
> @@ -457,11 +455,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>  		      struct i915_page_directory *pdp,
>  		      struct sgt_dma *iter,
>  		      u64 idx,
> -		      enum i915_cache_level cache_level,
> +		      unsigned int pat_index,
>  		      u32 flags)
>  {
>  	struct i915_page_directory *pd;
> -	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, pat_index, flags);
>  	gen8_pte_t *vaddr;
>  
>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> @@ -504,10 +502,10 @@ static void
>  xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
>  			  struct i915_vma_resource *vma_res,
>  			  struct sgt_dma *iter,
> -			  enum i915_cache_level cache_level,
> +			  unsigned int pat_index,
>  			  u32 flags)
>  {
> -	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
>  	unsigned int rem = sg_dma_len(iter->sg);
>  	u64 start = vma_res->start;
>  	u64 end = start + vma_res->vma_size;
> @@ -611,10 +609,10 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
>  static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>  				   struct i915_vma_resource *vma_res,
>  				   struct sgt_dma *iter,
> -				   enum i915_cache_level cache_level,
> +				   unsigned int pat_index,
>  				   u32 flags)
>  {
> -	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
>  	unsigned int rem = sg_dma_len(iter->sg);
>  	u64 start = vma_res->start;
>  
> @@ -734,7 +732,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>  
>  static void gen8_ppgtt_insert(struct i915_address_space *vm,
>  			      struct i915_vma_resource *vma_res,
> -			      enum i915_cache_level cache_level,
> +			      unsigned int pat_index,
>  			      u32 flags)
>  {
>  	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
> @@ -742,9 +740,9 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>  
>  	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
>  		if (HAS_64K_PAGES(vm->i915))
> -			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
> +			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
>  		else
> -			gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
> +			gen8_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
>  	} else  {
>  		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
>  
> @@ -753,7 +751,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>  				gen8_pdp_for_page_index(vm, idx);
>  
>  			idx = gen8_ppgtt_insert_pte(ppgtt, pdp, &iter, idx,
> -						    cache_level, flags);
> +						    pat_index, flags);
>  		} while (idx);
>  
>  		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
> @@ -763,7 +761,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>  static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>  				    dma_addr_t addr,
>  				    u64 offset,
> -				    enum i915_cache_level level,
> +				    unsigned int pat_index,
>  				    u32 flags)
>  {
>  	u64 idx = offset >> GEN8_PTE_SHIFT;
> @@ -777,14 +775,14 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>  	GEM_BUG_ON(pt->is_compact);
>  
>  	vaddr = px_vaddr(pt);
> -	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
> +	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, pat_index, flags);
>  	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
>  }
>  
>  static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>  					    dma_addr_t addr,
>  					    u64 offset,
> -					    enum i915_cache_level level,
> +					    unsigned int pat_index,
>  					    u32 flags)
>  {
>  	u64 idx = offset >> GEN8_PTE_SHIFT;
> @@ -807,20 +805,20 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>  	}
>  
>  	vaddr = px_vaddr(pt);
> -	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
> +	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, pat_index, flags);
>  }
>  
>  static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
>  				       dma_addr_t addr,
>  				       u64 offset,
> -				       enum i915_cache_level level,
> +				       unsigned int pat_index,
>  				       u32 flags)
>  {
>  	if (flags & PTE_LM)
>  		return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
> -						       level, flags);
> +						       pat_index, flags);
>  
> -	return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
> +	return gen8_ppgtt_insert_entry(vm, addr, offset, pat_index, flags);
>  }
>  
>  static int gen8_init_scratch(struct i915_address_space *vm)
> @@ -855,7 +853,9 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  
>  	vm->scratch[0]->encode =
>  		vm->pte_encode(px_dma(vm->scratch[0]),
> -				I915_CACHE_NONE, pte_flags);
> +			       i915_gem_get_pat_index(vm->i915,
> +						      I915_CACHE_NONE),
> +			       pte_flags);
>  
>  	for (i = 1; i <= vm->top; i++) {
>  		struct drm_i915_gem_object *obj;
> @@ -873,7 +873,9 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>  		}
>  
>  		fill_px(obj, vm->scratch[i - 1]->encode);
> -		obj->encode = gen8_pde_encode(px_dma(obj), I915_CACHE_NONE);
> +		obj->encode = gen8_pde_encode(px_dma(obj),
> +					      i915_gem_get_pat_index(vm->i915,
> +							I915_CACHE_NONE));
>  
>  		vm->scratch[i] = obj;
>  	}
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> index 6b8ce7f4d25a..98e260e1a081 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> @@ -10,13 +10,12 @@
>  
>  struct i915_address_space;
>  struct intel_gt;
> -enum i915_cache_level;
>  
>  struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>  				     unsigned long lmem_pt_obj_flags);
>  
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> -			 enum i915_cache_level level,
> +			 unsigned int pat_index,
>  			 u32 flags);
>  u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>  			unsigned int pat_index,
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 91056b9a60a9..66a4955f19e4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -221,7 +221,7 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>  }
>  
>  u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> -			enum i915_cache_level level,
> +			unsigned int pat_index,
>  			u32 flags)
>  {
>  	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> @@ -231,30 +231,17 @@ u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>  	if (flags & PTE_LM)
>  		pte |= GEN12_GGTT_PTE_LM;
>  
> -	switch (level) {
> -	case I915_CACHE_NONE:
> -		pte |= MTL_GGTT_PTE_PAT1;
> -		break;
> -	case I915_CACHE_LLC:
> -	case I915_CACHE_L3_LLC:
> -		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
> -		break;
> -	case I915_CACHE_WT:
> +	if (pat_index & BIT(0))
>  		pte |= MTL_GGTT_PTE_PAT0;
> -		break;
> -	default:
> -		/* This should never happen. Added to deal with the compile
> -		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
> -		 * be removed by the pat_index patch.
> -		 */
> -		break;
> -	}
> +
> +	if (pat_index & BIT(1))
> +		pte |= MTL_GGTT_PTE_PAT1;
>  
>  	return pte;
>  }
>  
>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> -			 enum i915_cache_level level,
> +			 unsigned int pat_index,
>  			 u32 flags)
>  {
>  	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> @@ -273,25 +260,25 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>  static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>  				  dma_addr_t addr,
>  				  u64 offset,
> -				  enum i915_cache_level level,
> +				  unsigned int pat_index,
>  				  u32 flags)
>  {
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>  	gen8_pte_t __iomem *pte =
>  		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>  
> -	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
> +	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, pat_index, flags));
>  
>  	ggtt->invalidate(ggtt);
>  }
>  
>  static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>  				     struct i915_vma_resource *vma_res,
> -				     enum i915_cache_level level,
> +				     unsigned int pat_index,
>  				     u32 flags)
>  {
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> -	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
> +	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, pat_index, flags);
>  	gen8_pte_t __iomem *gte;
>  	gen8_pte_t __iomem *end;
>  	struct sgt_iter iter;
> @@ -348,14 +335,14 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
>  static void gen6_ggtt_insert_page(struct i915_address_space *vm,
>  				  dma_addr_t addr,
>  				  u64 offset,
> -				  enum i915_cache_level level,
> +				  unsigned int pat_index,
>  				  u32 flags)
>  {
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>  	gen6_pte_t __iomem *pte =
>  		(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>  
> -	iowrite32(vm->pte_encode(addr, level, flags), pte);
> +	iowrite32(vm->pte_encode(addr, pat_index, flags), pte);
>  
>  	ggtt->invalidate(ggtt);
>  }
> @@ -368,7 +355,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
>   */
>  static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>  				     struct i915_vma_resource *vma_res,
> -				     enum i915_cache_level level,
> +				     unsigned int pat_index,
>  				     u32 flags)
>  {
>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> @@ -385,7 +372,7 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>  		iowrite32(vm->scratch[0]->encode, gte++);
>  	end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
>  	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
> -		iowrite32(vm->pte_encode(addr, level, flags), gte++);
> +		iowrite32(vm->pte_encode(addr, pat_index, flags), gte++);
>  	GEM_BUG_ON(gte > end);
>  
>  	/* Fill the allocated but "unused" space beyond the end of the buffer */
> @@ -420,14 +407,15 @@ struct insert_page {
>  	struct i915_address_space *vm;
>  	dma_addr_t addr;
>  	u64 offset;
> -	enum i915_cache_level level;
> +	unsigned int pat_index;
>  };
>  
>  static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
>  {
>  	struct insert_page *arg = _arg;
>  
> -	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
> +	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset,
> +			      arg->pat_index, 0);
>  	bxt_vtd_ggtt_wa(arg->vm);
>  
>  	return 0;
> @@ -436,10 +424,10 @@ static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
>  static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
>  					  dma_addr_t addr,
>  					  u64 offset,
> -					  enum i915_cache_level level,
> +					  unsigned int pat_index,
>  					  u32 unused)
>  {
> -	struct insert_page arg = { vm, addr, offset, level };
> +	struct insert_page arg = { vm, addr, offset, pat_index };
>  
>  	stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
>  }
> @@ -447,7 +435,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
>  struct insert_entries {
>  	struct i915_address_space *vm;
>  	struct i915_vma_resource *vma_res;
> -	enum i915_cache_level level;
> +	unsigned int pat_index;
>  	u32 flags;
>  };
>  
> @@ -455,7 +443,8 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
>  {
>  	struct insert_entries *arg = _arg;
>  
> -	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
> +	gen8_ggtt_insert_entries(arg->vm, arg->vma_res,
> +				 arg->pat_index, arg->flags);
>  	bxt_vtd_ggtt_wa(arg->vm);
>  
>  	return 0;
> @@ -463,10 +452,10 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
>  
>  static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
>  					     struct i915_vma_resource *vma_res,
> -					     enum i915_cache_level level,
> +					     unsigned int pat_index,
>  					     u32 flags)
>  {
> -	struct insert_entries arg = { vm, vma_res, level, flags };
> +	struct insert_entries arg = { vm, vma_res, pat_index, flags };
>  
>  	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
>  }
> @@ -495,7 +484,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  void intel_ggtt_bind_vma(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash,
>  			 struct i915_vma_resource *vma_res,
> -			 enum i915_cache_level cache_level,
> +			 unsigned int pat_index,
>  			 u32 flags)
>  {
>  	u32 pte_flags;
> @@ -512,7 +501,7 @@ void intel_ggtt_bind_vma(struct i915_address_space *vm,
>  	if (vma_res->bi.lmem)
>  		pte_flags |= PTE_LM;
>  
> -	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>  	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>  }
>  
> @@ -661,7 +650,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
>  static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>  				  struct i915_vm_pt_stash *stash,
>  				  struct i915_vma_resource *vma_res,
> -				  enum i915_cache_level cache_level,
> +				  unsigned int pat_index,
>  				  u32 flags)
>  {
>  	u32 pte_flags;
> @@ -673,10 +662,10 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>  
>  	if (flags & I915_VMA_LOCAL_BIND)
>  		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
> -			       stash, vma_res, cache_level, flags);
> +			       stash, vma_res, pat_index, flags);
>  
>  	if (flags & I915_VMA_GLOBAL_BIND)
> -		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +		vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>  
>  	vma_res->bound_flags |= flags;
>  }
> @@ -933,7 +922,9 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
>  
>  	ggtt->vm.scratch[0]->encode =
>  		ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
> -				    I915_CACHE_NONE, pte_flags);
> +				    i915_gem_get_pat_index(i915,
> +							   I915_CACHE_NONE),
> +				    pte_flags);
>  
>  	return 0;
>  }
> @@ -1022,6 +1013,11 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>  	return ggtt_probe_common(ggtt, size);
>  }
>  
> +/*
> + * For pre-gen8 platforms pat_index is the same as enum i915_cache_level,
> + * so these PTE encode functions are left with using cache_level.
> + * See translation table LEGACY_CACHELEVEL.
> + */
>  static u64 snb_pte_encode(dma_addr_t addr,
>  			  enum i915_cache_level level,
>  			  u32 flags)
> @@ -1302,7 +1298,9 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
>  		 */
>  		vma->resource->bound_flags = 0;
>  		vma->ops->bind_vma(vm, NULL, vma->resource,
> -				   obj ? obj->cache_level : 0,
> +				   obj ? obj->pat_index :
> +					 i915_gem_get_pat_index(vm->i915,
> +							I915_CACHE_NONE),
>  				   was_bound);
>  
>  		if (obj) { /* only used during resume => exclusive access */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index b632167eaf2e..12bd4398ad38 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -165,8 +165,6 @@ typedef u64 gen8_pte_t;
>  #define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
>  #define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
>  
> -enum i915_cache_level;
> -
>  struct drm_i915_gem_object;
>  struct i915_fence_reg;
>  struct i915_vma;
> @@ -234,7 +232,7 @@ struct i915_vma_ops {
>  	void (*bind_vma)(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash,
>  			 struct i915_vma_resource *vma_res,
> -			 enum i915_cache_level cache_level,
> +			 unsigned int pat_index,
>  			 u32 flags);
>  	/*
>  	 * Unmap an object from an address space. This usually consists of
> @@ -306,7 +304,7 @@ struct i915_address_space {
>  		(*alloc_scratch_dma)(struct i915_address_space *vm, int sz);
>  
>  	u64 (*pte_encode)(dma_addr_t addr,
> -			  enum i915_cache_level level,
> +			  unsigned int pat_index,
>  			  u32 flags); /* Create a valid PTE */
>  #define PTE_READ_ONLY	BIT(0)
>  #define PTE_LM		BIT(1)
> @@ -321,20 +319,20 @@ struct i915_address_space {
>  	void (*insert_page)(struct i915_address_space *vm,
>  			    dma_addr_t addr,
>  			    u64 offset,
> -			    enum i915_cache_level cache_level,
> +			    unsigned int pat_index,
>  			    u32 flags);
>  	void (*insert_entries)(struct i915_address_space *vm,
>  			       struct i915_vma_resource *vma_res,
> -			       enum i915_cache_level cache_level,
> +			       unsigned int pat_index,
>  			       u32 flags);
>  	void (*raw_insert_page)(struct i915_address_space *vm,
>  				dma_addr_t addr,
>  				u64 offset,
> -				enum i915_cache_level cache_level,
> +				unsigned int pat_index,
>  				u32 flags);
>  	void (*raw_insert_entries)(struct i915_address_space *vm,
>  				   struct i915_vma_resource *vma_res,
> -				   enum i915_cache_level cache_level,
> +				   unsigned int pat_index,
>  				   u32 flags);
>  	void (*cleanup)(struct i915_address_space *vm);
>  
> @@ -581,7 +579,7 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
>  void intel_ggtt_bind_vma(struct i915_address_space *vm,
>  			 struct i915_vm_pt_stash *stash,
>  			 struct i915_vma_resource *vma_res,
> -			 enum i915_cache_level cache_level,
> +			 unsigned int pat_index,
>  			 u32 flags);
>  void intel_ggtt_unbind_vma(struct i915_address_space *vm,
>  			   struct i915_vma_resource *vma_res);
> @@ -639,7 +637,7 @@ void
>  __set_pd_entry(struct i915_page_directory * const pd,
>  	       const unsigned short idx,
>  	       struct i915_page_table *pt,
> -	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level));
> +	       u64 (*encode)(const dma_addr_t, const unsigned int pat_index));
>  
>  #define set_pd_entry(pd, idx, to) \
>  	__set_pd_entry((pd), (idx), px_pt(to), gen8_pde_encode)
> @@ -659,7 +657,7 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
>  void ppgtt_bind_vma(struct i915_address_space *vm,
>  		    struct i915_vm_pt_stash *stash,
>  		    struct i915_vma_resource *vma_res,
> -		    enum i915_cache_level cache_level,
> +		    unsigned int pat_index,
>  		    u32 flags);
>  void ppgtt_unbind_vma(struct i915_address_space *vm,
>  		      struct i915_vma_resource *vma_res);
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 3f638f198796..117c3d05af3e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -45,7 +45,9 @@ static void xehpsdv_toggle_pdes(struct i915_address_space *vm,
>  	 * Insert a dummy PTE into every PT that will map to LMEM to ensure
>  	 * we have a correctly setup PDE structure for later use.
>  	 */
> -	vm->insert_page(vm, 0, d->offset, I915_CACHE_NONE, PTE_LM);
> +	vm->insert_page(vm, 0, d->offset,
> +			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
> +			PTE_LM);
>  	GEM_BUG_ON(!pt->is_compact);
>  	d->offset += SZ_2M;
>  }
> @@ -63,7 +65,9 @@ static void xehpsdv_insert_pte(struct i915_address_space *vm,
>  	 * alignment is 64K underneath for the pt, and we are careful
>  	 * not to access the space in the void.
>  	 */
> -	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, PTE_LM);
> +	vm->insert_page(vm, px_dma(pt), d->offset,
> +			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
> +			PTE_LM);
>  	d->offset += SZ_64K;
>  }
>  
> @@ -73,7 +77,8 @@ static void insert_pte(struct i915_address_space *vm,
>  {
>  	struct insert_pte_data *d = data;
>  
> -	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
> +	vm->insert_page(vm, px_dma(pt), d->offset,
> +			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
>  			i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);
>  	d->offset += PAGE_SIZE;
>  }
> @@ -356,13 +361,13 @@ static int max_pte_pkt_size(struct i915_request *rq, int pkt)
>  
>  static int emit_pte(struct i915_request *rq,
>  		    struct sgt_dma *it,
> -		    enum i915_cache_level cache_level,
> +		    unsigned int pat_index,
>  		    bool is_lmem,
>  		    u64 offset,
>  		    int length)
>  {
>  	bool has_64K_pages = HAS_64K_PAGES(rq->engine->i915);
> -	const u64 encode = rq->context->vm->pte_encode(0, cache_level,
> +	const u64 encode = rq->context->vm->pte_encode(0, pat_index,
>  						       is_lmem ? PTE_LM : 0);
>  	struct intel_ring *ring = rq->ring;
>  	int pkt, dword_length;
> @@ -673,17 +678,17 @@ int
>  intel_context_migrate_copy(struct intel_context *ce,
>  			   const struct i915_deps *deps,
>  			   struct scatterlist *src,
> -			   enum i915_cache_level src_cache_level,
> +			   unsigned int src_pat_index,
>  			   bool src_is_lmem,
>  			   struct scatterlist *dst,
> -			   enum i915_cache_level dst_cache_level,
> +			   unsigned int dst_pat_index,
>  			   bool dst_is_lmem,
>  			   struct i915_request **out)
>  {
>  	struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs;
>  	struct drm_i915_private *i915 = ce->engine->i915;
>  	u64 ccs_bytes_to_cpy = 0, bytes_to_cpy;
> -	enum i915_cache_level ccs_cache_level;
> +	unsigned int ccs_pat_index;
>  	u32 src_offset, dst_offset;
>  	u8 src_access, dst_access;
>  	struct i915_request *rq;
> @@ -707,12 +712,12 @@ intel_context_migrate_copy(struct intel_context *ce,
>  		dst_sz = scatter_list_length(dst);
>  		if (src_is_lmem) {
>  			it_ccs = it_dst;
> -			ccs_cache_level = dst_cache_level;
> +			ccs_pat_index = dst_pat_index;
>  			ccs_is_src = false;
>  		} else if (dst_is_lmem) {
>  			bytes_to_cpy = dst_sz;
>  			it_ccs = it_src;
> -			ccs_cache_level = src_cache_level;
> +			ccs_pat_index = src_pat_index;
>  			ccs_is_src = true;
>  		}
>  
> @@ -773,7 +778,7 @@ intel_context_migrate_copy(struct intel_context *ce,
>  		src_sz = calculate_chunk_sz(i915, src_is_lmem,
>  					    bytes_to_cpy, ccs_bytes_to_cpy);
>  
> -		len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem,
> +		len = emit_pte(rq, &it_src, src_pat_index, src_is_lmem,
>  			       src_offset, src_sz);
>  		if (!len) {
>  			err = -EINVAL;
> @@ -784,7 +789,7 @@ intel_context_migrate_copy(struct intel_context *ce,
>  			goto out_rq;
>  		}
>  
> -		err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem,
> +		err = emit_pte(rq, &it_dst, dst_pat_index, dst_is_lmem,
>  			       dst_offset, len);
>  		if (err < 0)
>  			goto out_rq;
> @@ -811,7 +816,7 @@ intel_context_migrate_copy(struct intel_context *ce,
>  				goto out_rq;
>  
>  			ccs_sz = GET_CCS_BYTES(i915, len);
> -			err = emit_pte(rq, &it_ccs, ccs_cache_level, false,
> +			err = emit_pte(rq, &it_ccs, ccs_pat_index, false,
>  				       ccs_is_src ? src_offset : dst_offset,
>  				       ccs_sz);
>  			if (err < 0)
> @@ -979,7 +984,7 @@ int
>  intel_context_migrate_clear(struct intel_context *ce,
>  			    const struct i915_deps *deps,
>  			    struct scatterlist *sg,
> -			    enum i915_cache_level cache_level,
> +			    unsigned int pat_index,
>  			    bool is_lmem,
>  			    u32 value,
>  			    struct i915_request **out)
> @@ -1027,7 +1032,7 @@ intel_context_migrate_clear(struct intel_context *ce,
>  		if (err)
>  			goto out_rq;
>  
> -		len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ);
> +		len = emit_pte(rq, &it, pat_index, is_lmem, offset, CHUNK_SZ);
>  		if (len <= 0) {
>  			err = len;
>  			goto out_rq;
> @@ -1074,10 +1079,10 @@ int intel_migrate_copy(struct intel_migrate *m,
>  		       struct i915_gem_ww_ctx *ww,
>  		       const struct i915_deps *deps,
>  		       struct scatterlist *src,
> -		       enum i915_cache_level src_cache_level,
> +		       unsigned int src_pat_index,
>  		       bool src_is_lmem,
>  		       struct scatterlist *dst,
> -		       enum i915_cache_level dst_cache_level,
> +		       unsigned int dst_pat_index,
>  		       bool dst_is_lmem,
>  		       struct i915_request **out)
>  {
> @@ -1098,8 +1103,8 @@ int intel_migrate_copy(struct intel_migrate *m,
>  		goto out;
>  
>  	err = intel_context_migrate_copy(ce, deps,
> -					 src, src_cache_level, src_is_lmem,
> -					 dst, dst_cache_level, dst_is_lmem,
> +					 src, src_pat_index, src_is_lmem,
> +					 dst, dst_pat_index, dst_is_lmem,
>  					 out);
>  
>  	intel_context_unpin(ce);
> @@ -1113,7 +1118,7 @@ intel_migrate_clear(struct intel_migrate *m,
>  		    struct i915_gem_ww_ctx *ww,
>  		    const struct i915_deps *deps,
>  		    struct scatterlist *sg,
> -		    enum i915_cache_level cache_level,
> +		    unsigned int pat_index,
>  		    bool is_lmem,
>  		    u32 value,
>  		    struct i915_request **out)
> @@ -1134,7 +1139,7 @@ intel_migrate_clear(struct intel_migrate *m,
>  	if (err)
>  		goto out;
>  
> -	err = intel_context_migrate_clear(ce, deps, sg, cache_level,
> +	err = intel_context_migrate_clear(ce, deps, sg, pat_index,
>  					  is_lmem, value, out);
>  
>  	intel_context_unpin(ce);
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h
> index ccc677ec4aa3..11fc09a00c4b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.h
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.h
> @@ -16,7 +16,6 @@ struct i915_request;
>  struct i915_gem_ww_ctx;
>  struct intel_gt;
>  struct scatterlist;
> -enum i915_cache_level;
>  
>  int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
>  
> @@ -26,20 +25,20 @@ int intel_migrate_copy(struct intel_migrate *m,
>  		       struct i915_gem_ww_ctx *ww,
>  		       const struct i915_deps *deps,
>  		       struct scatterlist *src,
> -		       enum i915_cache_level src_cache_level,
> +		       unsigned int src_pat_index,
>  		       bool src_is_lmem,
>  		       struct scatterlist *dst,
> -		       enum i915_cache_level dst_cache_level,
> +		       unsigned int dst_pat_index,
>  		       bool dst_is_lmem,
>  		       struct i915_request **out);
>  
>  int intel_context_migrate_copy(struct intel_context *ce,
>  			       const struct i915_deps *deps,
>  			       struct scatterlist *src,
> -			       enum i915_cache_level src_cache_level,
> +			       unsigned int src_pat_index,
>  			       bool src_is_lmem,
>  			       struct scatterlist *dst,
> -			       enum i915_cache_level dst_cache_level,
> +			       unsigned int dst_pat_index,
>  			       bool dst_is_lmem,
>  			       struct i915_request **out);
>  
> @@ -48,7 +47,7 @@ intel_migrate_clear(struct intel_migrate *m,
>  		    struct i915_gem_ww_ctx *ww,
>  		    const struct i915_deps *deps,
>  		    struct scatterlist *sg,
> -		    enum i915_cache_level cache_level,
> +		    unsigned int pat_index,
>  		    bool is_lmem,
>  		    u32 value,
>  		    struct i915_request **out);
> @@ -56,7 +55,7 @@ int
>  intel_context_migrate_clear(struct intel_context *ce,
>  			    const struct i915_deps *deps,
>  			    struct scatterlist *sg,
> -			    enum i915_cache_level cache_level,
> +			    unsigned int pat_index,
>  			    bool is_lmem,
>  			    u32 value,
>  			    struct i915_request **out);
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 7ecfa672f738..f0da3555c6db 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -98,7 +98,7 @@ void
>  __set_pd_entry(struct i915_page_directory * const pd,
>  	       const unsigned short idx,
>  	       struct i915_page_table * const to,
> -	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level))
> +	       u64 (*encode)(const dma_addr_t, const unsigned int))
>  {
>  	/* Each thread pre-pins the pd, and we may have a thread per pde. */
>  	GEM_BUG_ON(atomic_read(px_used(pd)) > NALLOC * I915_PDES);
> @@ -181,7 +181,7 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
>  void ppgtt_bind_vma(struct i915_address_space *vm,
>  		    struct i915_vm_pt_stash *stash,
>  		    struct i915_vma_resource *vma_res,
> -		    enum i915_cache_level cache_level,
> +		    unsigned int pat_index,
>  		    u32 flags)
>  {
>  	u32 pte_flags;
> @@ -199,7 +199,7 @@ void ppgtt_bind_vma(struct i915_address_space *vm,
>  	if (vma_res->bi.lmem)
>  		pte_flags |= PTE_LM;
>  
> -	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>  	wmb();
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
> index e677f2da093d..3def5ca72dec 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
> @@ -137,7 +137,7 @@ static int copy(struct intel_migrate *migrate,
>  static int intel_context_copy_ccs(struct intel_context *ce,
>  				  const struct i915_deps *deps,
>  				  struct scatterlist *sg,
> -				  enum i915_cache_level cache_level,
> +				  unsigned int pat_index,
>  				  bool write_to_ccs,
>  				  struct i915_request **out)
>  {
> @@ -185,7 +185,7 @@ static int intel_context_copy_ccs(struct intel_context *ce,
>  		if (err)
>  			goto out_rq;
>  
> -		len = emit_pte(rq, &it, cache_level, true, offset, CHUNK_SZ);
> +		len = emit_pte(rq, &it, pat_index, true, offset, CHUNK_SZ);
>  		if (len <= 0) {
>  			err = len;
>  			goto out_rq;
> @@ -223,7 +223,7 @@ intel_migrate_ccs_copy(struct intel_migrate *m,
>  		       struct i915_gem_ww_ctx *ww,
>  		       const struct i915_deps *deps,
>  		       struct scatterlist *sg,
> -		       enum i915_cache_level cache_level,
> +		       unsigned int pat_index,
>  		       bool write_to_ccs,
>  		       struct i915_request **out)
>  {
> @@ -243,7 +243,7 @@ intel_migrate_ccs_copy(struct intel_migrate *m,
>  	if (err)
>  		goto out;
>  
> -	err = intel_context_copy_ccs(ce, deps, sg, cache_level,
> +	err = intel_context_copy_ccs(ce, deps, sg, pat_index,
>  				     write_to_ccs, out);
>  
>  	intel_context_unpin(ce);
> @@ -300,7 +300,7 @@ static int clear(struct intel_migrate *migrate,
>  			/* Write the obj data into ccs surface */
>  			err = intel_migrate_ccs_copy(migrate, &ww, NULL,
>  						     obj->mm.pages->sgl,
> -						     obj->cache_level,
> +						     obj->pat_index,
>  						     true, &rq);
>  			if (rq && !err) {
>  				if (i915_request_wait(rq, 0, HZ) < 0) {
> @@ -351,7 +351,7 @@ static int clear(struct intel_migrate *migrate,
>  
>  			err = intel_migrate_ccs_copy(migrate, &ww, NULL,
>  						     obj->mm.pages->sgl,
> -						     obj->cache_level,
> +						     obj->pat_index,
>  						     false, &rq);
>  			if (rq && !err) {
>  				if (i915_request_wait(rq, 0, HZ) < 0) {
> @@ -414,9 +414,9 @@ static int __migrate_copy(struct intel_migrate *migrate,
>  			  struct i915_request **out)
>  {
>  	return intel_migrate_copy(migrate, ww, NULL,
> -				  src->mm.pages->sgl, src->cache_level,
> +				  src->mm.pages->sgl, src->pat_index,
>  				  i915_gem_object_is_lmem(src),
> -				  dst->mm.pages->sgl, dst->cache_level,
> +				  dst->mm.pages->sgl, dst->pat_index,
>  				  i915_gem_object_is_lmem(dst),
>  				  out);
>  }
> @@ -428,9 +428,9 @@ static int __global_copy(struct intel_migrate *migrate,
>  			 struct i915_request **out)
>  {
>  	return intel_context_migrate_copy(migrate->context, NULL,
> -					  src->mm.pages->sgl, src->cache_level,
> +					  src->mm.pages->sgl, src->pat_index,
>  					  i915_gem_object_is_lmem(src),
> -					  dst->mm.pages->sgl, dst->cache_level,
> +					  dst->mm.pages->sgl, dst->pat_index,
>  					  i915_gem_object_is_lmem(dst),
>  					  out);
>  }
> @@ -455,7 +455,7 @@ static int __migrate_clear(struct intel_migrate *migrate,
>  {
>  	return intel_migrate_clear(migrate, ww, NULL,
>  				   obj->mm.pages->sgl,
> -				   obj->cache_level,
> +				   obj->pat_index,
>  				   i915_gem_object_is_lmem(obj),
>  				   value, out);
>  }
> @@ -468,7 +468,7 @@ static int __global_clear(struct intel_migrate *migrate,
>  {
>  	return intel_context_migrate_clear(migrate->context, NULL,
>  					   obj->mm.pages->sgl,
> -					   obj->cache_level,
> +					   obj->pat_index,
>  					   i915_gem_object_is_lmem(obj),
>  					   value, out);
>  }
> @@ -648,7 +648,7 @@ static int live_emit_pte_full_ring(void *arg)
>  	 */
>  	pr_info("%s emite_pte ring space=%u\n", __func__, rq->ring->space);
>  	it = sg_sgt(obj->mm.pages->sgl);
> -	len = emit_pte(rq, &it, obj->cache_level, false, 0, CHUNK_SZ);
> +	len = emit_pte(rq, &it, obj->pat_index, false, 0, CHUNK_SZ);
>  	if (!len) {
>  		err = -EINVAL;
>  		goto out_rq;
> @@ -844,7 +844,7 @@ static int wrap_ktime_compare(const void *A, const void *B)
>  
>  static int __perf_clear_blt(struct intel_context *ce,
>  			    struct scatterlist *sg,
> -			    enum i915_cache_level cache_level,
> +			    unsigned int pat_index,
>  			    bool is_lmem,
>  			    size_t sz)
>  {
> @@ -858,7 +858,7 @@ static int __perf_clear_blt(struct intel_context *ce,
>  
>  		t0 = ktime_get();
>  
> -		err = intel_context_migrate_clear(ce, NULL, sg, cache_level,
> +		err = intel_context_migrate_clear(ce, NULL, sg, pat_index,
>  						  is_lmem, 0, &rq);
>  		if (rq) {
>  			if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
> @@ -904,7 +904,8 @@ static int perf_clear_blt(void *arg)
>  
>  		err = __perf_clear_blt(gt->migrate.context,
>  				       dst->mm.pages->sgl,
> -				       I915_CACHE_NONE,
> +				       i915_gem_get_pat_index(gt->i915,
> +							      I915_CACHE_NONE),
>  				       i915_gem_object_is_lmem(dst),
>  				       sizes[i]);
>  
> @@ -919,10 +920,10 @@ static int perf_clear_blt(void *arg)
>  
>  static int __perf_copy_blt(struct intel_context *ce,
>  			   struct scatterlist *src,
> -			   enum i915_cache_level src_cache_level,
> +			   unsigned int src_pat_index,
>  			   bool src_is_lmem,
>  			   struct scatterlist *dst,
> -			   enum i915_cache_level dst_cache_level,
> +			   unsigned int dst_pat_index,
>  			   bool dst_is_lmem,
>  			   size_t sz)
>  {
> @@ -937,9 +938,9 @@ static int __perf_copy_blt(struct intel_context *ce,
>  		t0 = ktime_get();
>  
>  		err = intel_context_migrate_copy(ce, NULL,
> -						 src, src_cache_level,
> +						 src, src_pat_index,
>  						 src_is_lmem,
> -						 dst, dst_cache_level,
> +						 dst, dst_pat_index,
>  						 dst_is_lmem,
>  						 &rq);
>  		if (rq) {
> @@ -994,10 +995,12 @@ static int perf_copy_blt(void *arg)
>  
>  		err = __perf_copy_blt(gt->migrate.context,
>  				      src->mm.pages->sgl,
> -				      I915_CACHE_NONE,
> +				      i915_gem_get_pat_index(gt->i915,
> +							     I915_CACHE_NONE),
>  				      i915_gem_object_is_lmem(src),
>  				      dst->mm.pages->sgl,
> -				      I915_CACHE_NONE,
> +				      i915_gem_get_pat_index(gt->i915,
> +							     I915_CACHE_NONE),
>  				      i915_gem_object_is_lmem(dst),
>  				      sz);
>  
> diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c
> index a9e0a91bc0e0..79aa6ac66ad2 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_reset.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c
> @@ -86,7 +86,9 @@ __igt_reset_stolen(struct intel_gt *gt,
>  
>  		ggtt->vm.insert_page(&ggtt->vm, dma,
>  				     ggtt->error_capture.start,
> -				     I915_CACHE_NONE, 0);
> +				     i915_gem_get_pat_index(gt->i915,
> +							    I915_CACHE_NONE),
> +				     0);
>  		mb();
>  
>  		s = io_mapping_map_wc(&ggtt->iomap,
> @@ -127,7 +129,9 @@ __igt_reset_stolen(struct intel_gt *gt,
>  
>  		ggtt->vm.insert_page(&ggtt->vm, dma,
>  				     ggtt->error_capture.start,
> -				     I915_CACHE_NONE, 0);
> +				     i915_gem_get_pat_index(gt->i915,
> +							    I915_CACHE_NONE),
> +				     0);
>  		mb();
>  
>  		s = io_mapping_map_wc(&ggtt->iomap,
> diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
> index 9f536c251179..39c3ec12df1a 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
> @@ -836,7 +836,7 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt,
>  		return PTR_ERR(obj);
>  
>  	/* keep the same cache settings as timeline */
> -	i915_gem_object_set_cache_coherency(obj, tl->hwsp_ggtt->obj->cache_level);
> +	i915_gem_object_set_pat_index(obj, tl->hwsp_ggtt->obj->pat_index);
>  	w->map = i915_gem_object_pin_map_unlocked(obj,
>  						  page_unmask_bits(tl->hwsp_ggtt->obj->mm.mapping));
>  	if (IS_ERR(w->map)) {
> diff --git a/drivers/gpu/drm/i915/gt/selftest_tlb.c b/drivers/gpu/drm/i915/gt/selftest_tlb.c
> index e6cac1f15d6e..4493c8518e91 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_tlb.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_tlb.c
> @@ -36,6 +36,8 @@ pte_tlbinv(struct intel_context *ce,
>  	   u64 length,
>  	   struct rnd_state *prng)
>  {
> +	const unsigned int pat_index =
> +		i915_gem_get_pat_index(ce->vm->i915, I915_CACHE_NONE);
>  	struct drm_i915_gem_object *batch;
>  	struct drm_mm_node vb_node;
>  	struct i915_request *rq;
> @@ -155,7 +157,7 @@ pte_tlbinv(struct intel_context *ce,
>  		/* Flip the PTE between A and B */
>  		if (i915_gem_object_is_lmem(vb->obj))
>  			pte_flags |= PTE_LM;
> -		ce->vm->insert_entries(ce->vm, &vb_res, 0, pte_flags);
> +		ce->vm->insert_entries(ce->vm, &vb_res, pat_index, pte_flags);
>  
>  		/* Flush the PTE update to concurrent HW */
>  		tlbinv(ce->vm, addr & -length, length);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> index 264c952f777b..31182915f3d2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> @@ -876,9 +876,15 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
>  		pte_flags |= PTE_LM;
>  
>  	if (ggtt->vm.raw_insert_entries)
> -		ggtt->vm.raw_insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
> +		ggtt->vm.raw_insert_entries(&ggtt->vm, dummy,
> +					    i915_gem_get_pat_index(ggtt->vm.i915,
> +								   I915_CACHE_NONE),
> +					    pte_flags);
>  	else
> -		ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
> +		ggtt->vm.insert_entries(&ggtt->vm, dummy,
> +					i915_gem_get_pat_index(ggtt->vm.i915,
> +							       I915_CACHE_NONE),
> +					pte_flags);
>  }
>  
>  static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 80c2bf98e341..1c407d59ff3d 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -138,21 +138,56 @@ static const char *stringify_vma_type(const struct i915_vma *vma)
>  	return "ppgtt";
>  }
>  
> -static const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
> -{
> -	switch (type) {
> -	case I915_CACHE_NONE: return " uncached";
> -	case I915_CACHE_LLC: return HAS_LLC(i915) ? " LLC" : " snooped";
> -	case I915_CACHE_L3_LLC: return " L3+LLC";
> -	case I915_CACHE_WT: return " WT";
> -	default: return "";
> +static const char *i915_cache_level_str(struct drm_i915_gem_object *obj)
> +{
> +	struct drm_i915_private *i915 = obj_to_i915(obj);
> +
> +	if (IS_METEORLAKE(i915)) {
> +		switch (obj->pat_index) {
> +		case 0: return " WB";
> +		case 1: return " WT";
> +		case 2: return " UC";
> +		case 3: return " WB (1-Way Coh)";
> +		case 4: return " WB (2-Way Coh)";
> +		default: return " not defined";
> +		}
> +	} else if (IS_PONTEVECCHIO(i915)) {
> +		switch (obj->pat_index) {
> +		case 0: return " UC";
> +		case 1: return " WC";
> +		case 2: return " WT";
> +		case 3: return " WB";
> +		case 4: return " WT (CLOS1)";
> +		case 5: return " WB (CLOS1)";
> +		case 6: return " WT (CLOS2)";
> +		case 7: return " WT (CLOS2)";
> +		default: return " not defined";
> +		}
> +	} else if (GRAPHICS_VER(i915) >= 12) {
> +		switch (obj->pat_index) {
> +		case 0: return " WB";
> +		case 1: return " WC";
> +		case 2: return " WT";
> +		case 3: return " UC";
> +		default: return " not defined";
> +		}
> +	} else {
> +		if (i915_gem_object_has_cache_level(obj, I915_CACHE_NONE))
> +			return " uncached";
> +		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC))
> +			return HAS_LLC(i915) ? " LLC" : " snooped";
> +		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
> +			return " L3+LLC";
> +		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
> +			return " WT";
> +		else
> +			return " not defined";
>  	}
>  }
>  
>  void
>  i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  {
> -	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
>  	struct i915_vma *vma;
>  	int pin_count = 0;
>  
> @@ -164,7 +199,7 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>  		   obj->base.size / 1024,
>  		   obj->read_domains,
>  		   obj->write_domain,
> -		   i915_cache_level_str(dev_priv, obj->cache_level),
> +		   i915_cache_level_str(obj),
>  		   obj->mm.dirty ? " dirty" : "",
>  		   obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");
>  	if (obj->base.name)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2ba922fbbd5f..fbeddf81e729 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -420,8 +420,12 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
>  		page_length = remain < page_length ? remain : page_length;
>  		if (drm_mm_node_allocated(&node)) {
>  			ggtt->vm.insert_page(&ggtt->vm,
> -					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
> -					     node.start, I915_CACHE_NONE, 0);
> +					i915_gem_object_get_dma_address(obj,
> +							offset >> PAGE_SHIFT),
> +					node.start,
> +					i915_gem_get_pat_index(i915,
> +							       I915_CACHE_NONE),
> +					0);
>  		} else {
>  			page_base += offset & PAGE_MASK;
>  		}
> @@ -598,8 +602,12 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  			/* flush the write before we modify the GGTT */
>  			intel_gt_flush_ggtt_writes(ggtt->vm.gt);
>  			ggtt->vm.insert_page(&ggtt->vm,
> -					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
> -					     node.start, I915_CACHE_NONE, 0);
> +					i915_gem_object_get_dma_address(obj,
> +							offset >> PAGE_SHIFT),
> +					node.start,
> +					i915_gem_get_pat_index(i915,
> +							       I915_CACHE_NONE),
> +					0);
>  			wmb(); /* flush modifications to the GGTT (insert_page) */
>  		} else {
>  			page_base += offset & PAGE_MASK;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index f020c0086fbc..54f17ba3b03c 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1117,10 +1117,14 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  			mutex_lock(&ggtt->error_mutex);
>  			if (ggtt->vm.raw_insert_page)
>  				ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot,
> -							 I915_CACHE_NONE, 0);
> +						i915_gem_get_pat_index(gt->i915,
> +							I915_CACHE_NONE),
> +						0);
>  			else
>  				ggtt->vm.insert_page(&ggtt->vm, dma, slot,
> -						     I915_CACHE_NONE, 0);
> +						i915_gem_get_pat_index(gt->i915,
> +							I915_CACHE_NONE),
> +						0);
>  			mb();
>  
>  			s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index f51fd9fd4c89..e5f5368b175f 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -315,7 +315,7 @@ struct i915_vma_work {
>  	struct i915_vma_resource *vma_res;
>  	struct drm_i915_gem_object *obj;
>  	struct i915_sw_dma_fence_cb cb;
> -	enum i915_cache_level cache_level;
> +	unsigned int pat_index;
>  	unsigned int flags;
>  };
>  
> @@ -334,7 +334,7 @@ static void __vma_bind(struct dma_fence_work *work)
>  		return;
>  
>  	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
> -			       vma_res, vw->cache_level, vw->flags);
> +			       vma_res, vw->pat_index, vw->flags);
>  }
>  
>  static void __vma_release(struct dma_fence_work *work)
> @@ -426,7 +426,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
>  /**
>   * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
>   * @vma: VMA to map
> - * @cache_level: mapping cache level
> + * @pat_index: PAT index to set in PTE
>   * @flags: flags like global or local mapping
>   * @work: preallocated worker for allocating and binding the PTE
>   * @vma_res: pointer to a preallocated vma resource. The resource is either
> @@ -437,7 +437,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
>   * Note that DMA addresses are also the only part of the SG table we care about.
>   */
>  int i915_vma_bind(struct i915_vma *vma,
> -		  enum i915_cache_level cache_level,
> +		  unsigned int pat_index,
>  		  u32 flags,
>  		  struct i915_vma_work *work,
>  		  struct i915_vma_resource *vma_res)
> @@ -507,7 +507,7 @@ int i915_vma_bind(struct i915_vma *vma,
>  		struct dma_fence *prev;
>  
>  		work->vma_res = i915_vma_resource_get(vma->resource);
> -		work->cache_level = cache_level;
> +		work->pat_index = pat_index;
>  		work->flags = bind_flags;
>  
>  		/*
> @@ -537,7 +537,7 @@ int i915_vma_bind(struct i915_vma *vma,
>  
>  			return ret;
>  		}
> -		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
> +		vma->ops->bind_vma(vma->vm, NULL, vma->resource, pat_index,
>  				   bind_flags);
>  	}
>  
> @@ -813,7 +813,7 @@ i915_vma_insert(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>  	color = 0;
>  
>  	if (i915_vm_has_cache_coloring(vma->vm))
> -		color = vma->obj->cache_level;
> +		color = vma->obj->pat_index;
>  
>  	if (flags & PIN_OFFSET_FIXED) {
>  		u64 offset = flags & PIN_OFFSET_MASK;
> @@ -1517,7 +1517,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>  
>  	GEM_BUG_ON(!vma->pages);
>  	err = i915_vma_bind(vma,
> -			    vma->obj->cache_level,
> +			    vma->obj->pat_index,
>  			    flags, work, vma_res);
>  	vma_res = NULL;
>  	if (err)
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index ed5c9d682a1b..31a8f8aa5558 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -250,7 +250,7 @@ i915_vma_compare(struct i915_vma *vma,
>  
>  struct i915_vma_work *i915_vma_work(void);
>  int i915_vma_bind(struct i915_vma *vma,
> -		  enum i915_cache_level cache_level,
> +		  unsigned int pat_index,
>  		  u32 flags,
>  		  struct i915_vma_work *work,
>  		  struct i915_vma_resource *vma_res);
> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
> index 77fda2244d16..64472b7f0e77 100644
> --- a/drivers/gpu/drm/i915/i915_vma_types.h
> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
> @@ -32,8 +32,6 @@
>  
>  #include "gem/i915_gem_object_types.h"
>  
> -enum i915_cache_level;
> -
>  /**
>   * DOC: Global GTT views
>   *
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
> index d91d0ade8abd..bde981a8f23f 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
> @@ -57,7 +57,10 @@ static void trash_stolen(struct drm_i915_private *i915)
>  		u32 __iomem *s;
>  		int x;
>  
> -		ggtt->vm.insert_page(&ggtt->vm, dma, slot, I915_CACHE_NONE, 0);
> +		ggtt->vm.insert_page(&ggtt->vm, dma, slot,
> +				     i915_gem_get_pat_index(i915,
> +							I915_CACHE_NONE),
> +				     0);
>  
>  		s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);
>  		for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> index 37068542aafe..f13a4d265814 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> @@ -245,7 +245,7 @@ static int igt_evict_for_cache_color(void *arg)
>  	struct drm_mm_node target = {
>  		.start = I915_GTT_PAGE_SIZE * 2,
>  		.size = I915_GTT_PAGE_SIZE,
> -		.color = I915_CACHE_LLC,
> +		.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_LLC),
>  	};
>  	struct drm_i915_gem_object *obj;
>  	struct i915_vma *vma;
> @@ -308,7 +308,7 @@ static int igt_evict_for_cache_color(void *arg)
>  	/* Attempt to remove the first *pinned* vma, by removing the (empty)
>  	 * neighbour -- this should fail.
>  	 */
> -	target.color = I915_CACHE_L3_LLC;
> +	target.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_L3_LLC);
>  
>  	mutex_lock(&ggtt->vm.mutex);
>  	err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 5361ce70d3f2..0b6350eb4dad 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -133,7 +133,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)
>  
>  	obj->write_domain = I915_GEM_DOMAIN_CPU;
>  	obj->read_domains = I915_GEM_DOMAIN_CPU;
> -	obj->cache_level = I915_CACHE_NONE;
> +	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
>  
>  	/* Preallocate the "backing storage" */
>  	if (i915_gem_object_pin_pages_unlocked(obj))
> @@ -357,7 +357,9 @@ static int lowlevel_hole(struct i915_address_space *vm,
>  
>  			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
>  			  vm->insert_entries(vm, mock_vma_res,
> -						   I915_CACHE_NONE, 0);
> +					     i915_gem_get_pat_index(vm->i915,
> +						     I915_CACHE_NONE),
> +					     0);
>  		}
>  		count = n;
>  
> @@ -1375,7 +1377,10 @@ static int igt_ggtt_page(void *arg)
>  
>  		ggtt->vm.insert_page(&ggtt->vm,
>  				     i915_gem_object_get_dma_address(obj, 0),
> -				     offset, I915_CACHE_NONE, 0);
> +				     offset,
> +				     i915_gem_get_pat_index(i915,
> +					                    I915_CACHE_NONE),
> +				     0);
>  	}
>  
>  	order = i915_random_order(count, &prng);
> @@ -1508,7 +1513,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
>  	mutex_lock(&vm->mutex);
>  	err = i915_gem_gtt_reserve(vm, NULL, &vma->node, obj->base.size,
>  				   offset,
> -				   obj->cache_level,
> +				   obj->pat_index,
>  				   0);
>  	if (!err) {
>  		i915_vma_resource_init_from_vma(vma_res, vma);
> @@ -1688,7 +1693,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
>  
>  	mutex_lock(&vm->mutex);
>  	err = i915_gem_gtt_insert(vm, NULL, &vma->node, obj->base.size, 0,
> -				  obj->cache_level, 0, vm->total, 0);
> +				  obj->pat_index, 0, vm->total, 0);
>  	if (!err) {
>  		i915_vma_resource_init_from_vma(vma_res, vma);
>  		vma->resource = vma_res;
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index 3b18e5905c86..cce180114d0c 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -1070,7 +1070,9 @@ static int igt_lmem_write_cpu(void *arg)
>  	/* Put the pages into a known state -- from the gpu for added fun */
>  	intel_engine_pm_get(engine);
>  	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
> -					  obj->mm.pages->sgl, I915_CACHE_NONE,
> +					  obj->mm.pages->sgl,
> +					  i915_gem_get_pat_index(i915,
> +							I915_CACHE_NONE),
>  					  true, 0xdeadbeaf, &rq);
>  	if (rq) {
>  		dma_resv_add_fence(obj->base.resv, &rq->fence,
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> index ece97e4faacb..a516c0aa88fd 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> @@ -27,21 +27,21 @@
>  static void mock_insert_page(struct i915_address_space *vm,
>  			     dma_addr_t addr,
>  			     u64 offset,
> -			     enum i915_cache_level level,
> +			     unsigned int pat_index,
>  			     u32 flags)
>  {
>  }
>  
>  static void mock_insert_entries(struct i915_address_space *vm,
>  				struct i915_vma_resource *vma_res,
> -				enum i915_cache_level level, u32 flags)
> +				unsigned int pat_index, u32 flags)
>  {
>  }
>  
>  static void mock_bind_ppgtt(struct i915_address_space *vm,
>  			    struct i915_vm_pt_stash *stash,
>  			    struct i915_vma_resource *vma_res,
> -			    enum i915_cache_level cache_level,
> +			    unsigned int pat_index,
>  			    u32 flags)
>  {
>  	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
> @@ -94,7 +94,7 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
>  static void mock_bind_ggtt(struct i915_address_space *vm,
>  			   struct i915_vm_pt_stash *stash,
>  			   struct i915_vma_resource *vma_res,
> -			   enum i915_cache_level cache_level,
> +			   unsigned int pat_index,
>  			   u32 flags)
>  {
>  }
> -- 
> 2.25.1

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation fei.yang
@ 2023-04-03 16:02   ` Ville Syrjälä
  2023-04-03 16:35     ` Matt Roper
  2023-04-04  7:29   ` Lionel Landwerlin
  2023-04-06  9:11   ` Matthew Auld
  2 siblings, 1 reply; 35+ messages in thread
From: Ville Syrjälä @ 2023-04-03 16:02 UTC (permalink / raw)
  To: fei.yang; +Cc: Chris Wilson, intel-gfx, Matt Roper, dri-devel

On Fri, Mar 31, 2023 at 11:38:30PM -0700, fei.yang@intel.com wrote:
> From: Fei Yang <fei.yang@intel.com>
> 
> To comply with the design that buffer objects shall have immutable
> cache setting through out its life cycle, {set, get}_caching ioctl's
> are no longer supported from MTL onward. With that change caching
> policy can only be set at object creation time. The current code
> applies a default (platform dependent) cache setting for all objects.
> However this is not optimal for performance tuning. The patch extends
> the existing gem_create uAPI to let user set PAT index for the object
> at creation time.

This is missing the whole justification for the new uapi.
Why is MOCS not sufficient?

> The new extension is platform independent, so UMD's can switch to using
> this extension for older platforms as well, while {set, get}_caching are
> still supported on these legacy paltforms for compatibility reason.
> 
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Signed-off-by: Fei Yang <fei.yang@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
>  include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
>  tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
>  3 files changed, 105 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> index e76c9703680e..1c6e2034d28e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> @@ -244,6 +244,7 @@ struct create_ext {
>  	unsigned int n_placements;
>  	unsigned int placement_mask;
>  	unsigned long flags;
> +	unsigned int pat_index;
>  };
>  
>  static void repr_placements(char *buf, size_t size,
> @@ -393,11 +394,39 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
>  	return 0;
>  }
>  
> +static int ext_set_pat(struct i915_user_extension __user *base, void *data)
> +{
> +	struct create_ext *ext_data = data;
> +	struct drm_i915_private *i915 = ext_data->i915;
> +	struct drm_i915_gem_create_ext_set_pat ext;
> +	unsigned int max_pat_index;
> +
> +	BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
> +		     offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
> +
> +	if (copy_from_user(&ext, base, sizeof(ext)))
> +		return -EFAULT;
> +
> +	max_pat_index = INTEL_INFO(i915)->max_pat_index;
> +
> +	if (ext.pat_index > max_pat_index) {
> +		drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
> +			ext.pat_index);
> +		return -EINVAL;
> +	}
> +
> +	ext_data->pat_index = ext.pat_index;
> +
> +	return 0;
> +}
> +
>  static const i915_user_extension_fn create_extensions[] = {
>  	[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
>  	[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
> +	[I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
>  };
>  
> +#define PAT_INDEX_NOT_SET	0xffff
>  /**
>   * Creates a new mm object and returns a handle to it.
>   * @dev: drm device pointer
> @@ -417,6 +446,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
>  	if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
>  		return -EINVAL;
>  
> +	ext_data.pat_index = PAT_INDEX_NOT_SET;
>  	ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
>  				   create_extensions,
>  				   ARRAY_SIZE(create_extensions),
> @@ -453,5 +483,8 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
>  	if (IS_ERR(obj))
>  		return PTR_ERR(obj);
>  
> +	if (ext_data.pat_index != PAT_INDEX_NOT_SET)
> +		i915_gem_object_set_pat_index(obj, ext_data.pat_index);
> +
>  	return i915_gem_publish(obj, file, &args->size, &args->handle);
>  }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index dba7c5a5b25e..03c5c314846e 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
>  	 *
>  	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
>  	 * struct drm_i915_gem_create_ext_protected_content.
> +	 *
> +	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
> +	 * struct drm_i915_gem_create_ext_set_pat.
>  	 */
>  #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>  #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> +#define I915_GEM_CREATE_EXT_SET_PAT 2
>  	__u64 extensions;
>  };
>  
> @@ -3747,6 +3751,38 @@ struct drm_i915_gem_create_ext_protected_content {
>  	__u32 flags;
>  };
>  
> +/**
> + * struct drm_i915_gem_create_ext_set_pat - The
> + * I915_GEM_CREATE_EXT_SET_PAT extension.
> + *
> + * If this extension is provided, the specified caching policy (PAT index) is
> + * applied to the buffer object.
> + *
> + * Below is an example on how to create an object with specific caching policy:
> + *
> + * .. code-block:: C
> + *
> + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> + *              .pat_index = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = PAGE_SIZE,
> + *              .extensions = (uintptr_t)&set_pat_ext,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + */
> +struct drm_i915_gem_create_ext_set_pat {
> +	/** @base: Extension link. See struct i915_user_extension. */
> +	struct i915_user_extension base;
> +	/** @pat_index: PAT index to be set */
> +	__u32 pat_index;
> +	/** @rsvd: reserved for future use */
> +	__u32 rsvd;
> +};
> +
>  /* ID of the protected content session managed by i915 when PXP is active */
>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>  
> diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
> index 8df261c5ab9b..8cdcdb5fac26 100644
> --- a/tools/include/uapi/drm/i915_drm.h
> +++ b/tools/include/uapi/drm/i915_drm.h
> @@ -3607,9 +3607,13 @@ struct drm_i915_gem_create_ext {
>  	 *
>  	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
>  	 * struct drm_i915_gem_create_ext_protected_content.
> +	 *
> +	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
> +	 * struct drm_i915_gem_create_ext_set_pat.
>  	 */
>  #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>  #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> +#define I915_GEM_CREATE_EXT_SET_PAT 2
>  	__u64 extensions;
>  };
>  
> @@ -3724,6 +3728,38 @@ struct drm_i915_gem_create_ext_protected_content {
>  	__u32 flags;
>  };
>  
> +/**
> + * struct drm_i915_gem_create_ext_set_pat - The
> + * I915_GEM_CREATE_EXT_SET_PAT extension.
> + *
> + * If this extension is provided, the specified caching policy (PAT index) is
> + * applied to the buffer object.
> + *
> + * Below is an example on how to create an object with specific caching policy:
> + *
> + * .. code-block:: C
> + *
> + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> + *              .pat_index = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = PAGE_SIZE,
> + *              .extensions = (uintptr_t)&set_pat_ext,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + */
> +struct drm_i915_gem_create_ext_set_pat {
> +	/** @base: Extension link. See struct i915_user_extension. */
> +	struct i915_user_extension base;
> +	/** @pat_index: PAT index to be set */
> +	__u32 pat_index;
> +	/** @rsvd: reserved for future use */
> +	__u32 rsvd;
> +};
> +
>  /* ID of the protected content session managed by i915 when PXP is active */
>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>  
> -- 
> 2.25.1

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-03 16:02   ` Ville Syrjälä
@ 2023-04-03 16:35     ` Matt Roper
  2023-04-03 16:48       ` Ville Syrjälä
  0 siblings, 1 reply; 35+ messages in thread
From: Matt Roper @ 2023-04-03 16:35 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Chris Wilson, dri-devel

On Mon, Apr 03, 2023 at 07:02:08PM +0300, Ville Syrjälä wrote:
> On Fri, Mar 31, 2023 at 11:38:30PM -0700, fei.yang@intel.com wrote:
> > From: Fei Yang <fei.yang@intel.com>
> > 
> > To comply with the design that buffer objects shall have immutable
> > cache setting through out its life cycle, {set, get}_caching ioctl's
> > are no longer supported from MTL onward. With that change caching
> > policy can only be set at object creation time. The current code
> > applies a default (platform dependent) cache setting for all objects.
> > However this is not optimal for performance tuning. The patch extends
> > the existing gem_create uAPI to let user set PAT index for the object
> > at creation time.
> 
> This is missing the whole justification for the new uapi.
> Why is MOCS not sufficient?

PAT and MOCS are somewhat related, but they're not the same thing.  The
general direction of the hardware architecture recently has been to
slowly dumb down MOCS and move more of the important memory/cache
control over to the PAT instead.  On current platforms there is some
overlap (and MOCS has an "ignore PAT" setting that makes the MOCS "win"
for the specific fields that both can control), but MOCS doesn't have a
way to express things like snoop/coherency mode (on MTL), or class of
service (on PVC).  And if you check some of the future platforms, the
hardware design starts packing even more stuff into the PAT (not just
cache behavior) which will never be handled by MOCS.

Also keep in mind that MOCS generally applies at the GPU instruction
level; although a lot of instructions have a field to provide a MOCS
index, or can use a MOCS already associated with a surface state, there
are still some that don't.  PAT is the source of memory access
characteristics for anything that can't provide a MOCS directly.


Matt

> 
> > The new extension is platform independent, so UMD's can switch to using
> > this extension for older platforms as well, while {set, get}_caching are
> > still supported on these legacy paltforms for compatibility reason.
> > 
> > Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Fei Yang <fei.yang@intel.com>
> > Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
> >  include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
> >  tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
> >  3 files changed, 105 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> > index e76c9703680e..1c6e2034d28e 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> > @@ -244,6 +244,7 @@ struct create_ext {
> >  	unsigned int n_placements;
> >  	unsigned int placement_mask;
> >  	unsigned long flags;
> > +	unsigned int pat_index;
> >  };
> >  
> >  static void repr_placements(char *buf, size_t size,
> > @@ -393,11 +394,39 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
> >  	return 0;
> >  }
> >  
> > +static int ext_set_pat(struct i915_user_extension __user *base, void *data)
> > +{
> > +	struct create_ext *ext_data = data;
> > +	struct drm_i915_private *i915 = ext_data->i915;
> > +	struct drm_i915_gem_create_ext_set_pat ext;
> > +	unsigned int max_pat_index;
> > +
> > +	BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
> > +		     offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
> > +
> > +	if (copy_from_user(&ext, base, sizeof(ext)))
> > +		return -EFAULT;
> > +
> > +	max_pat_index = INTEL_INFO(i915)->max_pat_index;
> > +
> > +	if (ext.pat_index > max_pat_index) {
> > +		drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
> > +			ext.pat_index);
> > +		return -EINVAL;
> > +	}
> > +
> > +	ext_data->pat_index = ext.pat_index;
> > +
> > +	return 0;
> > +}
> > +
> >  static const i915_user_extension_fn create_extensions[] = {
> >  	[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
> >  	[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
> > +	[I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
> >  };
> >  
> > +#define PAT_INDEX_NOT_SET	0xffff
> >  /**
> >   * Creates a new mm object and returns a handle to it.
> >   * @dev: drm device pointer
> > @@ -417,6 +446,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
> >  	if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
> >  		return -EINVAL;
> >  
> > +	ext_data.pat_index = PAT_INDEX_NOT_SET;
> >  	ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
> >  				   create_extensions,
> >  				   ARRAY_SIZE(create_extensions),
> > @@ -453,5 +483,8 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
> >  	if (IS_ERR(obj))
> >  		return PTR_ERR(obj);
> >  
> > +	if (ext_data.pat_index != PAT_INDEX_NOT_SET)
> > +		i915_gem_object_set_pat_index(obj, ext_data.pat_index);
> > +
> >  	return i915_gem_publish(obj, file, &args->size, &args->handle);
> >  }
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index dba7c5a5b25e..03c5c314846e 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
> >  	 *
> >  	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
> >  	 * struct drm_i915_gem_create_ext_protected_content.
> > +	 *
> > +	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
> > +	 * struct drm_i915_gem_create_ext_set_pat.
> >  	 */
> >  #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
> >  #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> > +#define I915_GEM_CREATE_EXT_SET_PAT 2
> >  	__u64 extensions;
> >  };
> >  
> > @@ -3747,6 +3751,38 @@ struct drm_i915_gem_create_ext_protected_content {
> >  	__u32 flags;
> >  };
> >  
> > +/**
> > + * struct drm_i915_gem_create_ext_set_pat - The
> > + * I915_GEM_CREATE_EXT_SET_PAT extension.
> > + *
> > + * If this extension is provided, the specified caching policy (PAT index) is
> > + * applied to the buffer object.
> > + *
> > + * Below is an example on how to create an object with specific caching policy:
> > + *
> > + * .. code-block:: C
> > + *
> > + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> > + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> > + *              .pat_index = 0,
> > + *      };
> > + *      struct drm_i915_gem_create_ext create_ext = {
> > + *              .size = PAGE_SIZE,
> > + *              .extensions = (uintptr_t)&set_pat_ext,
> > + *      };
> > + *
> > + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> > + *      if (err) ...
> > + */
> > +struct drm_i915_gem_create_ext_set_pat {
> > +	/** @base: Extension link. See struct i915_user_extension. */
> > +	struct i915_user_extension base;
> > +	/** @pat_index: PAT index to be set */
> > +	__u32 pat_index;
> > +	/** @rsvd: reserved for future use */
> > +	__u32 rsvd;
> > +};
> > +
> >  /* ID of the protected content session managed by i915 when PXP is active */
> >  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
> >  
> > diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
> > index 8df261c5ab9b..8cdcdb5fac26 100644
> > --- a/tools/include/uapi/drm/i915_drm.h
> > +++ b/tools/include/uapi/drm/i915_drm.h
> > @@ -3607,9 +3607,13 @@ struct drm_i915_gem_create_ext {
> >  	 *
> >  	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
> >  	 * struct drm_i915_gem_create_ext_protected_content.
> > +	 *
> > +	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
> > +	 * struct drm_i915_gem_create_ext_set_pat.
> >  	 */
> >  #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
> >  #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> > +#define I915_GEM_CREATE_EXT_SET_PAT 2
> >  	__u64 extensions;
> >  };
> >  
> > @@ -3724,6 +3728,38 @@ struct drm_i915_gem_create_ext_protected_content {
> >  	__u32 flags;
> >  };
> >  
> > +/**
> > + * struct drm_i915_gem_create_ext_set_pat - The
> > + * I915_GEM_CREATE_EXT_SET_PAT extension.
> > + *
> > + * If this extension is provided, the specified caching policy (PAT index) is
> > + * applied to the buffer object.
> > + *
> > + * Below is an example on how to create an object with specific caching policy:
> > + *
> > + * .. code-block:: C
> > + *
> > + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> > + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> > + *              .pat_index = 0,
> > + *      };
> > + *      struct drm_i915_gem_create_ext create_ext = {
> > + *              .size = PAGE_SIZE,
> > + *              .extensions = (uintptr_t)&set_pat_ext,
> > + *      };
> > + *
> > + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> > + *      if (err) ...
> > + */
> > +struct drm_i915_gem_create_ext_set_pat {
> > +	/** @base: Extension link. See struct i915_user_extension. */
> > +	struct i915_user_extension base;
> > +	/** @pat_index: PAT index to be set */
> > +	__u32 pat_index;
> > +	/** @rsvd: reserved for future use */
> > +	__u32 rsvd;
> > +};
> > +
> >  /* ID of the protected content session managed by i915 when PXP is active */
> >  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
> >  
> > -- 
> > 2.25.1
> 
> -- 
> Ville Syrjälä
> Intel

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-03 16:35     ` Matt Roper
@ 2023-04-03 16:48       ` Ville Syrjälä
  2023-04-04 22:15         ` Kenneth Graunke
  0 siblings, 1 reply; 35+ messages in thread
From: Ville Syrjälä @ 2023-04-03 16:48 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, Chris Wilson, dri-devel

On Mon, Apr 03, 2023 at 09:35:32AM -0700, Matt Roper wrote:
> On Mon, Apr 03, 2023 at 07:02:08PM +0300, Ville Syrjälä wrote:
> > On Fri, Mar 31, 2023 at 11:38:30PM -0700, fei.yang@intel.com wrote:
> > > From: Fei Yang <fei.yang@intel.com>
> > > 
> > > To comply with the design that buffer objects shall have immutable
> > > cache setting through out its life cycle, {set, get}_caching ioctl's
> > > are no longer supported from MTL onward. With that change caching
> > > policy can only be set at object creation time. The current code
> > > applies a default (platform dependent) cache setting for all objects.
> > > However this is not optimal for performance tuning. The patch extends
> > > the existing gem_create uAPI to let user set PAT index for the object
> > > at creation time.
> > 
> > This is missing the whole justification for the new uapi.
> > Why is MOCS not sufficient?
> 
> PAT and MOCS are somewhat related, but they're not the same thing.  The
> general direction of the hardware architecture recently has been to
> slowly dumb down MOCS and move more of the important memory/cache
> control over to the PAT instead.  On current platforms there is some
> overlap (and MOCS has an "ignore PAT" setting that makes the MOCS "win"
> for the specific fields that both can control), but MOCS doesn't have a
> way to express things like snoop/coherency mode (on MTL), or class of
> service (on PVC).  And if you check some of the future platforms, the
> hardware design starts packing even more stuff into the PAT (not just
> cache behavior) which will never be handled by MOCS.

Sigh. So the hardware designers screwed up MOCS yet again and
instead of getting that fixed we are adding a new uapi to work
around it?

The IMO sane approach (which IIRC was the situation for a few
platform generations at least) is that you just shove the PAT
index into MOCS (or tell it to go look it up from the PTE).
Why the heck did they not just stick with that?

> 
> Also keep in mind that MOCS generally applies at the GPU instruction
> level; although a lot of instructions have a field to provide a MOCS
> index, or can use a MOCS already associated with a surface state, there
> are still some that don't. PAT is the source of memory access
> characteristics for anything that can't provide a MOCS directly.

So what are the things that don't have MOCS and where we need
some custom cache behaviour, and we already know all that at
buffer creation time?

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-03 14:50   ` Ville Syrjälä
@ 2023-04-03 16:57     ` Yang, Fei
  2023-04-03 17:14       ` Ville Syrjälä
  0 siblings, 1 reply; 35+ messages in thread
From: Yang, Fei @ 2023-04-03 16:57 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Chris Wilson, intel-gfx, Roper, Matthew D, dri-devel

> Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of cache_level
>
> On Fri, Mar 31, 2023 at 11:38:28PM -0700, fei.yang@intel.com wrote:
>> From: Fei Yang <fei.yang@intel.com>
>> 
>> Currently the KMD is using enum i915_cache_level to set caching policy for
>> buffer objects. This is flaky because the PAT index which really controls
>> the caching behavior in PTE has far more levels than what's defined in the
>> enum.
>
> Then just add more enum values.

That would be really messy because PAT index is platform dependent, you would
have to maintain many tables for the the translation.

> 'pat_index' is absolutely meaningless to the reader, it's just an
> arbitrary number. Whereas 'cache_level' conveys how the thing is
> actually going to get used and thus how the caches should behave.

By design UMD's understand PAT index. Both UMD and KMD should stand on the
same ground, the Bspec, to avoid any potential ambiguity.

>> In addition, the PAT index is platform dependent, having to translate
>> between i915_cache_level and PAT index is not reliable,
>
>If it's not realiable then the code is clearly broken.

Perhaps the word "reliable" is a bit confusing here. What I really meant to
say is 'difficult to maintain', or 'error-prone'.

>> and makes the code more complicated.
>
> You have to translate somewhere anyway. Looks like you're now adding
> translations the other way (pat_index->cache_level). How is that better?

No, there is no pat_index->cache_level translation.
There is only a small table for cache_level->pat_index translation. That is
added for the convenience of KMD coding, no exposure to UMD.

-Fei

>> 
>> >From UMD's perspective there is also a necessity to set caching policy for
>> performance fine tuning. It's much easier for the UMD to directly use PAT
>> index because the behavior of each PAT index is clearly defined in Bspec.
>> Haivng the abstracted i915_cache_level sitting in between would only cause
>> more ambiguity.
>> 
>> For these reasons this patch replaces i915_cache_level with PAT index. Also
>> note, the cache_level is not completely removed yet, because the KMD still
>> has the need of creating buffer objects with simple cache settings such as
>> cached, uncached, or writethrough. For these simple cases, using cache_level
>> would help simplify the code.
>> 
>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
>> Cc: Matt Roper <matthew.d.roper@intel.com>
>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/display/intel_dpt.c      | 12 +--
>>  drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 27 ++----
>>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
>>  drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  3 +-
>>  drivers/gpu/drm/i915/gem/i915_gem_object.c    | 39 ++++++++-
>>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  4 +
>>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 18 ++--
>>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  4 +-
>>  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
>>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
>>  .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
>>  .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
>>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 10 ++-
>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 76 ++++++++---------
>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |  3 +-
>>  drivers/gpu/drm/i915/gt/intel_ggtt.c          | 82 +++++++++----------
>>  drivers/gpu/drm/i915/gt/intel_gtt.h           | 20 ++---
>>  drivers/gpu/drm/i915/gt/intel_migrate.c       | 47 ++++++-----
>>  drivers/gpu/drm/i915/gt/intel_migrate.h       | 13 ++-
>>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +-
>>  drivers/gpu/drm/i915/gt/selftest_migrate.c    | 47 ++++++-----
>>  drivers/gpu/drm/i915/gt/selftest_reset.c      |  8 +-
>>  drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
>>  drivers/gpu/drm/i915/gt/selftest_tlb.c        |  4 +-
>>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 10 ++-
>>  drivers/gpu/drm/i915/i915_debugfs.c           | 55 ++++++++++---
>>  drivers/gpu/drm/i915/i915_gem.c               | 16 +++-
>>  drivers/gpu/drm/i915/i915_gpu_error.c         |  8 +-
>>  drivers/gpu/drm/i915/i915_vma.c               | 16 ++--
>>  drivers/gpu/drm/i915/i915_vma.h               |  2 +-
>>  drivers/gpu/drm/i915/i915_vma_types.h         |  2 -
>>  drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
>>  .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
>>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
>>  .../drm/i915/selftests/intel_memory_region.c  |  4 +-
>>  drivers/gpu/drm/i915/selftests/mock_gtt.c     |  8 +-
>>  36 files changed, 361 insertions(+), 241 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
>> index c5eacfdba1a5..7c5fddb203ba 100644
>> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
>> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
>> @@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>>  static void dpt_insert_page(struct i915_address_space *vm,
>>  			    dma_addr_t addr,
>>  			    u64 offset,
>> -			    enum i915_cache_level level,
>> +			    unsigned int pat_index,
>>  			    u32 flags)
>>  {
>>  	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
>>  	gen8_pte_t __iomem *base = dpt->iomem;
>>  
>>  	gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
>> -		     vm->pte_encode(addr, level, flags));
>> +		     vm->pte_encode(addr, pat_index, flags));
>>  }
>>  
>>  static void dpt_insert_entries(struct i915_address_space *vm,
>>  			       struct i915_vma_resource *vma_res,
>> -			       enum i915_cache_level level,
>> +			       unsigned int pat_index,
>>  			       u32 flags)
>>  {
>>  	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
>>  	gen8_pte_t __iomem *base = dpt->iomem;
>> -	const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
>> +	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
>>  	struct sgt_iter sgt_iter;
>>  	dma_addr_t addr;
>>  	int i;
>> @@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
>>  static void dpt_bind_vma(struct i915_address_space *vm,
>>  			 struct i915_vm_pt_stash *stash,
>>  			 struct i915_vma_resource *vma_res,
>> -			 enum i915_cache_level cache_level,
>> +			 unsigned int pat_index,
>>  			 u32 flags)
>>  {
>>  	u32 pte_flags;
>> @@ -98,7 +98,7 @@ static void dpt_bind_vma(struct i915_address_space *vm,
>>  	if (vma_res->bi.lmem)
>>  		pte_flags |= PTE_LM;
>>  
>> -	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>> +	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>>  
>>  	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>>  
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
>> index 33b73bea1e08..84e0a96f6c71 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
>> @@ -27,8 +27,8 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>>  	if (IS_DGFX(i915))
>>  		return false;
>>  
>> -	return !(obj->cache_level == I915_CACHE_NONE ||
>> -		 obj->cache_level == I915_CACHE_WT);
>> +	return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
>> +		 i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
>>  }
>>  
>>  bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>> @@ -265,7 +265,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>>  {
>>  	int ret;
>>  
>> -	if (obj->cache_level == cache_level)
>> +	if (i915_gem_object_has_cache_level(obj, cache_level))
>>  		return 0;
>>  
>>  	ret = i915_gem_object_wait(obj,
>> @@ -276,10 +276,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>>  		return ret;
>>  
>>  	/* Always invalidate stale cachelines */
>> -	if (obj->cache_level != cache_level) {
>> -		i915_gem_object_set_cache_coherency(obj, cache_level);
>> -		obj->cache_dirty = true;
>> -	}
>> +	i915_gem_object_set_cache_coherency(obj, cache_level);
>> +	obj->cache_dirty = true;
>>  
>>  	/* The cache-level will be applied when each vma is rebound. */
>>  	return i915_gem_object_unbind(obj,
>> @@ -304,20 +302,13 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
>>  		goto out;
>>  	}
>>  
>> -	switch (obj->cache_level) {
>> -	case I915_CACHE_LLC:
>> -	case I915_CACHE_L3_LLC:
>> +	if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC) ||
>> +	    i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
>>  		args->caching = I915_CACHING_CACHED;
>> -		break;
>> -
>> -	case I915_CACHE_WT:
>> +	else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
>>  		args->caching = I915_CACHING_DISPLAY;
>> -		break;
>> -
>> -	default:
>> +	else
>>  		args->caching = I915_CACHING_NONE;
>> -		break;
>> -	}
>>  out:
>>  	rcu_read_unlock();
>>  	return err;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 3aeede6aee4d..d42915516636 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -642,7 +642,7 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
>>  
>>  	return (cache->has_llc ||
>>  		obj->cache_dirty ||
>> -		obj->cache_level != I915_CACHE_NONE);
>> +		!i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));
>>  }
>>  
>>  static int eb_reserve_vma(struct i915_execbuffer *eb,
>> @@ -1323,8 +1323,10 @@ static void *reloc_iomap(struct i915_vma *batch,
>>  	offset = cache->node.start;
>>  	if (drm_mm_node_allocated(&cache->node)) {
>>  		ggtt->vm.insert_page(&ggtt->vm,
>> -				     i915_gem_object_get_dma_address(obj, page),
>> -				     offset, I915_CACHE_NONE, 0);
>> +			i915_gem_object_get_dma_address(obj, page),
>> +			offset,
>> +			i915_gem_get_pat_index(ggtt->vm.i915, I915_CACHE_NONE),
>> +			0);
>>  	} else {
>>  		offset += page << PAGE_SHIFT;
>>  	}
>> @@ -1464,7 +1466,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
>>  			reloc_cache_unmap(&eb->reloc_cache);
>>  			mutex_lock(&vma->vm->mutex);
>>  			err = i915_vma_bind(target->vma,
>> -					    target->vma->obj->cache_level,
>> +					    target->vma->obj->pat_index,
>>  					    PIN_GLOBAL, NULL, NULL);
>>  			mutex_unlock(&vma->vm->mutex);
>>  			reloc_cache_remap(&eb->reloc_cache, ev->vma->obj);
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index d3c1dee16af2..6c242f9ffc75 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -383,7 +383,8 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
>>  	}
>>  
>>  	/* Access to snoopable pages through the GTT is incoherent. */
>> -	if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(i915)) {
>> +	if (!(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
>> +	      HAS_LLC(i915))) {
>>  		ret = -EFAULT;
>>  		goto err_unpin;
>>  	}
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>> index 1295bb812866..2894ed9156c7 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
>> @@ -54,6 +54,12 @@ unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
>>  	return INTEL_INFO(i915)->cachelevel_to_pat[level];
>>  }
>>  
>> +bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
>> +				     enum i915_cache_level lvl)
>> +{
>> +	return obj->pat_index == i915_gem_get_pat_index(obj_to_i915(obj), lvl);
>> +}
>> +
>>  struct drm_i915_gem_object *i915_gem_object_alloc(void)
>>  {
>>  	struct drm_i915_gem_object *obj;
>> @@ -133,7 +139,7 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>>  {
>>  	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>>  
>> -	obj->cache_level = cache_level;
>> +	obj->pat_index = i915_gem_get_pat_index(i915, cache_level);
>>  
>>  	if (cache_level != I915_CACHE_NONE)
>>  		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
>> @@ -148,6 +154,37 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>>  		!IS_DGFX(i915);
>>  }
>>  
>> +/**
>> + * i915_gem_object_set_pat_index - set PAT index to be used in PTE encode
>> + * @obj: #drm_i915_gem_object
>> + * @pat_index: PAT index
>> + *
>> + * This is a clone of i915_gem_object_set_cache_coherency taking pat index
>> + * instead of cache_level as its second argument.
>> + */
>> +void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
>> +				   unsigned int pat_index)
>> +{
>> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>> +
>> +	if (obj->pat_index == pat_index)
>> +		return;
>> +
>> +	obj->pat_index = pat_index;
>> +
>> +	if (pat_index != i915_gem_get_pat_index(i915, I915_CACHE_NONE))
>> +		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
>> +				       I915_BO_CACHE_COHERENT_FOR_WRITE);
>> +	else if (HAS_LLC(i915))
>> +		obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
>> +	else
>> +		obj->cache_coherent = 0;
>> +
>> +	obj->cache_dirty =
>> +		!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&
>> +		!IS_DGFX(i915);
>> +}
>> +
>>  bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)
>>  {
>>  	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>> index 4c92e17b4337..6f00aab10015 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
>> @@ -34,6 +34,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
>>  
>>  unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
>>  				    enum i915_cache_level level);
>> +bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
>> +				     enum i915_cache_level lvl);
>>  void i915_gem_init__objects(struct drm_i915_private *i915);
>>  
>>  void i915_objects_module_exit(void);
>> @@ -764,6 +766,8 @@ bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj);
>>  
>>  void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>>  					 unsigned int cache_level);
>> +void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
>> +				   unsigned int pat_index);
>>  bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj);
>>  void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
>>  void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj);
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>> index 890f3ad497c5..9c70dedf25cc 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
>> @@ -351,12 +351,20 @@ struct drm_i915_gem_object {
>>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>>  #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */
>>  	/**
>> -	 * @cache_level: The desired GTT caching level.
>> -	 *
>> -	 * See enum i915_cache_level for possible values, along with what
>> -	 * each does.
>> +	 * @pat_index: The desired PAT index.
>> +	 *
>> +	 * See hardware specification for valid PAT indices for each platform.
>> +	 * This field used to contain a value of enum i915_cache_level. It's
>> +	 * changed to an unsigned int because PAT indices are being used by
>> +	 * both UMD and KMD for caching policy control after GEN12.
>> +	 * For backward compatibility, this field will continue to contain
>> +	 * value of i915_cache_level for pre-GEN12 platforms so that the PTE
>> +	 * encode functions for these legacy platforms can stay the same.
>> +	 * In the meantime platform specific tables are created to translate
>> +	 * i915_cache_level into pat index, for more details check the macros
>> +	 * defined i915/i915_pci.c, e.g. PVC_CACHELEVEL.
>>  	 */
>> -	unsigned int cache_level:3;
>> +	unsigned int pat_index:6;
>>  	/**
>>  	 * @cache_coherent:
>>  	 *
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> index 8ac376c24aa2..9f379141f966 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
>> @@ -557,7 +557,9 @@ static void dbg_poison(struct i915_ggtt *ggtt,
>>  
>>  		ggtt->vm.insert_page(&ggtt->vm, addr,
>>  				     ggtt->error_capture.start,
>> -				     I915_CACHE_NONE, 0);
>> +				     i915_gem_get_pat_index(ggtt->vm.i915,
>> +							    I915_CACHE_NONE),
>> +				     0);
>>  		mb();
>>  
>>  		s = io_mapping_map_wc(&ggtt->iomap,
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> index d030182ca176..7eadb7d68d47 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
>> @@ -214,7 +214,8 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,
>>  
>>  		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
>>  		ret = intel_context_migrate_clear(to_gt(i915)->migrate.context, deps,
>> -						  dst_st->sgl, dst_level,
>> +						  dst_st->sgl,
>> +						  i915_gem_get_pat_index(i915, dst_level),
>>  						  i915_ttm_gtt_binds_lmem(dst_mem),
>>  						  0, &rq);
>>  	} else {
>> @@ -227,12 +228,13 @@ static struct dma_fence *i915_ttm_accel_move(struct ttm_buffer_object *bo,
>>  		src_level = i915_ttm_cache_level(i915, bo->resource, src_ttm);
>>  		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
>>  		ret = intel_context_migrate_copy(to_gt(i915)->migrate.context,
>> -						 deps, src_rsgt->table.sgl,
>> -						 src_level,
>> -						 i915_ttm_gtt_binds_lmem(bo->resource),
>> -						 dst_st->sgl, dst_level,
>> -						 i915_ttm_gtt_binds_lmem(dst_mem),
>> -						 &rq);
>> +					deps, src_rsgt->table.sgl,
>> +					i915_gem_get_pat_index(i915, src_level),
>> +					i915_ttm_gtt_binds_lmem(bo->resource),
>> +					dst_st->sgl,
>> +					i915_gem_get_pat_index(i915, dst_level),
>> +					i915_ttm_gtt_binds_lmem(dst_mem),
>> +					&rq);
>>  
>>  		i915_refct_sgt_put(src_rsgt);
>>  	}
>> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>> index defece0bcb81..ebb68ac9cd5e 100644
>> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>> @@ -354,7 +354,7 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)
>>  
>>  	obj->write_domain = I915_GEM_DOMAIN_CPU;
>>  	obj->read_domains = I915_GEM_DOMAIN_CPU;
>> -	obj->cache_level = I915_CACHE_NONE;
>> +	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
>>  
>>  	return obj;
>>  }
>> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
>> index fe6c37fd7859..a93a90b15907 100644
>> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
>> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
>> @@ -219,7 +219,7 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
>>  			continue;
>>  
>>  		err = intel_migrate_clear(&gt->migrate, &ww, deps,
>> -					  obj->mm.pages->sgl, obj->cache_level,
>> +					  obj->mm.pages->sgl, obj->pat_index,
>>  					  i915_gem_object_is_lmem(obj),
>>  					  0xdeadbeaf, &rq);
>>  		if (rq) {
>> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
>> index 56279908ed30..a93d8f9f8bc1 100644
>> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
>> @@ -1222,7 +1222,7 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
>>  	}
>>  
>>  	err = intel_context_migrate_clear(to_gt(i915)->migrate.context, NULL,
>> -					  obj->mm.pages->sgl, obj->cache_level,
>> +					  obj->mm.pages->sgl, obj->pat_index,
>>  					  i915_gem_object_is_lmem(obj),
>>  					  expand32(POISON_INUSE), &rq);
>>  	i915_gem_object_unpin_pages(obj);
>> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
>> index 5aaacc53fa4c..c2bdc133c89a 100644
>> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
>> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
>> @@ -109,7 +109,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>>  
>>  static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>>  				      struct i915_vma_resource *vma_res,
>> -				      enum i915_cache_level cache_level,
>> +				      unsigned int pat_index,
>>  				      u32 flags)
>>  {
>>  	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
>> @@ -117,7 +117,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>>  	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
>>  	unsigned int act_pt = first_entry / GEN6_PTES;
>>  	unsigned int act_pte = first_entry % GEN6_PTES;
>> -	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
>> +	const u32 pte_encode = vm->pte_encode(0, pat_index, flags);
>>  	struct sgt_dma iter = sgt_dma(vma_res);
>>  	gen6_pte_t *vaddr;
>>  
>> @@ -227,7 +227,9 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
>>  
>>  	vm->scratch[0]->encode =
>>  		vm->pte_encode(px_dma(vm->scratch[0]),
>> -			       I915_CACHE_NONE, PTE_READ_ONLY);
>> +			       i915_gem_get_pat_index(vm->i915,
>> +						      I915_CACHE_NONE),
>> +			       PTE_READ_ONLY);
>>  
>>  	vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
>>  	if (IS_ERR(vm->scratch[1])) {
>> @@ -278,7 +280,7 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
>>  static void pd_vma_bind(struct i915_address_space *vm,
>>  			struct i915_vm_pt_stash *stash,
>>  			struct i915_vma_resource *vma_res,
>> -			enum i915_cache_level cache_level,
>> +			unsigned int pat_index,
>>  			u32 unused)
>>  {
>>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> index 3ae41a13d28d..f76ec2cb29ef 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> @@ -15,6 +15,11 @@
>>  #include "intel_gt.h"
>>  #include "intel_gtt.h"
>>  
>> +/**
>> + * For pre-gen12 platforms pat_index is the same as enum i915_cache_level,
>> + * so the code here is still valid. See translation table defined by
>> + * LEGACY_CACHELEVEL
>> + */
>>  static u64 gen8_pde_encode(const dma_addr_t addr,
>>  			   const enum i915_cache_level level)
>>  {
>> @@ -56,7 +61,7 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>>  }
>>  
>>  static u64 mtl_pte_encode(dma_addr_t addr,
>> -			  enum i915_cache_level level,
>> +			  unsigned int pat_index,
>>  			  u32 flags)
>>  {
>>  	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
>> @@ -67,24 +72,17 @@ static u64 mtl_pte_encode(dma_addr_t addr,
>>  	if (flags & PTE_LM)
>>  		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
>>  
>> -	switch (level) {
>> -	case I915_CACHE_NONE:
>> -		pte |= GEN12_PPGTT_PTE_PAT1;
>> -		break;
>> -	case I915_CACHE_LLC:
>> -	case I915_CACHE_L3_LLC:
>> -		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
>> -		break;
>> -	case I915_CACHE_WT:
>> +	if (pat_index & BIT(0))
>>  		pte |= GEN12_PPGTT_PTE_PAT0;
>> -		break;
>> -	default:
>> -		/* This should never happen. Added to deal with the compile
>> -		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
>> -		 * be removed by the pat_index patch.
>> -		 */
>> -		break;
>> -	}
>> +
>> +	if (pat_index & BIT(1))
>> +		pte |= GEN12_PPGTT_PTE_PAT1;
>> +
>> +	if (pat_index & BIT(2))
>> +		pte |= GEN12_PPGTT_PTE_PAT2;
>> +
>> +	if (pat_index & BIT(3))
>> +		pte |= GEN12_PPGTT_PTE_PAT3;
>>  
>>  	return pte;
>>  }
>> @@ -457,11 +455,11 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>>  		      struct i915_page_directory *pdp,
>>  		      struct sgt_dma *iter,
>>  		      u64 idx,
>> -		      enum i915_cache_level cache_level,
>> +		      unsigned int pat_index,
>>  		      u32 flags)
>>  {
>>  	struct i915_page_directory *pd;
>> -	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
>> +	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, pat_index, flags);
>>  	gen8_pte_t *vaddr;
>>  
>>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
>> @@ -504,10 +502,10 @@ static void
>>  xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
>>  			  struct i915_vma_resource *vma_res,
>>  			  struct sgt_dma *iter,
>> -			  enum i915_cache_level cache_level,
>> +			  unsigned int pat_index,
>>  			  u32 flags)
>>  {
>> -	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
>> +	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
>>  	unsigned int rem = sg_dma_len(iter->sg);
>>  	u64 start = vma_res->start;
>>  	u64 end = start + vma_res->vma_size;
>> @@ -611,10 +609,10 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
>>  static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>>  				   struct i915_vma_resource *vma_res,
>>  				   struct sgt_dma *iter,
>> -				   enum i915_cache_level cache_level,
>> +				   unsigned int pat_index,
>>  				   u32 flags)
>>  {
>> -	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
>> +	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
>>  	unsigned int rem = sg_dma_len(iter->sg);
>>  	u64 start = vma_res->start;
>>  
>> @@ -734,7 +732,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>>  
>>  static void gen8_ppgtt_insert(struct i915_address_space *vm,
>>  			      struct i915_vma_resource *vma_res,
>> -			      enum i915_cache_level cache_level,
>> +			      unsigned int pat_index,
>>  			      u32 flags)
>>  {
>>  	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
>> @@ -742,9 +740,9 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>>  
>>  	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
>>  		if (HAS_64K_PAGES(vm->i915))
>> -			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
>> +			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
>>  		else
>> -			gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
>> +			gen8_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
>>  	} else  {
>>  		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
>>  
>> @@ -753,7 +751,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>>  				gen8_pdp_for_page_index(vm, idx);
>>  
>>  			idx = gen8_ppgtt_insert_pte(ppgtt, pdp, &iter, idx,
>> -						    cache_level, flags);
>> +						    pat_index, flags);
>>  		} while (idx);
>>  
>>  		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>> @@ -763,7 +761,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>>  static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>>  				    dma_addr_t addr,
>>  				    u64 offset,
>> -				    enum i915_cache_level level,
>> +				    unsigned int pat_index,
>>  				    u32 flags)
>>  {
>>  	u64 idx = offset >> GEN8_PTE_SHIFT;
>> @@ -777,14 +775,14 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>>  	GEM_BUG_ON(pt->is_compact);
>>  
>>  	vaddr = px_vaddr(pt);
>> -	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
>> +	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, pat_index, flags);
>>  	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
>>  }
>>  
>>  static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>>  					    dma_addr_t addr,
>>  					    u64 offset,
>> -					    enum i915_cache_level level,
>> +					    unsigned int pat_index,
>>  					    u32 flags)
>>  {
>>  	u64 idx = offset >> GEN8_PTE_SHIFT;
>> @@ -807,20 +805,20 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>>  	}
>>  
>>  	vaddr = px_vaddr(pt);
>> -	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
>> +	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, pat_index, flags);
>>  }
>>  
>>  static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
>>  				       dma_addr_t addr,
>>  				       u64 offset,
>> -				       enum i915_cache_level level,
>> +				       unsigned int pat_index,
>>  				       u32 flags)
>>  {
>>  	if (flags & PTE_LM)
>>  		return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
>> -						       level, flags);
>> +						       pat_index, flags);
>>  
>> -	return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
>> +	return gen8_ppgtt_insert_entry(vm, addr, offset, pat_index, flags);
>>  }
>>  
>>  static int gen8_init_scratch(struct i915_address_space *vm)
>> @@ -855,7 +853,9 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>>  
>>  	vm->scratch[0]->encode =
>>  		vm->pte_encode(px_dma(vm->scratch[0]),
>> -				I915_CACHE_NONE, pte_flags);
>> +			       i915_gem_get_pat_index(vm->i915,
>> +						      I915_CACHE_NONE),
>> +			       pte_flags);
>>  
>>  	for (i = 1; i <= vm->top; i++) {
>>  		struct drm_i915_gem_object *obj;
>> @@ -873,7 +873,9 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>>  		}
>>  
>>  		fill_px(obj, vm->scratch[i - 1]->encode);
>> -		obj->encode = gen8_pde_encode(px_dma(obj), I915_CACHE_NONE);
>> +		obj->encode = gen8_pde_encode(px_dma(obj),
>> +					      i915_gem_get_pat_index(vm->i915,
>> +							I915_CACHE_NONE));
>>  
>>  		vm->scratch[i] = obj;
>>  	}
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> index 6b8ce7f4d25a..98e260e1a081 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> @@ -10,13 +10,12 @@
>>  
>>  struct i915_address_space;
>>  struct intel_gt;
>> -enum i915_cache_level;
>>  
>>  struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>>  				     unsigned long lmem_pt_obj_flags);
>>  
>>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>> -			 enum i915_cache_level level,
>> +			 unsigned int pat_index,
>>  			 u32 flags);
>>  u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>>  			unsigned int pat_index,
>> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> index 91056b9a60a9..66a4955f19e4 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> @@ -221,7 +221,7 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>>  }
>>  
>>  u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>> -			enum i915_cache_level level,
>> +			unsigned int pat_index,
>>  			u32 flags)
>>  {
>>  	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
>> @@ -231,30 +231,17 @@ u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>>  	if (flags & PTE_LM)
>>  		pte |= GEN12_GGTT_PTE_LM;
>>  
>> -	switch (level) {
>> -	case I915_CACHE_NONE:
>> -		pte |= MTL_GGTT_PTE_PAT1;
>> -		break;
>> -	case I915_CACHE_LLC:
>> -	case I915_CACHE_L3_LLC:
>> -		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
>> -		break;
>> -	case I915_CACHE_WT:
>> +	if (pat_index & BIT(0))
>>  		pte |= MTL_GGTT_PTE_PAT0;
>> -		break;
>> -	default:
>> -		/* This should never happen. Added to deal with the compile
>> -		 * error due to the addition of I915_MAX_CACHE_LEVEL. Will
>> -		 * be removed by the pat_index patch.
>> -		 */
>> -		break;
>> -	}
>> +
>> +	if (pat_index & BIT(1))
>> +		pte |= MTL_GGTT_PTE_PAT1;
>>  
>>  	return pte;
>>  }
>>  
>>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>> -			 enum i915_cache_level level,
>> +			 unsigned int pat_index,
>>  			 u32 flags)
>>  {
>>  	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
>> @@ -273,25 +260,25 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
>>  static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>>  				  dma_addr_t addr,
>>  				  u64 offset,
>> -				  enum i915_cache_level level,
>> +				  unsigned int pat_index,
>>  				  u32 flags)
>>  {
>>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>>  	gen8_pte_t __iomem *pte =
>>  		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>>  
>> -	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
>> +	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, pat_index, flags));
>>  
>>  	ggtt->invalidate(ggtt);
>>  }
>>  
>>  static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>>  				     struct i915_vma_resource *vma_res,
>> -				     enum i915_cache_level level,
>> +				     unsigned int pat_index,
>>  				     u32 flags)
>>  {
>>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>> -	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
>> +	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, pat_index, flags);
>>  	gen8_pte_t __iomem *gte;
>>  	gen8_pte_t __iomem *end;
>>  	struct sgt_iter iter;
>> @@ -348,14 +335,14 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
>>  static void gen6_ggtt_insert_page(struct i915_address_space *vm,
>>  				  dma_addr_t addr,
>>  				  u64 offset,
>> -				  enum i915_cache_level level,
>> +				  unsigned int pat_index,
>>  				  u32 flags)
>>  {
>>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>>  	gen6_pte_t __iomem *pte =
>>  		(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>>  
>> -	iowrite32(vm->pte_encode(addr, level, flags), pte);
>> +	iowrite32(vm->pte_encode(addr, pat_index, flags), pte);
>>  
>>  	ggtt->invalidate(ggtt);
>>  }
>> @@ -368,7 +355,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
>>   */
>>  static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>>  				     struct i915_vma_resource *vma_res,
>> -				     enum i915_cache_level level,
>> +				     unsigned int pat_index,
>>  				     u32 flags)
>>  {
>>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>> @@ -385,7 +372,7 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>>  		iowrite32(vm->scratch[0]->encode, gte++);
>>  	end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
>>  	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
>> -		iowrite32(vm->pte_encode(addr, level, flags), gte++);
>> +		iowrite32(vm->pte_encode(addr, pat_index, flags), gte++);
>>  	GEM_BUG_ON(gte > end);
>>  
>>  	/* Fill the allocated but "unused" space beyond the end of the buffer */
>> @@ -420,14 +407,15 @@ struct insert_page {
>>  	struct i915_address_space *vm;
>>  	dma_addr_t addr;
>>  	u64 offset;
>> -	enum i915_cache_level level;
>> +	unsigned int pat_index;
>>  };
>>  
>>  static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
>>  {
>>  	struct insert_page *arg = _arg;
>>  
>> -	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
>> +	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset,
>> +			      arg->pat_index, 0);
>>  	bxt_vtd_ggtt_wa(arg->vm);
>>  
>>  	return 0;
>> @@ -436,10 +424,10 @@ static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
>>  static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
>>  					  dma_addr_t addr,
>>  					  u64 offset,
>> -					  enum i915_cache_level level,
>> +					  unsigned int pat_index,
>>  					  u32 unused)
>>  {
>> -	struct insert_page arg = { vm, addr, offset, level };
>> +	struct insert_page arg = { vm, addr, offset, pat_index };
>>  
>>  	stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
>>  }
>> @@ -447,7 +435,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
>>  struct insert_entries {
>>  	struct i915_address_space *vm;
>>  	struct i915_vma_resource *vma_res;
>> -	enum i915_cache_level level;
>> +	unsigned int pat_index;
>>  	u32 flags;
>>  };
>>  
>> @@ -455,7 +443,8 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
>>  {
>>  	struct insert_entries *arg = _arg;
>>  
>> -	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
>> +	gen8_ggtt_insert_entries(arg->vm, arg->vma_res,
>> +				 arg->pat_index, arg->flags);
>>  	bxt_vtd_ggtt_wa(arg->vm);
>>  
>>  	return 0;
>> @@ -463,10 +452,10 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
>>  
>>  static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
>>  					     struct i915_vma_resource *vma_res,
>> -					     enum i915_cache_level level,
>> +					     unsigned int pat_index,
>>  					     u32 flags)
>>  {
>> -	struct insert_entries arg = { vm, vma_res, level, flags };
>> +	struct insert_entries arg = { vm, vma_res, pat_index, flags };
>>  
>>  	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
>>  }
>> @@ -495,7 +484,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>>  void intel_ggtt_bind_vma(struct i915_address_space *vm,
>>  			 struct i915_vm_pt_stash *stash,
>>  			 struct i915_vma_resource *vma_res,
>> -			 enum i915_cache_level cache_level,
>> +			 unsigned int pat_index,
>>  			 u32 flags)
>>  {
>>  	u32 pte_flags;
>> @@ -512,7 +501,7 @@ void intel_ggtt_bind_vma(struct i915_address_space *vm,
>>  	if (vma_res->bi.lmem)
>>  		pte_flags |= PTE_LM;
>>  
>> -	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>> +	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>>  	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>>  }
>>  
>> @@ -661,7 +650,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
>>  static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>>  				  struct i915_vm_pt_stash *stash,
>>  				  struct i915_vma_resource *vma_res,
>> -				  enum i915_cache_level cache_level,
>> +				  unsigned int pat_index,
>>  				  u32 flags)
>>  {
>>  	u32 pte_flags;
>> @@ -673,10 +662,10 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>>  
>>  	if (flags & I915_VMA_LOCAL_BIND)
>>  		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
>> -			       stash, vma_res, cache_level, flags);
>> +			       stash, vma_res, pat_index, flags);
>>  
>>  	if (flags & I915_VMA_GLOBAL_BIND)
>> -		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>> +		vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>>  
>>  	vma_res->bound_flags |= flags;
>>  }
>> @@ -933,7 +922,9 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
>>  
>>  	ggtt->vm.scratch[0]->encode =
>>  		ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
>> -				    I915_CACHE_NONE, pte_flags);
>> +				    i915_gem_get_pat_index(i915,
>> +							   I915_CACHE_NONE),
>> +				    pte_flags);
>>  
>>  	return 0;
>>  }
>> @@ -1022,6 +1013,11 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>>  	return ggtt_probe_common(ggtt, size);
>>  }
>>  
>> +/*
>> + * For pre-gen8 platforms pat_index is the same as enum i915_cache_level,
>> + * so these PTE encode functions are left with using cache_level.
>> + * See translation table LEGACY_CACHELEVEL.
>> + */
>>  static u64 snb_pte_encode(dma_addr_t addr,
>>  			  enum i915_cache_level level,
>>  			  u32 flags)
>> @@ -1302,7 +1298,9 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
>>  		 */
>>  		vma->resource->bound_flags = 0;
>>  		vma->ops->bind_vma(vm, NULL, vma->resource,
>> -				   obj ? obj->cache_level : 0,
>> +				   obj ? obj->pat_index :
>> +					 i915_gem_get_pat_index(vm->i915,
>> +							I915_CACHE_NONE),
>>  				   was_bound);
>>  
>>  		if (obj) { /* only used during resume => exclusive access */
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> index b632167eaf2e..12bd4398ad38 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> @@ -165,8 +165,6 @@ typedef u64 gen8_pte_t;
>>  #define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
>>  #define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
>>  
>> -enum i915_cache_level;
>> -
>>  struct drm_i915_gem_object;
>>  struct i915_fence_reg;
>>  struct i915_vma;
>> @@ -234,7 +232,7 @@ struct i915_vma_ops {
>>  	void (*bind_vma)(struct i915_address_space *vm,
>>  			 struct i915_vm_pt_stash *stash,
>>  			 struct i915_vma_resource *vma_res,
>> -			 enum i915_cache_level cache_level,
>> +			 unsigned int pat_index,
>>  			 u32 flags);
>>  	/*
>>  	 * Unmap an object from an address space. This usually consists of
>> @@ -306,7 +304,7 @@ struct i915_address_space {
>>  		(*alloc_scratch_dma)(struct i915_address_space *vm, int sz);
>>  
>>  	u64 (*pte_encode)(dma_addr_t addr,
>> -			  enum i915_cache_level level,
>> +			  unsigned int pat_index,
>>  			  u32 flags); /* Create a valid PTE */
>>  #define PTE_READ_ONLY	BIT(0)
>>  #define PTE_LM		BIT(1)
>> @@ -321,20 +319,20 @@ struct i915_address_space {
>>  	void (*insert_page)(struct i915_address_space *vm,
>>  			    dma_addr_t addr,
>>  			    u64 offset,
>> -			    enum i915_cache_level cache_level,
>> +			    unsigned int pat_index,
>>  			    u32 flags);
>>  	void (*insert_entries)(struct i915_address_space *vm,
>>  			       struct i915_vma_resource *vma_res,
>> -			       enum i915_cache_level cache_level,
>> +			       unsigned int pat_index,
>>  			       u32 flags);
>>  	void (*raw_insert_page)(struct i915_address_space *vm,
>>  				dma_addr_t addr,
>>  				u64 offset,
>> -				enum i915_cache_level cache_level,
>> +				unsigned int pat_index,
>>  				u32 flags);
>>  	void (*raw_insert_entries)(struct i915_address_space *vm,
>>  				   struct i915_vma_resource *vma_res,
>> -				   enum i915_cache_level cache_level,
>> +				   unsigned int pat_index,
>>  				   u32 flags);
>>  	void (*cleanup)(struct i915_address_space *vm);
>>  
>> @@ -581,7 +579,7 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
>>  void intel_ggtt_bind_vma(struct i915_address_space *vm,
>>  			 struct i915_vm_pt_stash *stash,
>>  			 struct i915_vma_resource *vma_res,
>> -			 enum i915_cache_level cache_level,
>> +			 unsigned int pat_index,
>>  			 u32 flags);
>>  void intel_ggtt_unbind_vma(struct i915_address_space *vm,
>>  			   struct i915_vma_resource *vma_res);
>> @@ -639,7 +637,7 @@ void
>>  __set_pd_entry(struct i915_page_directory * const pd,
>>  	       const unsigned short idx,
>>  	       struct i915_page_table *pt,
>> -	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level));
>> +	       u64 (*encode)(const dma_addr_t, const unsigned int pat_index));
>>  
>>  #define set_pd_entry(pd, idx, to) \
>>  	__set_pd_entry((pd), (idx), px_pt(to), gen8_pde_encode)
>> @@ -659,7 +657,7 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
>>  void ppgtt_bind_vma(struct i915_address_space *vm,
>>  		    struct i915_vm_pt_stash *stash,
>>  		    struct i915_vma_resource *vma_res,
>> -		    enum i915_cache_level cache_level,
>> +		    unsigned int pat_index,
>>  		    u32 flags);
>>  void ppgtt_unbind_vma(struct i915_address_space *vm,
>>  		      struct i915_vma_resource *vma_res);
>> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
>> index 3f638f198796..117c3d05af3e 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
>> @@ -45,7 +45,9 @@ static void xehpsdv_toggle_pdes(struct i915_address_space *vm,
>>  	 * Insert a dummy PTE into every PT that will map to LMEM to ensure
>>  	 * we have a correctly setup PDE structure for later use.
>>  	 */
>> -	vm->insert_page(vm, 0, d->offset, I915_CACHE_NONE, PTE_LM);
>> +	vm->insert_page(vm, 0, d->offset,
>> +			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
>> +			PTE_LM);
>>  	GEM_BUG_ON(!pt->is_compact);
>>  	d->offset += SZ_2M;
>>  }
>> @@ -63,7 +65,9 @@ static void xehpsdv_insert_pte(struct i915_address_space *vm,
>>  	 * alignment is 64K underneath for the pt, and we are careful
>>  	 * not to access the space in the void.
>>  	 */
>> -	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, PTE_LM);
>> +	vm->insert_page(vm, px_dma(pt), d->offset,
>> +			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
>> +			PTE_LM);
>>  	d->offset += SZ_64K;
>>  }
>>  
>> @@ -73,7 +77,8 @@ static void insert_pte(struct i915_address_space *vm,
>>  {
>>  	struct insert_pte_data *d = data;
>>  
>> -	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
>> +	vm->insert_page(vm, px_dma(pt), d->offset,
>> +			i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE),
>>  			i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0);
>>  	d->offset += PAGE_SIZE;
>>  }
>> @@ -356,13 +361,13 @@ static int max_pte_pkt_size(struct i915_request *rq, int pkt)
>>  
>>  static int emit_pte(struct i915_request *rq,
>>  		    struct sgt_dma *it,
>> -		    enum i915_cache_level cache_level,
>> +		    unsigned int pat_index,
>>  		    bool is_lmem,
>>  		    u64 offset,
>>  		    int length)
>>  {
>>  	bool has_64K_pages = HAS_64K_PAGES(rq->engine->i915);
>> -	const u64 encode = rq->context->vm->pte_encode(0, cache_level,
>> +	const u64 encode = rq->context->vm->pte_encode(0, pat_index,
>>  						       is_lmem ? PTE_LM : 0);
>>  	struct intel_ring *ring = rq->ring;
>>  	int pkt, dword_length;
>> @@ -673,17 +678,17 @@ int
>>  intel_context_migrate_copy(struct intel_context *ce,
>>  			   const struct i915_deps *deps,
>>  			   struct scatterlist *src,
>> -			   enum i915_cache_level src_cache_level,
>> +			   unsigned int src_pat_index,
>>  			   bool src_is_lmem,
>>  			   struct scatterlist *dst,
>> -			   enum i915_cache_level dst_cache_level,
>> +			   unsigned int dst_pat_index,
>>  			   bool dst_is_lmem,
>>  			   struct i915_request **out)
>>  {
>>  	struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs;
>>  	struct drm_i915_private *i915 = ce->engine->i915;
>>  	u64 ccs_bytes_to_cpy = 0, bytes_to_cpy;
>> -	enum i915_cache_level ccs_cache_level;
>> +	unsigned int ccs_pat_index;
>>  	u32 src_offset, dst_offset;
>>  	u8 src_access, dst_access;
>>  	struct i915_request *rq;
>> @@ -707,12 +712,12 @@ intel_context_migrate_copy(struct intel_context *ce,
>>  		dst_sz = scatter_list_length(dst);
>>  		if (src_is_lmem) {
>>  			it_ccs = it_dst;
>> -			ccs_cache_level = dst_cache_level;
>> +			ccs_pat_index = dst_pat_index;
>>  			ccs_is_src = false;
>>  		} else if (dst_is_lmem) {
>>  			bytes_to_cpy = dst_sz;
>>  			it_ccs = it_src;
>> -			ccs_cache_level = src_cache_level;
>> +			ccs_pat_index = src_pat_index;
>>  			ccs_is_src = true;
>>  		}
>>  
>> @@ -773,7 +778,7 @@ intel_context_migrate_copy(struct intel_context *ce,
>>  		src_sz = calculate_chunk_sz(i915, src_is_lmem,
>>  					    bytes_to_cpy, ccs_bytes_to_cpy);
>>  
>> -		len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem,
>> +		len = emit_pte(rq, &it_src, src_pat_index, src_is_lmem,
>>  			       src_offset, src_sz);
>>  		if (!len) {
>>  			err = -EINVAL;
>> @@ -784,7 +789,7 @@ intel_context_migrate_copy(struct intel_context *ce,
>>  			goto out_rq;
>>  		}
>>  
>> -		err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem,
>> +		err = emit_pte(rq, &it_dst, dst_pat_index, dst_is_lmem,
>>  			       dst_offset, len);
>>  		if (err < 0)
>>  			goto out_rq;
>> @@ -811,7 +816,7 @@ intel_context_migrate_copy(struct intel_context *ce,
>>  				goto out_rq;
>>  
>>  			ccs_sz = GET_CCS_BYTES(i915, len);
>> -			err = emit_pte(rq, &it_ccs, ccs_cache_level, false,
>> +			err = emit_pte(rq, &it_ccs, ccs_pat_index, false,
>>  				       ccs_is_src ? src_offset : dst_offset,
>>  				       ccs_sz);
>>  			if (err < 0)
>> @@ -979,7 +984,7 @@ int
>>  intel_context_migrate_clear(struct intel_context *ce,
>>  			    const struct i915_deps *deps,
>>  			    struct scatterlist *sg,
>> -			    enum i915_cache_level cache_level,
>> +			    unsigned int pat_index,
>>  			    bool is_lmem,
>>  			    u32 value,
>>  			    struct i915_request **out)
>> @@ -1027,7 +1032,7 @@ intel_context_migrate_clear(struct intel_context *ce,
>>  		if (err)
>>  			goto out_rq;
>>  
>> -		len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ);
>> +		len = emit_pte(rq, &it, pat_index, is_lmem, offset, CHUNK_SZ);
>>  		if (len <= 0) {
>>  			err = len;
>>  			goto out_rq;
>> @@ -1074,10 +1079,10 @@ int intel_migrate_copy(struct intel_migrate *m,
>>  		       struct i915_gem_ww_ctx *ww,
>>  		       const struct i915_deps *deps,
>>  		       struct scatterlist *src,
>> -		       enum i915_cache_level src_cache_level,
>> +		       unsigned int src_pat_index,
>>  		       bool src_is_lmem,
>>  		       struct scatterlist *dst,
>> -		       enum i915_cache_level dst_cache_level,
>> +		       unsigned int dst_pat_index,
>>  		       bool dst_is_lmem,
>>  		       struct i915_request **out)
>>  {
>> @@ -1098,8 +1103,8 @@ int intel_migrate_copy(struct intel_migrate *m,
>>  		goto out;
>>  
>>  	err = intel_context_migrate_copy(ce, deps,
>> -					 src, src_cache_level, src_is_lmem,
>> -					 dst, dst_cache_level, dst_is_lmem,
>> +					 src, src_pat_index, src_is_lmem,
>> +					 dst, dst_pat_index, dst_is_lmem,
>>  					 out);
>>  
>>  	intel_context_unpin(ce);
>> @@ -1113,7 +1118,7 @@ intel_migrate_clear(struct intel_migrate *m,
>>  		    struct i915_gem_ww_ctx *ww,
>>  		    const struct i915_deps *deps,
>>  		    struct scatterlist *sg,
>> -		    enum i915_cache_level cache_level,
>> +		    unsigned int pat_index,
>>  		    bool is_lmem,
>>  		    u32 value,
>>  		    struct i915_request **out)
>> @@ -1134,7 +1139,7 @@ intel_migrate_clear(struct intel_migrate *m,
>>  	if (err)
>>  		goto out;
>>  
>> -	err = intel_context_migrate_clear(ce, deps, sg, cache_level,
>> +	err = intel_context_migrate_clear(ce, deps, sg, pat_index,
>>  					  is_lmem, value, out);
>>  
>>  	intel_context_unpin(ce);
>> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h
>> index ccc677ec4aa3..11fc09a00c4b 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_migrate.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.h
>> @@ -16,7 +16,6 @@ struct i915_request;
>>  struct i915_gem_ww_ctx;
>>  struct intel_gt;
>>  struct scatterlist;
>> -enum i915_cache_level;
>>  
>>  int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
>>  
>> @@ -26,20 +25,20 @@ int intel_migrate_copy(struct intel_migrate *m,
>>  		       struct i915_gem_ww_ctx *ww,
>>  		       const struct i915_deps *deps,
>>  		       struct scatterlist *src,
>> -		       enum i915_cache_level src_cache_level,
>> +		       unsigned int src_pat_index,
>>  		       bool src_is_lmem,
>>  		       struct scatterlist *dst,
>> -		       enum i915_cache_level dst_cache_level,
>> +		       unsigned int dst_pat_index,
>>  		       bool dst_is_lmem,
>>  		       struct i915_request **out);
>>  
>>  int intel_context_migrate_copy(struct intel_context *ce,
>>  			       const struct i915_deps *deps,
>>  			       struct scatterlist *src,
>> -			       enum i915_cache_level src_cache_level,
>> +			       unsigned int src_pat_index,
>>  			       bool src_is_lmem,
>>  			       struct scatterlist *dst,
>> -			       enum i915_cache_level dst_cache_level,
>> +			       unsigned int dst_pat_index,
>>  			       bool dst_is_lmem,
>>  			       struct i915_request **out);
>>  
>> @@ -48,7 +47,7 @@ intel_migrate_clear(struct intel_migrate *m,
>>  		    struct i915_gem_ww_ctx *ww,
>>  		    const struct i915_deps *deps,
>>  		    struct scatterlist *sg,
>> -		    enum i915_cache_level cache_level,
>> +		    unsigned int pat_index,
>>  		    bool is_lmem,
>>  		    u32 value,
>>  		    struct i915_request **out);
>> @@ -56,7 +55,7 @@ int
>>  intel_context_migrate_clear(struct intel_context *ce,
>>  			    const struct i915_deps *deps,
>>  			    struct scatterlist *sg,
>> -			    enum i915_cache_level cache_level,
>> +			    unsigned int pat_index,
>>  			    bool is_lmem,
>>  			    u32 value,
>>  			    struct i915_request **out);
>> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
>> index 7ecfa672f738..f0da3555c6db 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
>> @@ -98,7 +98,7 @@ void
>>  __set_pd_entry(struct i915_page_directory * const pd,
>>  	       const unsigned short idx,
>>  	       struct i915_page_table * const to,
>> -	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level))
>> +	       u64 (*encode)(const dma_addr_t, const unsigned int))
>>  {
>>  	/* Each thread pre-pins the pd, and we may have a thread per pde. */
>>  	GEM_BUG_ON(atomic_read(px_used(pd)) > NALLOC * I915_PDES);
>> @@ -181,7 +181,7 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
>>  void ppgtt_bind_vma(struct i915_address_space *vm,
>>  		    struct i915_vm_pt_stash *stash,
>>  		    struct i915_vma_resource *vma_res,
>> -		    enum i915_cache_level cache_level,
>> +		    unsigned int pat_index,
>>  		    u32 flags)
>>  {
>>  	u32 pte_flags;
>> @@ -199,7 +199,7 @@ void ppgtt_bind_vma(struct i915_address_space *vm,
>>  	if (vma_res->bi.lmem)
>>  		pte_flags |= PTE_LM;
>>  
>> -	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>> +	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
>>  	wmb();
>>  }
>>  
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
>> index e677f2da093d..3def5ca72dec 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
>> @@ -137,7 +137,7 @@ static int copy(struct intel_migrate *migrate,
>>  static int intel_context_copy_ccs(struct intel_context *ce,
>>  				  const struct i915_deps *deps,
>>  				  struct scatterlist *sg,
>> -				  enum i915_cache_level cache_level,
>> +				  unsigned int pat_index,
>>  				  bool write_to_ccs,
>>  				  struct i915_request **out)
>>  {
>> @@ -185,7 +185,7 @@ static int intel_context_copy_ccs(struct intel_context *ce,
>>  		if (err)
>>  			goto out_rq;
>>  
>> -		len = emit_pte(rq, &it, cache_level, true, offset, CHUNK_SZ);
>> +		len = emit_pte(rq, &it, pat_index, true, offset, CHUNK_SZ);
>>  		if (len <= 0) {
>>  			err = len;
>>  			goto out_rq;
>> @@ -223,7 +223,7 @@ intel_migrate_ccs_copy(struct intel_migrate *m,
>>  		       struct i915_gem_ww_ctx *ww,
>>  		       const struct i915_deps *deps,
>>  		       struct scatterlist *sg,
>> -		       enum i915_cache_level cache_level,
>> +		       unsigned int pat_index,
>>  		       bool write_to_ccs,
>>  		       struct i915_request **out)
>>  {
>> @@ -243,7 +243,7 @@ intel_migrate_ccs_copy(struct intel_migrate *m,
>>  	if (err)
>>  		goto out;
>>  
>> -	err = intel_context_copy_ccs(ce, deps, sg, cache_level,
>> +	err = intel_context_copy_ccs(ce, deps, sg, pat_index,
>>  				     write_to_ccs, out);
>>  
>>  	intel_context_unpin(ce);
>> @@ -300,7 +300,7 @@ static int clear(struct intel_migrate *migrate,
>>  			/* Write the obj data into ccs surface */
>>  			err = intel_migrate_ccs_copy(migrate, &ww, NULL,
>>  						     obj->mm.pages->sgl,
>> -						     obj->cache_level,
>> +						     obj->pat_index,
>>  						     true, &rq);
>>  			if (rq && !err) {
>>  				if (i915_request_wait(rq, 0, HZ) < 0) {
>> @@ -351,7 +351,7 @@ static int clear(struct intel_migrate *migrate,
>>  
>>  			err = intel_migrate_ccs_copy(migrate, &ww, NULL,
>>  						     obj->mm.pages->sgl,
>> -						     obj->cache_level,
>> +						     obj->pat_index,
>>  						     false, &rq);
>>  			if (rq && !err) {
>>  				if (i915_request_wait(rq, 0, HZ) < 0) {
>> @@ -414,9 +414,9 @@ static int __migrate_copy(struct intel_migrate *migrate,
>>  			  struct i915_request **out)
>>  {
>>  	return intel_migrate_copy(migrate, ww, NULL,
>> -				  src->mm.pages->sgl, src->cache_level,
>> +				  src->mm.pages->sgl, src->pat_index,
>>  				  i915_gem_object_is_lmem(src),
>> -				  dst->mm.pages->sgl, dst->cache_level,
>> +				  dst->mm.pages->sgl, dst->pat_index,
>>  				  i915_gem_object_is_lmem(dst),
>>  				  out);
>>  }
>> @@ -428,9 +428,9 @@ static int __global_copy(struct intel_migrate *migrate,
>>  			 struct i915_request **out)
>>  {
>>  	return intel_context_migrate_copy(migrate->context, NULL,
>> -					  src->mm.pages->sgl, src->cache_level,
>> +					  src->mm.pages->sgl, src->pat_index,
>>  					  i915_gem_object_is_lmem(src),
>> -					  dst->mm.pages->sgl, dst->cache_level,
>> +					  dst->mm.pages->sgl, dst->pat_index,
>>  					  i915_gem_object_is_lmem(dst),
>>  					  out);
>>  }
>> @@ -455,7 +455,7 @@ static int __migrate_clear(struct intel_migrate *migrate,
>>  {
>>  	return intel_migrate_clear(migrate, ww, NULL,
>>  				   obj->mm.pages->sgl,
>> -				   obj->cache_level,
>> +				   obj->pat_index,
>>  				   i915_gem_object_is_lmem(obj),
>>  				   value, out);
>>  }
>> @@ -468,7 +468,7 @@ static int __global_clear(struct intel_migrate *migrate,
>>  {
>>  	return intel_context_migrate_clear(migrate->context, NULL,
>>  					   obj->mm.pages->sgl,
>> -					   obj->cache_level,
>> +					   obj->pat_index,
>>  					   i915_gem_object_is_lmem(obj),
>>  					   value, out);
>>  }
>> @@ -648,7 +648,7 @@ static int live_emit_pte_full_ring(void *arg)
>>  	 */
>>  	pr_info("%s emite_pte ring space=%u\n", __func__, rq->ring->space);
>>  	it = sg_sgt(obj->mm.pages->sgl);
>> -	len = emit_pte(rq, &it, obj->cache_level, false, 0, CHUNK_SZ);
>> +	len = emit_pte(rq, &it, obj->pat_index, false, 0, CHUNK_SZ);
>>  	if (!len) {
>>  		err = -EINVAL;
>>  		goto out_rq;
>> @@ -844,7 +844,7 @@ static int wrap_ktime_compare(const void *A, const void *B)
>>  
>>  static int __perf_clear_blt(struct intel_context *ce,
>>  			    struct scatterlist *sg,
>> -			    enum i915_cache_level cache_level,
>> +			    unsigned int pat_index,
>>  			    bool is_lmem,
>>  			    size_t sz)
>>  {
>> @@ -858,7 +858,7 @@ static int __perf_clear_blt(struct intel_context *ce,
>>  
>>  		t0 = ktime_get();
>>  
>> -		err = intel_context_migrate_clear(ce, NULL, sg, cache_level,
>> +		err = intel_context_migrate_clear(ce, NULL, sg, pat_index,
>>  						  is_lmem, 0, &rq);
>>  		if (rq) {
>>  			if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
>> @@ -904,7 +904,8 @@ static int perf_clear_blt(void *arg)
>>  
>>  		err = __perf_clear_blt(gt->migrate.context,
>>  				       dst->mm.pages->sgl,
>> -				       I915_CACHE_NONE,
>> +				       i915_gem_get_pat_index(gt->i915,
>> +							      I915_CACHE_NONE),
>>  				       i915_gem_object_is_lmem(dst),
>>  				       sizes[i]);
>>  
>> @@ -919,10 +920,10 @@ static int perf_clear_blt(void *arg)
>>  
>>  static int __perf_copy_blt(struct intel_context *ce,
>>  			   struct scatterlist *src,
>> -			   enum i915_cache_level src_cache_level,
>> +			   unsigned int src_pat_index,
>>  			   bool src_is_lmem,
>>  			   struct scatterlist *dst,
>> -			   enum i915_cache_level dst_cache_level,
>> +			   unsigned int dst_pat_index,
>>  			   bool dst_is_lmem,
>>  			   size_t sz)
>>  {
>> @@ -937,9 +938,9 @@ static int __perf_copy_blt(struct intel_context *ce,
>>  		t0 = ktime_get();
>>  
>>  		err = intel_context_migrate_copy(ce, NULL,
>> -						 src, src_cache_level,
>> +						 src, src_pat_index,
>>  						 src_is_lmem,
>> -						 dst, dst_cache_level,
>> +						 dst, dst_pat_index,
>>  						 dst_is_lmem,
>>  						 &rq);
>>  		if (rq) {
>> @@ -994,10 +995,12 @@ static int perf_copy_blt(void *arg)
>>  
>>  		err = __perf_copy_blt(gt->migrate.context,
>>  				      src->mm.pages->sgl,
>> -				      I915_CACHE_NONE,
>> +				      i915_gem_get_pat_index(gt->i915,
>> +							     I915_CACHE_NONE),
>>  				      i915_gem_object_is_lmem(src),
>>  				      dst->mm.pages->sgl,
>> -				      I915_CACHE_NONE,
>> +				      i915_gem_get_pat_index(gt->i915,
>> +							     I915_CACHE_NONE),
>>  				      i915_gem_object_is_lmem(dst),
>>  				      sz);
>>  
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c
>> index a9e0a91bc0e0..79aa6ac66ad2 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c
>> @@ -86,7 +86,9 @@ __igt_reset_stolen(struct intel_gt *gt,
>>  
>>  		ggtt->vm.insert_page(&ggtt->vm, dma,
>>  				     ggtt->error_capture.start,
>> -				     I915_CACHE_NONE, 0);
>> +				     i915_gem_get_pat_index(gt->i915,
>> +							    I915_CACHE_NONE),
>> +				     0);
>>  		mb();
>>  
>>  		s = io_mapping_map_wc(&ggtt->iomap,
>> @@ -127,7 +129,9 @@ __igt_reset_stolen(struct intel_gt *gt,
>>  
>>  		ggtt->vm.insert_page(&ggtt->vm, dma,
>>  				     ggtt->error_capture.start,
>> -				     I915_CACHE_NONE, 0);
>> +				     i915_gem_get_pat_index(gt->i915,
>> +							    I915_CACHE_NONE),
>> +				     0);
>>  		mb();
>>  
>>  		s = io_mapping_map_wc(&ggtt->iomap,
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
>> index 9f536c251179..39c3ec12df1a 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
>> @@ -836,7 +836,7 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt,
>>  		return PTR_ERR(obj);
>>  
>>  	/* keep the same cache settings as timeline */
>> -	i915_gem_object_set_cache_coherency(obj, tl->hwsp_ggtt->obj->cache_level);
>> +	i915_gem_object_set_pat_index(obj, tl->hwsp_ggtt->obj->pat_index);
>>  	w->map = i915_gem_object_pin_map_unlocked(obj,
>>  						  page_unmask_bits(tl->hwsp_ggtt->obj->mm.mapping));
>>  	if (IS_ERR(w->map)) {
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_tlb.c b/drivers/gpu/drm/i915/gt/selftest_tlb.c
>> index e6cac1f15d6e..4493c8518e91 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_tlb.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_tlb.c
>> @@ -36,6 +36,8 @@ pte_tlbinv(struct intel_context *ce,
>>  	   u64 length,
>>  	   struct rnd_state *prng)
>>  {
>> +	const unsigned int pat_index =
>> +		i915_gem_get_pat_index(ce->vm->i915, I915_CACHE_NONE);
>>  	struct drm_i915_gem_object *batch;
>>  	struct drm_mm_node vb_node;
>>  	struct i915_request *rq;
>> @@ -155,7 +157,7 @@ pte_tlbinv(struct intel_context *ce,
>>  		/* Flip the PTE between A and B */
>>  		if (i915_gem_object_is_lmem(vb->obj))
>>  			pte_flags |= PTE_LM;
>> -		ce->vm->insert_entries(ce->vm, &vb_res, 0, pte_flags);
>> +		ce->vm->insert_entries(ce->vm, &vb_res, pat_index, pte_flags);
>>  
>>  		/* Flush the PTE update to concurrent HW */
>>  		tlbinv(ce->vm, addr & -length, length);
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>> index 264c952f777b..31182915f3d2 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>> @@ -876,9 +876,15 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
>>  		pte_flags |= PTE_LM;
>>  
>>  	if (ggtt->vm.raw_insert_entries)
>> -		ggtt->vm.raw_insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
>> +		ggtt->vm.raw_insert_entries(&ggtt->vm, dummy,
>> +					    i915_gem_get_pat_index(ggtt->vm.i915,
>> +								   I915_CACHE_NONE),
>> +					    pte_flags);
>>  	else
>> -		ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
>> +		ggtt->vm.insert_entries(&ggtt->vm, dummy,
>> +					i915_gem_get_pat_index(ggtt->vm.i915,
>> +							       I915_CACHE_NONE),
>> +					pte_flags);
>>  }
>>  
>>  static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 80c2bf98e341..1c407d59ff3d 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -138,21 +138,56 @@ static const char *stringify_vma_type(const struct i915_vma *vma)
>>  	return "ppgtt";
>>  }
>>  
>> -static const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
>> -{
>> -	switch (type) {
>> -	case I915_CACHE_NONE: return " uncached";
>> -	case I915_CACHE_LLC: return HAS_LLC(i915) ? " LLC" : " snooped";
>> -	case I915_CACHE_L3_LLC: return " L3+LLC";
>> -	case I915_CACHE_WT: return " WT";
>> -	default: return "";
>> +static const char *i915_cache_level_str(struct drm_i915_gem_object *obj)
>> +{
>> +	struct drm_i915_private *i915 = obj_to_i915(obj);
>> +
>> +	if (IS_METEORLAKE(i915)) {
>> +		switch (obj->pat_index) {
>> +		case 0: return " WB";
>> +		case 1: return " WT";
>> +		case 2: return " UC";
>> +		case 3: return " WB (1-Way Coh)";
>> +		case 4: return " WB (2-Way Coh)";
>> +		default: return " not defined";
>> +		}
>> +	} else if (IS_PONTEVECCHIO(i915)) {
>> +		switch (obj->pat_index) {
>> +		case 0: return " UC";
>> +		case 1: return " WC";
>> +		case 2: return " WT";
>> +		case 3: return " WB";
>> +		case 4: return " WT (CLOS1)";
>> +		case 5: return " WB (CLOS1)";
>> +		case 6: return " WT (CLOS2)";
>> +		case 7: return " WT (CLOS2)";
>> +		default: return " not defined";
>> +		}
>> +	} else if (GRAPHICS_VER(i915) >= 12) {
>> +		switch (obj->pat_index) {
>> +		case 0: return " WB";
>> +		case 1: return " WC";
>> +		case 2: return " WT";
>> +		case 3: return " UC";
>> +		default: return " not defined";
>> +		}
>> +	} else {
>> +		if (i915_gem_object_has_cache_level(obj, I915_CACHE_NONE))
>> +			return " uncached";
>> +		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC))
>> +			return HAS_LLC(i915) ? " LLC" : " snooped";
>> +		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
>> +			return " L3+LLC";
>> +		else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
>> +			return " WT";
>> +		else
>> +			return " not defined";
>>  	}
>>  }
>>  
>>  void
>>  i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>  {
>> -	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
>>  	struct i915_vma *vma;
>>  	int pin_count = 0;
>>  
>> @@ -164,7 +199,7 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>>  		   obj->base.size / 1024,
>>  		   obj->read_domains,
>>  		   obj->write_domain,
>> -		   i915_cache_level_str(dev_priv, obj->cache_level),
>> +		   i915_cache_level_str(obj),
>>  		   obj->mm.dirty ? " dirty" : "",
>>  		   obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");
>>  	if (obj->base.name)
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 2ba922fbbd5f..fbeddf81e729 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -420,8 +420,12 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
>>  		page_length = remain < page_length ? remain : page_length;
>>  		if (drm_mm_node_allocated(&node)) {
>>  			ggtt->vm.insert_page(&ggtt->vm,
>> -					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
>> -					     node.start, I915_CACHE_NONE, 0);
>> +					i915_gem_object_get_dma_address(obj,
>> +							offset >> PAGE_SHIFT),
>> +					node.start,
>> +					i915_gem_get_pat_index(i915,
>> +							       I915_CACHE_NONE),
>> +					0);
>>  		} else {
>>  			page_base += offset & PAGE_MASK;
>>  		}
>> @@ -598,8 +602,12 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>>  			/* flush the write before we modify the GGTT */
>>  			intel_gt_flush_ggtt_writes(ggtt->vm.gt);
>>  			ggtt->vm.insert_page(&ggtt->vm,
>> -					     i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
>> -					     node.start, I915_CACHE_NONE, 0);
>> +					i915_gem_object_get_dma_address(obj,
>> +							offset >> PAGE_SHIFT),
>> +					node.start,
>> +					i915_gem_get_pat_index(i915,
>> +							       I915_CACHE_NONE),
>> +					0);
>>  			wmb(); /* flush modifications to the GGTT (insert_page) */
>>  		} else {
>>  			page_base += offset & PAGE_MASK;
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index f020c0086fbc..54f17ba3b03c 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1117,10 +1117,14 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>>  			mutex_lock(&ggtt->error_mutex);
>>  			if (ggtt->vm.raw_insert_page)
>>  				ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot,
>> -							 I915_CACHE_NONE, 0);
>> +						i915_gem_get_pat_index(gt->i915,
>> +							I915_CACHE_NONE),
>> +						0);
>>  			else
>>  				ggtt->vm.insert_page(&ggtt->vm, dma, slot,
>> -						     I915_CACHE_NONE, 0);
>> +						i915_gem_get_pat_index(gt->i915,
>> +							I915_CACHE_NONE),
>> +						0);
>>  			mb();
>>  
>>  			s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE);
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index f51fd9fd4c89..e5f5368b175f 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -315,7 +315,7 @@ struct i915_vma_work {
>>  	struct i915_vma_resource *vma_res;
>>  	struct drm_i915_gem_object *obj;
>>  	struct i915_sw_dma_fence_cb cb;
>> -	enum i915_cache_level cache_level;
>> +	unsigned int pat_index;
>>  	unsigned int flags;
>>  };
>>  
>> @@ -334,7 +334,7 @@ static void __vma_bind(struct dma_fence_work *work)
>>  		return;
>>  
>>  	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
>> -			       vma_res, vw->cache_level, vw->flags);
>> +			       vma_res, vw->pat_index, vw->flags);
>>  }
>>  
>>  static void __vma_release(struct dma_fence_work *work)
>> @@ -426,7 +426,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
>>  /**
>>   * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
>>   * @vma: VMA to map
>> - * @cache_level: mapping cache level
>> + * @pat_index: PAT index to set in PTE
>>   * @flags: flags like global or local mapping
>>   * @work: preallocated worker for allocating and binding the PTE
>>   * @vma_res: pointer to a preallocated vma resource. The resource is either
>> @@ -437,7 +437,7 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
>>   * Note that DMA addresses are also the only part of the SG table we care about.
>>   */
>>  int i915_vma_bind(struct i915_vma *vma,
>> -		  enum i915_cache_level cache_level,
>> +		  unsigned int pat_index,
>>  		  u32 flags,
>>  		  struct i915_vma_work *work,
>>  		  struct i915_vma_resource *vma_res)
>> @@ -507,7 +507,7 @@ int i915_vma_bind(struct i915_vma *vma,
>>  		struct dma_fence *prev;
>>  
>>  		work->vma_res = i915_vma_resource_get(vma->resource);
>> -		work->cache_level = cache_level;
>> +		work->pat_index = pat_index;
>>  		work->flags = bind_flags;
>>  
>>  		/*
>> @@ -537,7 +537,7 @@ int i915_vma_bind(struct i915_vma *vma,
>>  
>>  			return ret;
>>  		}
>> -		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
>> +		vma->ops->bind_vma(vma->vm, NULL, vma->resource, pat_index,
>>  				   bind_flags);
>>  	}
>>  
>> @@ -813,7 +813,7 @@ i915_vma_insert(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>>  	color = 0;
>>  
>>  	if (i915_vm_has_cache_coloring(vma->vm))
>> -		color = vma->obj->cache_level;
>> +		color = vma->obj->pat_index;
>>  
>>  	if (flags & PIN_OFFSET_FIXED) {
>>  		u64 offset = flags & PIN_OFFSET_MASK;
>> @@ -1517,7 +1517,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>>  
>>  	GEM_BUG_ON(!vma->pages);
>>  	err = i915_vma_bind(vma,
>> -			    vma->obj->cache_level,
>> +			    vma->obj->pat_index,
>>  			    flags, work, vma_res);
>>  	vma_res = NULL;
>>  	if (err)
>> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
>> index ed5c9d682a1b..31a8f8aa5558 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.h
>> +++ b/drivers/gpu/drm/i915/i915_vma.h
>> @@ -250,7 +250,7 @@ i915_vma_compare(struct i915_vma *vma,
>>  
>>  struct i915_vma_work *i915_vma_work(void);
>>  int i915_vma_bind(struct i915_vma *vma,
>> -		  enum i915_cache_level cache_level,
>> +		  unsigned int pat_index,
>>  		  u32 flags,
>>  		  struct i915_vma_work *work,
>>  		  struct i915_vma_resource *vma_res);
>> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
>> index 77fda2244d16..64472b7f0e77 100644
>> --- a/drivers/gpu/drm/i915/i915_vma_types.h
>> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
>> @@ -32,8 +32,6 @@
>>  
>>  #include "gem/i915_gem_object_types.h"
>>  
>> -enum i915_cache_level;
>> -
>>  /**
>>   * DOC: Global GTT views
>>   *
>> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
>> index d91d0ade8abd..bde981a8f23f 100644
>> --- a/drivers/gpu/drm/i915/selftests/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
>> @@ -57,7 +57,10 @@ static void trash_stolen(struct drm_i915_private *i915)
>>  		u32 __iomem *s;
>>  		int x;
>>  
>> -		ggtt->vm.insert_page(&ggtt->vm, dma, slot, I915_CACHE_NONE, 0);
>> +		ggtt->vm.insert_page(&ggtt->vm, dma, slot,
>> +				     i915_gem_get_pat_index(i915,
>> +							I915_CACHE_NONE),
>> +				     0);
>>  
>>  		s = io_mapping_map_atomic_wc(&ggtt->iomap, slot);
>>  		for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {
>> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
>> index 37068542aafe..f13a4d265814 100644
>> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
>> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
>> @@ -245,7 +245,7 @@ static int igt_evict_for_cache_color(void *arg)
>>  	struct drm_mm_node target = {
>>  		.start = I915_GTT_PAGE_SIZE * 2,
>>  		.size = I915_GTT_PAGE_SIZE,
>> -		.color = I915_CACHE_LLC,
>> +		.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_LLC),
>>  	};
>>  	struct drm_i915_gem_object *obj;
>>  	struct i915_vma *vma;
>> @@ -308,7 +308,7 @@ static int igt_evict_for_cache_color(void *arg)
>>  	/* Attempt to remove the first *pinned* vma, by removing the (empty)
>>  	 * neighbour -- this should fail.
>>  	 */
>> -	target.color = I915_CACHE_L3_LLC;
>> +	target.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_L3_LLC);
>>  
>>  	mutex_lock(&ggtt->vm.mutex);
>>  	err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0);
>> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
>> index 5361ce70d3f2..0b6350eb4dad 100644
>> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
>> @@ -133,7 +133,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)
>>  
>>  	obj->write_domain = I915_GEM_DOMAIN_CPU;
>>  	obj->read_domains = I915_GEM_DOMAIN_CPU;
>> -	obj->cache_level = I915_CACHE_NONE;
>> +	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
>>  
>>  	/* Preallocate the "backing storage" */
>>  	if (i915_gem_object_pin_pages_unlocked(obj))
>> @@ -357,7 +357,9 @@ static int lowlevel_hole(struct i915_address_space *vm,
>>  
>>  			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
>>  			  vm->insert_entries(vm, mock_vma_res,
>> -						   I915_CACHE_NONE, 0);
>> +					     i915_gem_get_pat_index(vm->i915,
>> +						     I915_CACHE_NONE),
>> +					     0);
>>  		}
>>  		count = n;
>>  
>> @@ -1375,7 +1377,10 @@ static int igt_ggtt_page(void *arg)
>>  
>>  		ggtt->vm.insert_page(&ggtt->vm,
>>  				     i915_gem_object_get_dma_address(obj, 0),
>> -				     offset, I915_CACHE_NONE, 0);
>> +				     offset,
>> +				     i915_gem_get_pat_index(i915,
>> +					                    I915_CACHE_NONE),
>> +				     0);
>>  	}
>>  
>>  	order = i915_random_order(count, &prng);
>> @@ -1508,7 +1513,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
>>  	mutex_lock(&vm->mutex);
>>  	err = i915_gem_gtt_reserve(vm, NULL, &vma->node, obj->base.size,
>>  				   offset,
>> -				   obj->cache_level,
>> +				   obj->pat_index,
>>  				   0);
>>  	if (!err) {
>>  		i915_vma_resource_init_from_vma(vma_res, vma);
>> @@ -1688,7 +1693,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
>>  
>>  	mutex_lock(&vm->mutex);
>>  	err = i915_gem_gtt_insert(vm, NULL, &vma->node, obj->base.size, 0,
>> -				  obj->cache_level, 0, vm->total, 0);
>> +				  obj->pat_index, 0, vm->total, 0);
>>  	if (!err) {
>>  		i915_vma_resource_init_from_vma(vma_res, vma);
>>  		vma->resource = vma_res;
>> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
>> index 3b18e5905c86..cce180114d0c 100644
>> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
>> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
>> @@ -1070,7 +1070,9 @@ static int igt_lmem_write_cpu(void *arg)
>>  	/* Put the pages into a known state -- from the gpu for added fun */
>>  	intel_engine_pm_get(engine);
>>  	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
>> -					  obj->mm.pages->sgl, I915_CACHE_NONE,
>> +					  obj->mm.pages->sgl,
>> +					  i915_gem_get_pat_index(i915,
>> +							I915_CACHE_NONE),
>>  					  true, 0xdeadbeaf, &rq);
>>  	if (rq) {
>>  		dma_resv_add_fence(obj->base.resv, &rq->fence,
>> diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
>> index ece97e4faacb..a516c0aa88fd 100644
>> --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
>> +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
>> @@ -27,21 +27,21 @@
>>  static void mock_insert_page(struct i915_address_space *vm,
>>  			     dma_addr_t addr,
>>  			     u64 offset,
>> -			     enum i915_cache_level level,
>> +			     unsigned int pat_index,
>>  			     u32 flags)
>>  {
>>  }
>>  
>>  static void mock_insert_entries(struct i915_address_space *vm,
>>  				struct i915_vma_resource *vma_res,
>> -				enum i915_cache_level level, u32 flags)
>> +				unsigned int pat_index, u32 flags)
>>  {
>>  }
>>  
>>  static void mock_bind_ppgtt(struct i915_address_space *vm,
>>  			    struct i915_vm_pt_stash *stash,
>>  			    struct i915_vma_resource *vma_res,
>> -			    enum i915_cache_level cache_level,
>> +			    unsigned int pat_index,
>>  			    u32 flags)
>>  {
>>  	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
>> @@ -94,7 +94,7 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
>>  static void mock_bind_ggtt(struct i915_address_space *vm,
>>  			   struct i915_vm_pt_stash *stash,
>>  			   struct i915_vma_resource *vma_res,
>> -			   enum i915_cache_level cache_level,
>> +			   unsigned int pat_index,
>>  			   u32 flags)
>>  {
>>  }
>> -- 
>> 2.25.1
>
> -- 
> Ville Syrjälä
> Intel
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-03 16:57     ` Yang, Fei
@ 2023-04-03 17:14       ` Ville Syrjälä
  2023-04-03 19:39         ` Yang, Fei
  0 siblings, 1 reply; 35+ messages in thread
From: Ville Syrjälä @ 2023-04-03 17:14 UTC (permalink / raw)
  To: Yang, Fei; +Cc: Chris Wilson, intel-gfx, Roper, Matthew D, dri-devel

On Mon, Apr 03, 2023 at 04:57:21PM +0000, Yang, Fei wrote:
> > Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of cache_level
> >
> > On Fri, Mar 31, 2023 at 11:38:28PM -0700, fei.yang@intel.com wrote:
> >> From: Fei Yang <fei.yang@intel.com>
> >> 
> >> Currently the KMD is using enum i915_cache_level to set caching policy for
> >> buffer objects. This is flaky because the PAT index which really controls
> >> the caching behavior in PTE has far more levels than what's defined in the
> >> enum.
> >
> > Then just add more enum values.
> 
> That would be really messy because PAT index is platform dependent, you would
> have to maintain many tables for the the translation.
> 
> > 'pat_index' is absolutely meaningless to the reader, it's just an
> > arbitrary number. Whereas 'cache_level' conveys how the thing is
> > actually going to get used and thus how the caches should behave.
> 
> By design UMD's understand PAT index. Both UMD and KMD should stand on the
> same ground, the Bspec, to avoid any potential ambiguity.
> 
> >> In addition, the PAT index is platform dependent, having to translate
> >> between i915_cache_level and PAT index is not reliable,
> >
> >If it's not realiable then the code is clearly broken.
> 
> Perhaps the word "reliable" is a bit confusing here. What I really meant to
> say is 'difficult to maintain', or 'error-prone'.
> 
> >> and makes the code more complicated.
> >
> > You have to translate somewhere anyway. Looks like you're now adding
> > translations the other way (pat_index->cache_level). How is that better?
> 
> No, there is no pat_index->cache_level translation.

i915_gem_object_has_cache_level() is exactly that. And that one
does look actually fragile since it assumes only one PAT index
maps to each cache level. So if the user picks any other pat_index
anything using i915_gem_object_has_cache_level() is likely to
do the wrong thing.

If we do switch to pat_index then I think cache_level should
be made a purely uapi concept, and all the internal code should
instead be made to query various aspects of the caching behaviour
of the current pat_index (eg. is LLC caching enabled, and thus
do I need to clflush?).

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-03 17:14       ` Ville Syrjälä
@ 2023-04-03 19:39         ` Yang, Fei
  2023-04-03 19:52           ` Ville Syrjälä
  0 siblings, 1 reply; 35+ messages in thread
From: Yang, Fei @ 2023-04-03 19:39 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Chris Wilson, intel-gfx, Roper, Matthew D, dri-devel

>Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of cache_level
>
>On Mon, Apr 03, 2023 at 04:57:21PM +0000, Yang, Fei wrote:
>>> Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of
>>> cache_level
>>>
>>> On Fri, Mar 31, 2023 at 11:38:28PM -0700, fei.yang@intel.com wrote:
>>>> From: Fei Yang <fei.yang@intel.com>
>>>>
>>>> Currently the KMD is using enum i915_cache_level to set caching
>>>> policy for buffer objects. This is flaky because the PAT index
>>>> which really controls the caching behavior in PTE has far more
>>>> levels than what's defined in the enum.
>>>
>>> Then just add more enum values.
>>
>> That would be really messy because PAT index is platform dependent,
>> you would have to maintain many tables for the the translation.
>>
>>> 'pat_index' is absolutely meaningless to the reader, it's just an
>>> arbitrary number. Whereas 'cache_level' conveys how the thing is
>>> actually going to get used and thus how the caches should behave.
>>
>> By design UMD's understand PAT index. Both UMD and KMD should stand on
>> the same ground, the Bspec, to avoid any potential ambiguity.
>>
>>>> In addition, the PAT index is platform dependent, having to
>>>> translate between i915_cache_level and PAT index is not reliable,
>>>
>>> If it's not realiable then the code is clearly broken.
>>
>> Perhaps the word "reliable" is a bit confusing here. What I really
>> meant to say is 'difficult to maintain', or 'error-prone'.
>>
>>>> and makes the code more complicated.
>>>
>>> You have to translate somewhere anyway. Looks like you're now adding
>>> translations the other way (pat_index->cache_level). How is that better?
>>
>> No, there is no pat_index->cache_level translation.
>
> i915_gem_object_has_cache_level() is exactly that. And that one does look
> actually fragile since it assumes only one PAT index maps to each cache
> level. So if the user picks any other pat_index anything using
> i915_gem_object_has_cache_level() is likely to do the wrong thing.

That is still one way transaltion, from cache_level to pat_index.
The cache_level is only a KMD concept now. And inside the KMD, we have one
table to translate between cache_level to pat_index. Only KMD would be able
to trigger a comparison on pat_index for a KMD allocated BO.
User is not allowed to set pat_index dynamically any more. By design the cahce
setting for user space BO's should be immutable. That's why even the set caching
ioctl has been killed (from MTL onward).

> If we do switch to pat_index then I think cache_level should be made a
> purely uapi concept,

UMD's directly use pat_index because they are supposed to follow the b-spec.
The abstracted cache_level is no longer exposed to user space.

-Fei

> and all the internal code should instead be made to
> query various aspects of the caching behaviour of the current pat_index
> (eg. is LLC caching enabled, and thus do I need to clflush?).
>
> --
> Ville Syrjälä
> Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-03 19:39         ` Yang, Fei
@ 2023-04-03 19:52           ` Ville Syrjälä
  2023-04-06  6:28             ` Yang, Fei
  0 siblings, 1 reply; 35+ messages in thread
From: Ville Syrjälä @ 2023-04-03 19:52 UTC (permalink / raw)
  To: Yang, Fei; +Cc: Chris Wilson, intel-gfx, Roper, Matthew D, dri-devel

On Mon, Apr 03, 2023 at 07:39:37PM +0000, Yang, Fei wrote:
> >Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of cache_level
> >
> >On Mon, Apr 03, 2023 at 04:57:21PM +0000, Yang, Fei wrote:
> >>> Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of
> >>> cache_level
> >>>
> >>> On Fri, Mar 31, 2023 at 11:38:28PM -0700, fei.yang@intel.com wrote:
> >>>> From: Fei Yang <fei.yang@intel.com>
> >>>>
> >>>> Currently the KMD is using enum i915_cache_level to set caching
> >>>> policy for buffer objects. This is flaky because the PAT index
> >>>> which really controls the caching behavior in PTE has far more
> >>>> levels than what's defined in the enum.
> >>>
> >>> Then just add more enum values.
> >>
> >> That would be really messy because PAT index is platform dependent,
> >> you would have to maintain many tables for the the translation.
> >>
> >>> 'pat_index' is absolutely meaningless to the reader, it's just an
> >>> arbitrary number. Whereas 'cache_level' conveys how the thing is
> >>> actually going to get used and thus how the caches should behave.
> >>
> >> By design UMD's understand PAT index. Both UMD and KMD should stand on
> >> the same ground, the Bspec, to avoid any potential ambiguity.
> >>
> >>>> In addition, the PAT index is platform dependent, having to
> >>>> translate between i915_cache_level and PAT index is not reliable,
> >>>
> >>> If it's not realiable then the code is clearly broken.
> >>
> >> Perhaps the word "reliable" is a bit confusing here. What I really
> >> meant to say is 'difficult to maintain', or 'error-prone'.
> >>
> >>>> and makes the code more complicated.
> >>>
> >>> You have to translate somewhere anyway. Looks like you're now adding
> >>> translations the other way (pat_index->cache_level). How is that better?
> >>
> >> No, there is no pat_index->cache_level translation.
> >
> > i915_gem_object_has_cache_level() is exactly that. And that one does look
> > actually fragile since it assumes only one PAT index maps to each cache
> > level. So if the user picks any other pat_index anything using
> > i915_gem_object_has_cache_level() is likely to do the wrong thing.
> 
> That is still one way transaltion, from cache_level to pat_index.

Not really. The actual input to the thing is obj->pat_index.
And as stated, the whole thing is simply broken whenever
obj->pat_index isn't one of the magic numbers that you get
back from i915_gem_get_pat_index().

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation fei.yang
  2023-04-03 16:02   ` Ville Syrjälä
@ 2023-04-04  7:29   ` Lionel Landwerlin
  2023-04-04 16:04     ` Yang, Fei
  2023-04-06  9:11   ` Matthew Auld
  2 siblings, 1 reply; 35+ messages in thread
From: Lionel Landwerlin @ 2023-04-04  7:29 UTC (permalink / raw)
  To: fei.yang, intel-gfx; +Cc: Chris Wilson, Matt Roper, dri-devel

On 01/04/2023 09:38, fei.yang@intel.com wrote:
> From: Fei Yang <fei.yang@intel.com>
>
> To comply with the design that buffer objects shall have immutable
> cache setting through out its life cycle, {set, get}_caching ioctl's
> are no longer supported from MTL onward. With that change caching
> policy can only be set at object creation time. The current code
> applies a default (platform dependent) cache setting for all objects.
> However this is not optimal for performance tuning. The patch extends
> the existing gem_create uAPI to let user set PAT index for the object
> at creation time.
> The new extension is platform independent, so UMD's can switch to using
> this extension for older platforms as well, while {set, get}_caching are
> still supported on these legacy paltforms for compatibility reason.
>
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Signed-off-by: Fei Yang <fei.yang@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>


Just like the protected content uAPI, there is no way for userspace to 
tell this feature is available other than trying using it.

Given the issues with protected content, is it not thing we could want 
to add?


Thanks,


-Lionel


> ---
>   drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
>   include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
>   tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
>   3 files changed, 105 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> index e76c9703680e..1c6e2034d28e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> @@ -244,6 +244,7 @@ struct create_ext {
>   	unsigned int n_placements;
>   	unsigned int placement_mask;
>   	unsigned long flags;
> +	unsigned int pat_index;
>   };
>   
>   static void repr_placements(char *buf, size_t size,
> @@ -393,11 +394,39 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
>   	return 0;
>   }
>   
> +static int ext_set_pat(struct i915_user_extension __user *base, void *data)
> +{
> +	struct create_ext *ext_data = data;
> +	struct drm_i915_private *i915 = ext_data->i915;
> +	struct drm_i915_gem_create_ext_set_pat ext;
> +	unsigned int max_pat_index;
> +
> +	BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
> +		     offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
> +
> +	if (copy_from_user(&ext, base, sizeof(ext)))
> +		return -EFAULT;
> +
> +	max_pat_index = INTEL_INFO(i915)->max_pat_index;
> +
> +	if (ext.pat_index > max_pat_index) {
> +		drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
> +			ext.pat_index);
> +		return -EINVAL;
> +	}
> +
> +	ext_data->pat_index = ext.pat_index;
> +
> +	return 0;
> +}
> +
>   static const i915_user_extension_fn create_extensions[] = {
>   	[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
>   	[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
> +	[I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
>   };
>   
> +#define PAT_INDEX_NOT_SET	0xffff
>   /**
>    * Creates a new mm object and returns a handle to it.
>    * @dev: drm device pointer
> @@ -417,6 +446,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
>   	if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
>   		return -EINVAL;
>   
> +	ext_data.pat_index = PAT_INDEX_NOT_SET;
>   	ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
>   				   create_extensions,
>   				   ARRAY_SIZE(create_extensions),
> @@ -453,5 +483,8 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
>   	if (IS_ERR(obj))
>   		return PTR_ERR(obj);
>   
> +	if (ext_data.pat_index != PAT_INDEX_NOT_SET)
> +		i915_gem_object_set_pat_index(obj, ext_data.pat_index);
> +
>   	return i915_gem_publish(obj, file, &args->size, &args->handle);
>   }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index dba7c5a5b25e..03c5c314846e 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
>   	 *
>   	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
>   	 * struct drm_i915_gem_create_ext_protected_content.
> +	 *
> +	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
> +	 * struct drm_i915_gem_create_ext_set_pat.
>   	 */
>   #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>   #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> +#define I915_GEM_CREATE_EXT_SET_PAT 2
>   	__u64 extensions;
>   };
>   
> @@ -3747,6 +3751,38 @@ struct drm_i915_gem_create_ext_protected_content {
>   	__u32 flags;
>   };
>   
> +/**
> + * struct drm_i915_gem_create_ext_set_pat - The
> + * I915_GEM_CREATE_EXT_SET_PAT extension.
> + *
> + * If this extension is provided, the specified caching policy (PAT index) is
> + * applied to the buffer object.
> + *
> + * Below is an example on how to create an object with specific caching policy:
> + *
> + * .. code-block:: C
> + *
> + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> + *              .pat_index = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = PAGE_SIZE,
> + *              .extensions = (uintptr_t)&set_pat_ext,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + */
> +struct drm_i915_gem_create_ext_set_pat {
> +	/** @base: Extension link. See struct i915_user_extension. */
> +	struct i915_user_extension base;
> +	/** @pat_index: PAT index to be set */
> +	__u32 pat_index;
> +	/** @rsvd: reserved for future use */
> +	__u32 rsvd;
> +};
> +
>   /* ID of the protected content session managed by i915 when PXP is active */
>   #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>   
> diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
> index 8df261c5ab9b..8cdcdb5fac26 100644
> --- a/tools/include/uapi/drm/i915_drm.h
> +++ b/tools/include/uapi/drm/i915_drm.h
> @@ -3607,9 +3607,13 @@ struct drm_i915_gem_create_ext {
>   	 *
>   	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
>   	 * struct drm_i915_gem_create_ext_protected_content.
> +	 *
> +	 * For I915_GEM_CREATE_EXT_SET_PAT usage see
> +	 * struct drm_i915_gem_create_ext_set_pat.
>   	 */
>   #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>   #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> +#define I915_GEM_CREATE_EXT_SET_PAT 2
>   	__u64 extensions;
>   };
>   
> @@ -3724,6 +3728,38 @@ struct drm_i915_gem_create_ext_protected_content {
>   	__u32 flags;
>   };
>   
> +/**
> + * struct drm_i915_gem_create_ext_set_pat - The
> + * I915_GEM_CREATE_EXT_SET_PAT extension.
> + *
> + * If this extension is provided, the specified caching policy (PAT index) is
> + * applied to the buffer object.
> + *
> + * Below is an example on how to create an object with specific caching policy:
> + *
> + * .. code-block:: C
> + *
> + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> + *              .pat_index = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = PAGE_SIZE,
> + *              .extensions = (uintptr_t)&set_pat_ext,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + */
> +struct drm_i915_gem_create_ext_set_pat {
> +	/** @base: Extension link. See struct i915_user_extension. */
> +	struct i915_user_extension base;
> +	/** @pat_index: PAT index to be set */
> +	__u32 pat_index;
> +	/** @rsvd: reserved for future use */
> +	__u32 rsvd;
> +};
> +
>   /* ID of the protected content session managed by i915 when PXP is active */
>   #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>   



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-04  7:29   ` Lionel Landwerlin
@ 2023-04-04 16:04     ` Yang, Fei
  2023-04-05  7:45       ` Lionel Landwerlin
  0 siblings, 1 reply; 35+ messages in thread
From: Yang, Fei @ 2023-04-04 16:04 UTC (permalink / raw)
  To: Landwerlin, Lionel G, intel-gfx; +Cc: Chris Wilson, Roper, Matthew D, dri-devel

> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
>
> On 01/04/2023 09:38, fei.yang@intel.com wrote:
>> From: Fei Yang <fei.yang@intel.com>
>>
>> To comply with the design that buffer objects shall have immutable
>> cache setting through out its life cycle, {set, get}_caching ioctl's
>> are no longer supported from MTL onward. With that change caching
>> policy can only be set at object creation time. The current code
>> applies a default (platform dependent) cache setting for all objects.
>> However this is not optimal for performance tuning. The patch extends
>> the existing gem_create uAPI to let user set PAT index for the object
>> at creation time.
>> The new extension is platform independent, so UMD's can switch to
>> using this extension for older platforms as well, while {set,
>> get}_caching are still supported on these legacy paltforms for compatibility reason.
>>
>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
>> Cc: Matt Roper <matthew.d.roper@intel.com>
>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
>
>
> Just like the protected content uAPI, there is no way for userspace to tell
> this feature is available other than trying using it.
>
> Given the issues with protected content, is it not thing we could want to add?

Sorry I'm not aware of the issues with protected content, could you elaborate?
There was a long discussion on teams uAPI channel, could you comment there if
any concerns?

https://teams.microsoft.com/l/message/19:f1767bda6734476ba0a9c7d147b928d1@thread.skype/1675860924675?tenantId=46c98d88-e344-4ed4-8496-4ed7712e255d&groupId=379f3ae1-d138-4205-bb65-d4c7d38cb481&parentMessageId=1675860924675&teamName=GSE%20OSGC&channelName=i915%20uAPI%20changes&createdTime=1675860924675&allowXTenantAccess=false

Thanks,
-Fei

>Thanks,
>
>-Lionel
>
>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
>>   tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
>>   3 files changed, 105 insertions(+)
>>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-03 16:48       ` Ville Syrjälä
@ 2023-04-04 22:15         ` Kenneth Graunke
  0 siblings, 0 replies; 35+ messages in thread
From: Kenneth Graunke @ 2023-04-04 22:15 UTC (permalink / raw)
  To: Matt Roper, intel-gfx, Chris Wilson; +Cc: dri-devel

[-- Attachment #1: Type: text/plain, Size: 4040 bytes --]

On Monday, April 3, 2023 9:48:40 AM PDT Ville Syrjälä wrote:
> On Mon, Apr 03, 2023 at 09:35:32AM -0700, Matt Roper wrote:
> > On Mon, Apr 03, 2023 at 07:02:08PM +0300, Ville Syrjälä wrote:
> > > On Fri, Mar 31, 2023 at 11:38:30PM -0700, fei.yang@intel.com wrote:
> > > > From: Fei Yang <fei.yang@intel.com>
> > > > 
> > > > To comply with the design that buffer objects shall have immutable
> > > > cache setting through out its life cycle, {set, get}_caching ioctl's
> > > > are no longer supported from MTL onward. With that change caching
> > > > policy can only be set at object creation time. The current code
> > > > applies a default (platform dependent) cache setting for all objects.
> > > > However this is not optimal for performance tuning. The patch extends
> > > > the existing gem_create uAPI to let user set PAT index for the object
> > > > at creation time.
> > > 
> > > This is missing the whole justification for the new uapi.
> > > Why is MOCS not sufficient?
> > 
> > PAT and MOCS are somewhat related, but they're not the same thing.  The
> > general direction of the hardware architecture recently has been to
> > slowly dumb down MOCS and move more of the important memory/cache
> > control over to the PAT instead.  On current platforms there is some
> > overlap (and MOCS has an "ignore PAT" setting that makes the MOCS "win"
> > for the specific fields that both can control), but MOCS doesn't have a
> > way to express things like snoop/coherency mode (on MTL), or class of
> > service (on PVC).  And if you check some of the future platforms, the
> > hardware design starts packing even more stuff into the PAT (not just
> > cache behavior) which will never be handled by MOCS.
> 
> Sigh. So the hardware designers screwed up MOCS yet again and
> instead of getting that fixed we are adding a new uapi to work
> around it?
> 
> The IMO sane approach (which IIRC was the situation for a few
> platform generations at least) is that you just shove the PAT
> index into MOCS (or tell it to go look it up from the PTE).
> Why the heck did they not just stick with that?

There are actually some use cases in newer APIs where MOCS doesn't
work well.  For example, VK_KHR_buffer_device_address in Vulkan 1.2:

https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_KHR_buffer_device_address.html

It essentially adds "pointers to buffer memory in shaders", where apps
can just get a 64-bit pointer, and use it with an address.  On our EUs,
that turns into A64 data port messages which refer directly to memory.
Notably, there's no descriptor (i.e. SURFACE_STATE) where you could
stuff a MOCS value.  So, you get one single MOCS entry for all such
buffers...which is specified in STATE_BASE_ADDRESS.  Hope you wanted
all of them to have the same cache & coherency settings!

With PAT/PTE, we can at least specify settings for each buffer, rather
than one global setting.

Compression has also been moving towards virtual address-based solutions
and handling in the caches and memory controller, rather than in e.g.
the sampler reading SURFACE_STATE.  (It started evolving that way with
Tigerlake, really, but continues.)

> > Also keep in mind that MOCS generally applies at the GPU instruction
> > level; although a lot of instructions have a field to provide a MOCS
> > index, or can use a MOCS already associated with a surface state, there
> > are still some that don't. PAT is the source of memory access
> > characteristics for anything that can't provide a MOCS directly.
> 
> So what are the things that don't have MOCS and where we need
> some custom cache behaviour, and we already know all that at
> buffer creation time?

For Meteorlake...we have MOCS for cache settings.  We only need to use
PAT for coherency settings; I believe we can get away with deciding that
up-front at buffer creation time.  If we were doing full cacheability,
I'd be very nervous about deciding performance tuning at creation time.

--Ken

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-04 16:04     ` Yang, Fei
@ 2023-04-05  7:45       ` Lionel Landwerlin
  2023-04-05 20:26         ` Jordan Justen
  2023-04-05 23:06         ` Yang, Fei
  0 siblings, 2 replies; 35+ messages in thread
From: Lionel Landwerlin @ 2023-04-05  7:45 UTC (permalink / raw)
  To: Yang, Fei, intel-gfx; +Cc: Chris Wilson, Roper, Matthew D, dri-devel

On 04/04/2023 19:04, Yang, Fei wrote:
>> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
>>
>> On 01/04/2023 09:38, fei.yang@intel.com wrote:
>>> From: Fei Yang <fei.yang@intel.com>
>>>
>>> To comply with the design that buffer objects shall have immutable
>>> cache setting through out its life cycle, {set, get}_caching ioctl's
>>> are no longer supported from MTL onward. With that change caching
>>> policy can only be set at object creation time. The current code
>>> applies a default (platform dependent) cache setting for all objects.
>>> However this is not optimal for performance tuning. The patch extends
>>> the existing gem_create uAPI to let user set PAT index for the object
>>> at creation time.
>>> The new extension is platform independent, so UMD's can switch to
>>> using this extension for older platforms as well, while {set,
>>> get}_caching are still supported on these legacy paltforms for compatibility reason.
>>>
>>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>>> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
>>
>> Just like the protected content uAPI, there is no way for userspace to tell
>> this feature is available other than trying using it.
>>
>> Given the issues with protected content, is it not thing we could want to add?
> Sorry I'm not aware of the issues with protected content, could you elaborate?
> There was a long discussion on teams uAPI channel, could you comment there if
> any concerns?
>
> https://teams.microsoft.com/l/message/19:f1767bda6734476ba0a9c7d147b928d1@thread.skype/1675860924675?tenantId=46c98d88-e344-4ed4-8496-4ed7712e255d&groupId=379f3ae1-d138-4205-bb65-d4c7d38cb481&parentMessageId=1675860924675&teamName=GSE%20OSGC&channelName=i915%20uAPI%20changes&createdTime=1675860924675&allowXTenantAccess=false
>
> Thanks,
> -Fei


We wanted to have a getparam to detect protected support and were told 
to detect it by trying to create a context with it.

Now it appears trying to create a protected context can block for 
several seconds.

Since we have to report capabilities to the user even before it creates 
protected contexts, any app is at risk of blocking.


-Lionel


>
>> Thanks,
>>
>> -Lionel
>>
>>
>>> ---
>>>    drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
>>>    include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
>>>    tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
>>>    3 files changed, 105 insertions(+)
>>>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-05  7:45       ` Lionel Landwerlin
@ 2023-04-05 20:26         ` Jordan Justen
  2023-04-10  8:23           ` Jordan Justen
  2023-04-05 23:06         ` Yang, Fei
  1 sibling, 1 reply; 35+ messages in thread
From: Jordan Justen @ 2023-04-05 20:26 UTC (permalink / raw)
  To: Yang, Fei, Lionel Landwerlin, intel-gfx
  Cc: Roper, Matthew D, Chris Wilson, dri-devel

On 2023-04-05 00:45:24, Lionel Landwerlin wrote:
> On 04/04/2023 19:04, Yang, Fei wrote:
> >> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
> >>
> >> Just like the protected content uAPI, there is no way for userspace to tell
> >> this feature is available other than trying using it.
> >>
> >> Given the issues with protected content, is it not thing we could want to add?
> > Sorry I'm not aware of the issues with protected content, could you elaborate?
> > There was a long discussion on teams uAPI channel, could you comment there if
> > any concerns?
> >
> 
> We wanted to have a getparam to detect protected support and were told 
> to detect it by trying to create a context with it.
> 

An extensions system where the detection mechanism is "just try it",
and assume it's not supported if it fails. ??

This seem likely to get more and more problematic as a detection
mechanism as more extensions are added.

> 
> Now it appears trying to create a protected context can block for 
> several seconds.
> 
> Since we have to report capabilities to the user even before it creates 
> protected contexts, any app is at risk of blocking.
> 

This failure path is not causing any re-thinking about using this as
the extension detection mechanism?

Doesn't the ioctl# + input-struct-size + u64-extension# identify the
extension such that the kernel could indicate if it is supported or
not. (Or, perhaps return an array of the supported extensions so the
umd doesn't have to potentially make many ioctls for each extension of
interest.)

-Jordan

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-05  7:45       ` Lionel Landwerlin
  2023-04-05 20:26         ` Jordan Justen
@ 2023-04-05 23:06         ` Yang, Fei
  1 sibling, 0 replies; 35+ messages in thread
From: Yang, Fei @ 2023-04-05 23:06 UTC (permalink / raw)
  To: Landwerlin, Lionel G, intel-gfx; +Cc: Chris Wilson, Roper, Matthew D, dri-devel

>Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
>
>On 04/04/2023 19:04, Yang, Fei wrote:
>>> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set
>>> cache at BO creation
>>>
>>> On 01/04/2023 09:38, fei.yang@intel.com wrote:
>>>> From: Fei Yang <fei.yang@intel.com>
>>>>
>>>> To comply with the design that buffer objects shall have immutable
>>>> cache setting through out its life cycle, {set, get}_caching ioctl's
>>>> are no longer supported from MTL onward. With that change caching
>>>> policy can only be set at object creation time. The current code
>>>> applies a default (platform dependent) cache setting for all objects.
>>>> However this is not optimal for performance tuning. The patch
>>>> extends the existing gem_create uAPI to let user set PAT index for
>>>> the object at creation time.
>>>> The new extension is platform independent, so UMD's can switch to
>>>> using this extension for older platforms as well, while {set,
>>>> get}_caching are still supported on these legacy paltforms for
>>>> compatibility reason.
>>>>
>>>> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
>>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>>>> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
>>>
>>> Just like the protected content uAPI, there is no way for userspace
>>> to tell this feature is available other than trying using it.
>>>
>>> Given the issues with protected content, is it not thing we could want to add?
>> Sorry I'm not aware of the issues with protected content, could you elaborate?
>> There was a long discussion on teams uAPI channel, could you comment
>> there if any concerns?
>>
>> https://teams.microsoft.com/l/message/19:f1767bda6734476ba0a9c7d147b92
>> 8d1@thread.skype/1675860924675?tenantId=46c98d88-e344-4ed4-8496-4ed771
>> 2e255d&groupId=379f3ae1-d138-4205-bb65-d4c7d38cb481&parentMessageId=16
>> 75860924675&teamName=GSE%20OSGC&channelName=i915%20uAPI%20changes&crea
>> tedTime=1675860924675&allowXTenantAccess=false
>>
>> Thanks,
>> -Fei
>
>
> We wanted to have a getparam to detect protected support and were told
> to detect it by trying to create a context with it.
>
> Now it appears trying to create a protected context can block for several
> seconds.
>
> Since we have to report capabilities to the user even before it creates
> protected contexts, any app is at risk of blocking.

Can we detect this capability by creating a buffer object? This extension is
not blocking, it just provide a way to set caching policy, and should complete
very fast. There is a IGT test I created for this extension (not merged yet),
please take a look at http://intel-gfx-pw.fi.intel.com/series/19149/

I'm not familiar with getparam, will take a look there as well. But I think it
would be easier just create an object.

-Fei

>-Lionel
>
>
>>
>>> Thanks,
>>>
>>> -Lionel
>>>
>>>
>>>> ---
>>>>    drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
>>>>    include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
>>>>    tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
>>>>    3 files changed, 105 insertions(+)
>>>>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level
  2023-04-03 19:52           ` Ville Syrjälä
@ 2023-04-06  6:28             ` Yang, Fei
  0 siblings, 0 replies; 35+ messages in thread
From: Yang, Fei @ 2023-04-06  6:28 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Chris Wilson, intel-gfx, Roper, Matthew D, dri-devel

> On Mon, Apr 03, 2023 at 07:39:37PM +0000, Yang, Fei wrote:
>>> Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of cache_level
>>>
>>> On Mon, Apr 03, 2023 at 04:57:21PM +0000, Yang, Fei wrote:
>>>>> Subject: Re: [PATCH 5/7] drm/i915: use pat_index instead of
>>>>> cache_level
>>>>>
>>>>> On Fri, Mar 31, 2023 at 11:38:28PM -0700, fei.yang@intel.com wrote:
>>>>>> From: Fei Yang <fei.yang@intel.com>
>>>>>>
>>>>>> Currently the KMD is using enum i915_cache_level to set caching
>>>>>> policy for buffer objects. This is flaky because the PAT index
>>>>>> which really controls the caching behavior in PTE has far more
>>>>>> levels than what's defined in the enum.
>>>>>
>>>>> Then just add more enum values.
>>>>
>>>> That would be really messy because PAT index is platform dependent,
>>>> you would have to maintain many tables for the the translation.
>>>>
>>>>> 'pat_index' is absolutely meaningless to the reader, it's just an
>>>>> arbitrary number. Whereas 'cache_level' conveys how the thing is
>>>>> actually going to get used and thus how the caches should behave.
>>>>
>>>> By design UMD's understand PAT index. Both UMD and KMD should stand
>>>> on the same ground, the Bspec, to avoid any potential ambiguity.
>>>>
>>>>>> In addition, the PAT index is platform dependent, having to
>>>>>> translate between i915_cache_level and PAT index is not reliable,
>>>>>
>>>>> If it's not realiable then the code is clearly broken.
>>>>
>>>> Perhaps the word "reliable" is a bit confusing here. What I really
>>>> meant to say is 'difficult to maintain', or 'error-prone'.
>>>>
>>>>>> and makes the code more complicated.
>>>>>
>>>>> You have to translate somewhere anyway. Looks like you're now
>>>>> adding translations the other way (pat_index->cache_level). How is that better?
>>>>
>>>> No, there is no pat_index->cache_level translation.
>>>
>>> i915_gem_object_has_cache_level() is exactly that. And that one does
>>> look actually fragile since it assumes only one PAT index maps to
>>> each cache level. So if the user picks any other pat_index anything
>>> using
>>> i915_gem_object_has_cache_level() is likely to do the wrong thing.
>>
>> That is still one way transaltion, from cache_level to pat_index.
>
> Not really. The actual input to the thing is obj->pat_index.
> And as stated, the whole thing is simply broken whenever
> obj->pat_index isn't one of the magic numbers that you get
> back from i915_gem_get_pat_index().

I proposed a patch for diic which is directly applicable to drm-tip as well.
Could you review http://intel-gfx-pw.fi.intel.com/series/19405/ and let me
know if that would address your concern here?

-Fei

> --
> Ville Syrjälä
> Intel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-03 12:50   ` Jani Nikula
@ 2023-04-06  8:16     ` Andi Shyti
  2023-04-06 18:22       ` Yang, Fei
  0 siblings, 1 reply; 35+ messages in thread
From: Andi Shyti @ 2023-04-06  8:16 UTC (permalink / raw)
  To: Jani Nikula; +Cc: Lucas De Marchi, intel-gfx, Matt Roper, dri-devel

Hi Fei,

On Mon, Apr 03, 2023 at 03:50:26PM +0300, Jani Nikula wrote:
> On Fri, 31 Mar 2023, fei.yang@intel.com wrote:
> > From: Fei Yang <fei.yang@intel.com>
> >
> > On MTL, GT can no longer allocate on LLC - only the CPU can.
> > This, along with addition of support for ADM/L4 cache calls a
> > MOCS/PAT table update.
> > Also add PTE encode functions for MTL as it has different PAT
> > index definition than previous platforms.
> 
> As a general observation, turning something into a function pointer and
> extending it to more platforms should be two separate changes.

Agree with Jani. Fei, would you mind splitting this patch? It
eases the review, as well.

Thanks,
Andi

> BR,
> Jani.
> 
> >
> > BSpec: 44509, 45101, 44235
> >
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
> > Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> > Signed-off-by: Fei Yang <fei.yang@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
> >  drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
> >  drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
> >  drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
> >  drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
> >  drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
> >  drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
> >  drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
> >  drivers/gpu/drm/i915/i915_pci.c          |  1 +
> >  9 files changed, 189 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
> > index b8027392144d..c5eacfdba1a5 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> > @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
> >  	vm->vma_ops.bind_vma    = dpt_bind_vma;
> >  	vm->vma_ops.unbind_vma  = dpt_unbind_vma;
> >  
> > -	vm->pte_encode = gen8_ggtt_pte_encode;
> > +	vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
> >  
> >  	dpt->obj = dpt_obj;
> >  	dpt->obj->is_dpt = true;
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > index 4daaa6f55668..4197b43150cc 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
> >  	return pte;
> >  }
> >  
> > +static u64 mtl_pte_encode(dma_addr_t addr,
> > +			  enum i915_cache_level level,
> > +			  u32 flags)
> > +{
> > +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
> > +
> > +	if (unlikely(flags & PTE_READ_ONLY))
> > +		pte &= ~GEN8_PAGE_RW;
> > +
> > +	if (flags & PTE_LM)
> > +		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
> > +
> > +	switch (level) {
> > +	case I915_CACHE_NONE:
> > +		pte |= GEN12_PPGTT_PTE_PAT1;
> > +		break;
> > +	case I915_CACHE_LLC:
> > +	case I915_CACHE_L3_LLC:
> > +		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
> > +		break;
> > +	case I915_CACHE_WT:
> > +		pte |= GEN12_PPGTT_PTE_PAT0;
> > +		break;
> > +	}
> > +
> > +	return pte;
> > +}
> > +
> >  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
> >  {
> >  	struct drm_i915_private *i915 = ppgtt->vm.i915;
> > @@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
> >  		      u32 flags)
> >  {
> >  	struct i915_page_directory *pd;
> > -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> > +	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
> >  	gen8_pte_t *vaddr;
> >  
> >  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> > @@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
> >  				   enum i915_cache_level cache_level,
> >  				   u32 flags)
> >  {
> > -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> > +	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
> >  	unsigned int rem = sg_dma_len(iter->sg);
> >  	u64 start = vma_res->start;
> >  
> > @@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
> >  	GEM_BUG_ON(pt->is_compact);
> >  
> >  	vaddr = px_vaddr(pt);
> > -	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
> > +	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
> >  	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
> >  }
> >  
> > @@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
> >  	}
> >  
> >  	vaddr = px_vaddr(pt);
> > -	vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
> > +	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
> >  }
> >  
> >  static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
> > @@ -820,7 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
> >  		pte_flags |= PTE_LM;
> >  
> >  	vm->scratch[0]->encode =
> > -		gen8_pte_encode(px_dma(vm->scratch[0]),
> > +		vm->pte_encode(px_dma(vm->scratch[0]),
> >  				I915_CACHE_NONE, pte_flags);
> >  
> >  	for (i = 1; i <= vm->top; i++) {
> > @@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
> >  	 */
> >  	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
> >  
> > -	ppgtt->vm.pte_encode = gen8_pte_encode;
> > +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
> > +		ppgtt->vm.pte_encode = mtl_pte_encode;
> > +	else
> > +		ppgtt->vm.pte_encode = gen8_pte_encode;
> >  
> >  	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
> >  	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> > index f541d19264b4..6b8ce7f4d25a 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> > @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
> >  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> >  			 enum i915_cache_level level,
> >  			 u32 flags);
> > +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> > +			unsigned int pat_index,
> > +			u32 flags);
> >  
> >  #endif
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index 3c7f1ed92f5b..ba3109338aee 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
> >  	}
> >  }
> >  
> > +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> > +			enum i915_cache_level level,
> > +			u32 flags)
> > +{
> > +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> > +
> > +	GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
> > +
> > +	if (flags & PTE_LM)
> > +		pte |= GEN12_GGTT_PTE_LM;
> > +
> > +	switch (level) {
> > +	case I915_CACHE_NONE:
> > +		pte |= MTL_GGTT_PTE_PAT1;
> > +		break;
> > +	case I915_CACHE_LLC:
> > +	case I915_CACHE_L3_LLC:
> > +		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
> > +		break;
> > +	case I915_CACHE_WT:
> > +		pte |= MTL_GGTT_PTE_PAT0;
> > +		break;
> > +	}
> > +
> > +	return pte;
> > +}
> > +
> >  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> >  			 enum i915_cache_level level,
> >  			 u32 flags)
> > @@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
> >  	gen8_pte_t __iomem *pte =
> >  		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
> >  
> > -	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
> > +	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
> >  
> >  	ggtt->invalidate(ggtt);
> >  }
> > @@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
> >  				     enum i915_cache_level level,
> >  				     u32 flags)
> >  {
> > -	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
> >  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> > +	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
> >  	gen8_pte_t __iomem *gte;
> >  	gen8_pte_t __iomem *end;
> >  	struct sgt_iter iter;
> > @@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
> >  	ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
> >  	ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
> >  
> > -	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
> > +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> > +		ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
> > +	else
> > +		ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
> >  
> >  	return ggtt_probe_common(ggtt, size);
> >  }
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > index 4f436ba7a3c8..1e1b34e22cf5 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
> >  	}
> >  }
> >  
> > +static void mtl_setup_private_ppat(struct intel_uncore *uncore)
> > +{
> > +	intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
> > +			   MTL_PPAT_L4_0_WB);
> > +	intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
> > +			   MTL_PPAT_L4_1_WT);
> > +	intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
> > +			   MTL_PPAT_L4_3_UC);
> > +	intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
> > +			   MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
> > +	intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
> > +			   MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
> > +
> > +	/*
> > +	 * Remaining PAT entries are left at the hardware-default
> > +	 * fully-cached setting
> > +	 */
> > +}
> > +
> >  static void tgl_setup_private_ppat(struct intel_uncore *uncore)
> >  {
> >  	/* TGL doesn't support LLC or AGE settings */
> > @@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
> >  
> >  	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
> >  
> > -	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
> > +	if (IS_METEORLAKE(i915))
> > +		mtl_setup_private_ppat(uncore);
> > +	else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
> >  		xehp_setup_private_ppat(gt);
> >  	else if (GRAPHICS_VER(i915) >= 12)
> >  		tgl_setup_private_ppat(uncore);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > index 69ce55f517f5..b632167eaf2e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > @@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
> >  #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
> >  #define BYT_PTE_WRITEABLE		REG_BIT(1)
> >  
> > +#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
> >  #define GEN12_PPGTT_PTE_LM	BIT_ULL(11)
> > +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
> > +#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
> > +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
> > +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
> >  
> > -#define GEN12_GGTT_PTE_LM	BIT_ULL(1)
> > +#define GEN12_GGTT_PTE_LM		BIT_ULL(1)
> > +#define MTL_GGTT_PTE_PAT0		BIT_ULL(52)
> > +#define MTL_GGTT_PTE_PAT1		BIT_ULL(53)
> > +#define GEN12_GGTT_PTE_ADDR_MASK	GENMASK_ULL(45, 12)
> > +#define MTL_GGTT_PTE_PAT_MASK		GENMASK_ULL(53, 52)
> >  
> >  #define GEN12_PDE_64K BIT(6)
> >  #define GEN12_PTE_PS64 BIT(8)
> > @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
> >  #define GEN8_PDE_IPS_64K BIT(11)
> >  #define GEN8_PDE_PS_2M   BIT(7)
> >  
> > +#define MTL_PPAT_L4_CACHE_POLICY_MASK	REG_GENMASK(3, 2)
> > +#define MTL_PAT_INDEX_COH_MODE_MASK	REG_GENMASK(1, 0)
> > +#define MTL_PPAT_L4_3_UC	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
> > +#define MTL_PPAT_L4_1_WT	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
> > +#define MTL_PPAT_L4_0_WB	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
> > +#define MTL_3_COH_2W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
> > +#define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
> > +#define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
> > +
> >  enum i915_cache_level;
> >  
> >  struct drm_i915_gem_object;
> > diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
> > index 69b489e8dfed..89570f137b2c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> > @@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
> >  #define LE_COS(value)		((value) << 15)
> >  #define LE_SSE(value)		((value) << 17)
> >  
> > +/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
> > +#define _L4_CACHEABILITY(value)	((value) << 2)
> > +#define IG_PAT(value)		((value) << 8)
> > +
> >  /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
> >  #define L3_ESC(value)		((value) << 0)
> >  #define L3_SCC(value)		((value) << 1)
> > @@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
> >  /* Helper defines */
> >  #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
> >  #define PVC_NUM_MOCS_ENTRIES	3
> > +#define MTL_NUM_MOCS_ENTRIES	16
> >  
> >  /* (e)LLC caching options */
> >  /*
> > @@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
> >  #define L3_2_RESERVED		_L3_CACHEABILITY(2)
> >  #define L3_3_WB			_L3_CACHEABILITY(3)
> >  
> > +/* L4 caching options */
> > +#define L4_0_WB			_L4_CACHEABILITY(0)
> > +#define L4_1_WT			_L4_CACHEABILITY(1)
> > +#define L4_2_RESERVED		_L4_CACHEABILITY(2)
> > +#define L4_3_UC			_L4_CACHEABILITY(3)
> > +
> >  #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
> >  	[__idx] = { \
> >  		.control_value = __control_value, \
> > @@ -416,6 +427,57 @@ static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
> >  	MOCS_ENTRY(2, 0, L3_3_WB),
> >  };
> >  
> > +static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
> > +	/* Error - Reserved for Non-Use */
> > +	MOCS_ENTRY(0,
> > +		   IG_PAT(0),
> > +		   L3_LKUP(1) | L3_3_WB),
> > +	/* Cached - L3 + L4 */
> > +	MOCS_ENTRY(1,
> > +		   IG_PAT(1),
> > +		   L3_LKUP(1) | L3_3_WB),
> > +	/* L4 - GO:L3 */
> > +	MOCS_ENTRY(2,
> > +		   IG_PAT(1),
> > +		   L3_LKUP(1) | L3_1_UC),
> > +	/* Uncached - GO:L3 */
> > +	MOCS_ENTRY(3,
> > +		   IG_PAT(1) | L4_3_UC,
> > +		   L3_LKUP(1) | L3_1_UC),
> > +	/* L4 - GO:Mem */
> > +	MOCS_ENTRY(4,
> > +		   IG_PAT(1),
> > +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> > +	/* Uncached - GO:Mem */
> > +	MOCS_ENTRY(5,
> > +		   IG_PAT(1) | L4_3_UC,
> > +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> > +	/* L4 - L3:NoLKUP; GO:L3 */
> > +	MOCS_ENTRY(6,
> > +		   IG_PAT(1),
> > +		   L3_1_UC),
> > +	/* Uncached - L3:NoLKUP; GO:L3 */
> > +	MOCS_ENTRY(7,
> > +		   IG_PAT(1) | L4_3_UC,
> > +		   L3_1_UC),
> > +	/* L4 - L3:NoLKUP; GO:Mem */
> > +	MOCS_ENTRY(8,
> > +		   IG_PAT(1),
> > +		   L3_GLBGO(1) | L3_1_UC),
> > +	/* Uncached - L3:NoLKUP; GO:Mem */
> > +	MOCS_ENTRY(9,
> > +		   IG_PAT(1) | L4_3_UC,
> > +		   L3_GLBGO(1) | L3_1_UC),
> > +	/* Display - L3; L4:WT */
> > +	MOCS_ENTRY(14,
> > +		   IG_PAT(1) | L4_1_WT,
> > +		   L3_LKUP(1) | L3_3_WB),
> > +	/* CCS - Non-Displayable */
> > +	MOCS_ENTRY(15,
> > +		   IG_PAT(1),
> > +		   L3_GLBGO(1) | L3_1_UC),
> > +};
> > +
> >  enum {
> >  	HAS_GLOBAL_MOCS = BIT(0),
> >  	HAS_ENGINE_MOCS = BIT(1),
> > @@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
> >  	memset(table, 0, sizeof(struct drm_i915_mocs_table));
> >  
> >  	table->unused_entries_index = I915_MOCS_PTE;
> > -	if (IS_PONTEVECCHIO(i915)) {
> > +	if (IS_METEORLAKE(i915)) {
> > +		table->size = ARRAY_SIZE(mtl_mocs_table);
> > +		table->table = mtl_mocs_table;
> > +		table->n_entries = MTL_NUM_MOCS_ENTRIES;
> > +		table->uc_index = 9;
> > +		table->unused_entries_index = 1;
> > +	} else if (IS_PONTEVECCHIO(i915)) {
> >  		table->size = ARRAY_SIZE(pvc_mocs_table);
> >  		table->table = pvc_mocs_table;
> >  		table->n_entries = PVC_NUM_MOCS_ENTRIES;
> > @@ -646,9 +714,9 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
> >  		init_l3cc_table(engine->gt, &table);
> >  }
> >  
> > -static u32 global_mocs_offset(void)
> > +static u32 global_mocs_offset(struct intel_gt *gt)
> >  {
> > -	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
> > +	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + gt->uncore->gsi_offset;
> >  }
> >  
> >  void intel_set_mocs_index(struct intel_gt *gt)
> > @@ -671,7 +739,7 @@ void intel_mocs_init(struct intel_gt *gt)
> >  	 */
> >  	flags = get_mocs_settings(gt->i915, &table);
> >  	if (flags & HAS_GLOBAL_MOCS)
> > -		__init_mocs_table(gt->uncore, &table, global_mocs_offset());
> > +		__init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
> >  
> >  	/*
> >  	 * Initialize the L3CC table as part of mocs initalization to make
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> > index ca009a6a13bd..730796346514 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> > @@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
> >  		return 0;
> >  
> >  	if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
> > -		addr = global_mocs_offset();
> > +		addr = global_mocs_offset(rq->engine->gt);
> >  	else
> >  		addr = mocs_offset(rq->engine);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index 621730b6551c..480b128499ae 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -1149,6 +1149,7 @@ static const struct intel_device_info mtl_info = {
> >  	.has_flat_ccs = 0,
> >  	.has_gmd_id = 1,
> >  	.has_guc_deprivilege = 1,
> > +	.has_llc = 0,
> >  	.has_mslice_steering = 0,
> >  	.has_snoop = 1,
> >  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 1/7] " fei.yang
  2023-04-03 12:50   ` Jani Nikula
@ 2023-04-06  8:28   ` Das, Nirmoy
  2023-04-06 14:55     ` Yang, Fei
  1 sibling, 1 reply; 35+ messages in thread
From: Das, Nirmoy @ 2023-04-06  8:28 UTC (permalink / raw)
  To: fei.yang, intel-gfx; +Cc: Matt Roper, Lucas De Marchi, dri-devel

Hi Fei,

On 4/1/2023 8:38 AM, fei.yang@intel.com wrote:
> From: Fei Yang <fei.yang@intel.com>
>
> On MTL, GT can no longer allocate on LLC - only the CPU can.
> This, along with addition of support for ADM/L4 cache calls a
> MOCS/PAT table update.
> Also add PTE encode functions for MTL as it has different PAT
> index definition than previous platforms.
>
> BSpec: 44509, 45101, 44235
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> Signed-off-by: Fei Yang <fei.yang@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
>   drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
>   drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
>   drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
>   drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
>   drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
>   drivers/gpu/drm/i915/i915_pci.c          |  1 +
>   9 files changed, 189 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
> index b8027392144d..c5eacfdba1a5 100644
> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
>   	vm->vma_ops.bind_vma    = dpt_bind_vma;
>   	vm->vma_ops.unbind_vma  = dpt_unbind_vma;
>   
> -	vm->pte_encode = gen8_ggtt_pte_encode;
> +	vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
>   
>   	dpt->obj = dpt_obj;
>   	dpt->obj->is_dpt = true;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 4daaa6f55668..4197b43150cc 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>   	return pte;
>   }
>   
> +static u64 mtl_pte_encode(dma_addr_t addr,
> +			  enum i915_cache_level level,
> +			  u32 flags)
> +{
> +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
> +
> +	if (unlikely(flags & PTE_READ_ONLY))
> +		pte &= ~GEN8_PAGE_RW;
> +
> +	if (flags & PTE_LM)
> +		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
> +
> +	switch (level) {
> +	case I915_CACHE_NONE:
> +		pte |= GEN12_PPGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_LLC:
> +	case I915_CACHE_L3_LLC:
> +		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_WT:
> +		pte |= GEN12_PPGTT_PTE_PAT0;
> +		break;
> +	}
> +
> +	return pte;
> +}
> +
>   static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
>   {
>   	struct drm_i915_private *i915 = ppgtt->vm.i915;
> @@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>   		      u32 flags)
>   {
>   	struct i915_page_directory *pd;
> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
>   	gen8_pte_t *vaddr;
>   
>   	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> @@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>   				   enum i915_cache_level cache_level,
>   				   u32 flags)
>   {
> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> +	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
>   	unsigned int rem = sg_dma_len(iter->sg);
>   	u64 start = vma_res->start;
>   
> @@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>   	GEM_BUG_ON(pt->is_compact);
>   
>   	vaddr = px_vaddr(pt);
> -	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
> +	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
>   	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
>   }
>   
> @@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>   	}
>   
>   	vaddr = px_vaddr(pt);
> -	vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
> +	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
>   }
>   
>   static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
> @@ -820,7 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>   		pte_flags |= PTE_LM;
>   
>   	vm->scratch[0]->encode =
> -		gen8_pte_encode(px_dma(vm->scratch[0]),
> +		vm->pte_encode(px_dma(vm->scratch[0]),
>   				I915_CACHE_NONE, pte_flags);
>   
>   	for (i = 1; i <= vm->top; i++) {
> @@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>   	 */
>   	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
>   
> -	ppgtt->vm.pte_encode = gen8_pte_encode;
> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
> +		ppgtt->vm.pte_encode = mtl_pte_encode;
> +	else
> +		ppgtt->vm.pte_encode = gen8_pte_encode;
>   
>   	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
>   	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> index f541d19264b4..6b8ce7f4d25a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>   			 enum i915_cache_level level,
>   			 u32 flags);
> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> +			unsigned int pat_index,
> +			u32 flags);
>   
>   #endif
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 3c7f1ed92f5b..ba3109338aee 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>   	}
>   }
>   
> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> +			enum i915_cache_level level,
> +			u32 flags)
> +{
> +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> +
> +	GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
> +
> +	if (flags & PTE_LM)
> +		pte |= GEN12_GGTT_PTE_LM;
> +
> +	switch (level) {
> +	case I915_CACHE_NONE:
> +		pte |= MTL_GGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_LLC:
> +	case I915_CACHE_L3_LLC:
> +		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
> +		break;
> +	case I915_CACHE_WT:
> +		pte |= MTL_GGTT_PTE_PAT0;
> +		break;
> +	}
> +
> +	return pte;
> +}
> +
>   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>   			 enum i915_cache_level level,
>   			 u32 flags)
> @@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>   	gen8_pte_t __iomem *pte =
>   		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>   
> -	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
> +	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
>   
>   	ggtt->invalidate(ggtt);
>   }
> @@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>   				     enum i915_cache_level level,
>   				     u32 flags)
>   {
> -	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
>   	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> +	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
>   	gen8_pte_t __iomem *gte;
>   	gen8_pte_t __iomem *end;
>   	struct sgt_iter iter;
> @@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>   	ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
>   	ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
>   
> -	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
> +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> +		ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
> +	else
> +		ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>   
>   	return ggtt_probe_common(ggtt, size);
>   }
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index 4f436ba7a3c8..1e1b34e22cf5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
>   	}
>   }
>   
> +static void mtl_setup_private_ppat(struct intel_uncore *uncore)
> +{
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
> +			   MTL_PPAT_L4_0_WB);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
> +			   MTL_PPAT_L4_1_WT);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
> +			   MTL_PPAT_L4_3_UC);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
> +			   MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
> +			   MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
> +
> +	/*
> +	 * Remaining PAT entries are left at the hardware-default
> +	 * fully-cached setting
> +	 */
> +}
> +
>   static void tgl_setup_private_ppat(struct intel_uncore *uncore)
>   {
>   	/* TGL doesn't support LLC or AGE settings */
> @@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
>   
>   	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
>   
> -	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
> +	if (IS_METEORLAKE(i915))
> +		mtl_setup_private_ppat(uncore);


Could you please sync this with DII. We should be programming PAT for 
media tile too.

I have refactor this patch in DII along with taking care of media tile 
and I think we should

get those changes here too.


Regards,

Nirmoy

> +	else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
>   		xehp_setup_private_ppat(gt);
>   	else if (GRAPHICS_VER(i915) >= 12)
>   		tgl_setup_private_ppat(uncore);
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 69ce55f517f5..b632167eaf2e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
>   #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
>   #define BYT_PTE_WRITEABLE		REG_BIT(1)
>   
> +#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
>   #define GEN12_PPGTT_PTE_LM	BIT_ULL(11)
> +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
> +#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
> +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
> +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
>   
> -#define GEN12_GGTT_PTE_LM	BIT_ULL(1)
> +#define GEN12_GGTT_PTE_LM		BIT_ULL(1)
> +#define MTL_GGTT_PTE_PAT0		BIT_ULL(52)
> +#define MTL_GGTT_PTE_PAT1		BIT_ULL(53)
> +#define GEN12_GGTT_PTE_ADDR_MASK	GENMASK_ULL(45, 12)
> +#define MTL_GGTT_PTE_PAT_MASK		GENMASK_ULL(53, 52)
>   
>   #define GEN12_PDE_64K BIT(6)
>   #define GEN12_PTE_PS64 BIT(8)
> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
>   #define GEN8_PDE_IPS_64K BIT(11)
>   #define GEN8_PDE_PS_2M   BIT(7)
>   
> +#define MTL_PPAT_L4_CACHE_POLICY_MASK	REG_GENMASK(3, 2)
> +#define MTL_PAT_INDEX_COH_MODE_MASK	REG_GENMASK(1, 0)
> +#define MTL_PPAT_L4_3_UC	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
> +#define MTL_PPAT_L4_1_WT	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
> +#define MTL_PPAT_L4_0_WB	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
> +#define MTL_3_COH_2W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
> +#define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
> +#define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
> +
>   enum i915_cache_level;
>   
>   struct drm_i915_gem_object;
> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
> index 69b489e8dfed..89570f137b2c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> @@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
>   #define LE_COS(value)		((value) << 15)
>   #define LE_SSE(value)		((value) << 17)
>   
> +/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
> +#define _L4_CACHEABILITY(value)	((value) << 2)
> +#define IG_PAT(value)		((value) << 8)
> +
>   /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
>   #define L3_ESC(value)		((value) << 0)
>   #define L3_SCC(value)		((value) << 1)
> @@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
>   /* Helper defines */
>   #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
>   #define PVC_NUM_MOCS_ENTRIES	3
> +#define MTL_NUM_MOCS_ENTRIES	16
>   
>   /* (e)LLC caching options */
>   /*
> @@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
>   #define L3_2_RESERVED		_L3_CACHEABILITY(2)
>   #define L3_3_WB			_L3_CACHEABILITY(3)
>   
> +/* L4 caching options */
> +#define L4_0_WB			_L4_CACHEABILITY(0)
> +#define L4_1_WT			_L4_CACHEABILITY(1)
> +#define L4_2_RESERVED		_L4_CACHEABILITY(2)
> +#define L4_3_UC			_L4_CACHEABILITY(3)
> +
>   #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
>   	[__idx] = { \
>   		.control_value = __control_value, \
> @@ -416,6 +427,57 @@ static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
>   	MOCS_ENTRY(2, 0, L3_3_WB),
>   };
>   
> +static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
> +	/* Error - Reserved for Non-Use */
> +	MOCS_ENTRY(0,
> +		   IG_PAT(0),
> +		   L3_LKUP(1) | L3_3_WB),
> +	/* Cached - L3 + L4 */
> +	MOCS_ENTRY(1,
> +		   IG_PAT(1),
> +		   L3_LKUP(1) | L3_3_WB),
> +	/* L4 - GO:L3 */
> +	MOCS_ENTRY(2,
> +		   IG_PAT(1),
> +		   L3_LKUP(1) | L3_1_UC),
> +	/* Uncached - GO:L3 */
> +	MOCS_ENTRY(3,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_LKUP(1) | L3_1_UC),
> +	/* L4 - GO:Mem */
> +	MOCS_ENTRY(4,
> +		   IG_PAT(1),
> +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> +	/* Uncached - GO:Mem */
> +	MOCS_ENTRY(5,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> +	/* L4 - L3:NoLKUP; GO:L3 */
> +	MOCS_ENTRY(6,
> +		   IG_PAT(1),
> +		   L3_1_UC),
> +	/* Uncached - L3:NoLKUP; GO:L3 */
> +	MOCS_ENTRY(7,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_1_UC),
> +	/* L4 - L3:NoLKUP; GO:Mem */
> +	MOCS_ENTRY(8,
> +		   IG_PAT(1),
> +		   L3_GLBGO(1) | L3_1_UC),
> +	/* Uncached - L3:NoLKUP; GO:Mem */
> +	MOCS_ENTRY(9,
> +		   IG_PAT(1) | L4_3_UC,
> +		   L3_GLBGO(1) | L3_1_UC),
> +	/* Display - L3; L4:WT */
> +	MOCS_ENTRY(14,
> +		   IG_PAT(1) | L4_1_WT,
> +		   L3_LKUP(1) | L3_3_WB),
> +	/* CCS - Non-Displayable */
> +	MOCS_ENTRY(15,
> +		   IG_PAT(1),
> +		   L3_GLBGO(1) | L3_1_UC),
> +};
> +
>   enum {
>   	HAS_GLOBAL_MOCS = BIT(0),
>   	HAS_ENGINE_MOCS = BIT(1),
> @@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
>   	memset(table, 0, sizeof(struct drm_i915_mocs_table));
>   
>   	table->unused_entries_index = I915_MOCS_PTE;
> -	if (IS_PONTEVECCHIO(i915)) {
> +	if (IS_METEORLAKE(i915)) {
> +		table->size = ARRAY_SIZE(mtl_mocs_table);
> +		table->table = mtl_mocs_table;
> +		table->n_entries = MTL_NUM_MOCS_ENTRIES;
> +		table->uc_index = 9;
> +		table->unused_entries_index = 1;
> +	} else if (IS_PONTEVECCHIO(i915)) {
>   		table->size = ARRAY_SIZE(pvc_mocs_table);
>   		table->table = pvc_mocs_table;
>   		table->n_entries = PVC_NUM_MOCS_ENTRIES;
> @@ -646,9 +714,9 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
>   		init_l3cc_table(engine->gt, &table);
>   }
>   
> -static u32 global_mocs_offset(void)
> +static u32 global_mocs_offset(struct intel_gt *gt)
>   {
> -	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
> +	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + gt->uncore->gsi_offset;
>   }
>   
>   void intel_set_mocs_index(struct intel_gt *gt)
> @@ -671,7 +739,7 @@ void intel_mocs_init(struct intel_gt *gt)
>   	 */
>   	flags = get_mocs_settings(gt->i915, &table);
>   	if (flags & HAS_GLOBAL_MOCS)
> -		__init_mocs_table(gt->uncore, &table, global_mocs_offset());
> +		__init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
>   
>   	/*
>   	 * Initialize the L3CC table as part of mocs initalization to make
> diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> index ca009a6a13bd..730796346514 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> @@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
>   		return 0;
>   
>   	if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
> -		addr = global_mocs_offset();
> +		addr = global_mocs_offset(rq->engine->gt);
>   	else
>   		addr = mocs_offset(rq->engine);
>   
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 621730b6551c..480b128499ae 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1149,6 +1149,7 @@ static const struct intel_device_info mtl_info = {
>   	.has_flat_ccs = 0,
>   	.has_gmd_id = 1,
>   	.has_guc_deprivilege = 1,
> +	.has_llc = 0,
>   	.has_mslice_steering = 0,
>   	.has_snoop = 1,
>   	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-01  6:38 ` [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation fei.yang
  2023-04-03 16:02   ` Ville Syrjälä
  2023-04-04  7:29   ` Lionel Landwerlin
@ 2023-04-06  9:11   ` Matthew Auld
  2 siblings, 0 replies; 35+ messages in thread
From: Matthew Auld @ 2023-04-06  9:11 UTC (permalink / raw)
  To: fei.yang; +Cc: Chris Wilson, intel-gfx, Matt Roper, dri-devel

On Sat, 1 Apr 2023 at 07:37, <fei.yang@intel.com> wrote:
>
> From: Fei Yang <fei.yang@intel.com>
>
> To comply with the design that buffer objects shall have immutable
> cache setting through out its life cycle, {set, get}_caching ioctl's
> are no longer supported from MTL onward. With that change caching
> policy can only be set at object creation time. The current code
> applies a default (platform dependent) cache setting for all objects.
> However this is not optimal for performance tuning. The patch extends
> the existing gem_create uAPI to let user set PAT index for the object
> at creation time.
> The new extension is platform independent, so UMD's can switch to using
> this extension for older platforms as well, while {set, get}_caching are
> still supported on these legacy paltforms for compatibility reason.

Do we forbid {set, get}_caching, when combined with this new extension
on the same BO? There is some documentation in @cache_dirty. The
concern is being able to subvert the flush-on-acquire for non-LLC.

>
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Signed-off-by: Fei Yang <fei.yang@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_create.c | 33 ++++++++++++++++++++
>  include/uapi/drm/i915_drm.h                | 36 ++++++++++++++++++++++
>  tools/include/uapi/drm/i915_drm.h          | 36 ++++++++++++++++++++++
>  3 files changed, 105 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> index e76c9703680e..1c6e2034d28e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> @@ -244,6 +244,7 @@ struct create_ext {
>         unsigned int n_placements;
>         unsigned int placement_mask;
>         unsigned long flags;
> +       unsigned int pat_index;
>  };
>
>  static void repr_placements(char *buf, size_t size,
> @@ -393,11 +394,39 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
>         return 0;
>  }
>
> +static int ext_set_pat(struct i915_user_extension __user *base, void *data)
> +{
> +       struct create_ext *ext_data = data;
> +       struct drm_i915_private *i915 = ext_data->i915;
> +       struct drm_i915_gem_create_ext_set_pat ext;
> +       unsigned int max_pat_index;
> +
> +       BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
> +                    offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
> +
> +       if (copy_from_user(&ext, base, sizeof(ext)))
> +               return -EFAULT;
> +
> +       max_pat_index = INTEL_INFO(i915)->max_pat_index;
> +
> +       if (ext.pat_index > max_pat_index) {
> +               drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
> +                       ext.pat_index);
> +               return -EINVAL;
> +       }
> +
> +       ext_data->pat_index = ext.pat_index;
> +
> +       return 0;
> +}
> +
>  static const i915_user_extension_fn create_extensions[] = {
>         [I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
>         [I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
> +       [I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
>  };
>
> +#define PAT_INDEX_NOT_SET      0xffff
>  /**
>   * Creates a new mm object and returns a handle to it.
>   * @dev: drm device pointer
> @@ -417,6 +446,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
>         if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
>                 return -EINVAL;
>
> +       ext_data.pat_index = PAT_INDEX_NOT_SET;
>         ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
>                                    create_extensions,
>                                    ARRAY_SIZE(create_extensions),
> @@ -453,5 +483,8 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
>         if (IS_ERR(obj))
>                 return PTR_ERR(obj);
>
> +       if (ext_data.pat_index != PAT_INDEX_NOT_SET)
> +               i915_gem_object_set_pat_index(obj, ext_data.pat_index);
> +
>         return i915_gem_publish(obj, file, &args->size, &args->handle);
>  }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index dba7c5a5b25e..03c5c314846e 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
>          *
>          * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
>          * struct drm_i915_gem_create_ext_protected_content.
> +        *
> +        * For I915_GEM_CREATE_EXT_SET_PAT usage see
> +        * struct drm_i915_gem_create_ext_set_pat.
>          */
>  #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>  #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> +#define I915_GEM_CREATE_EXT_SET_PAT 2
>         __u64 extensions;
>  };
>
> @@ -3747,6 +3751,38 @@ struct drm_i915_gem_create_ext_protected_content {
>         __u32 flags;
>  };
>
> +/**
> + * struct drm_i915_gem_create_ext_set_pat - The
> + * I915_GEM_CREATE_EXT_SET_PAT extension.
> + *
> + * If this extension is provided, the specified caching policy (PAT index) is
> + * applied to the buffer object.
> + *
> + * Below is an example on how to create an object with specific caching policy:
> + *
> + * .. code-block:: C
> + *
> + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> + *              .pat_index = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = PAGE_SIZE,
> + *              .extensions = (uintptr_t)&set_pat_ext,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + */
> +struct drm_i915_gem_create_ext_set_pat {
> +       /** @base: Extension link. See struct i915_user_extension. */
> +       struct i915_user_extension base;
> +       /** @pat_index: PAT index to be set */
> +       __u32 pat_index;
> +       /** @rsvd: reserved for future use */
> +       __u32 rsvd;
> +};
> +
>  /* ID of the protected content session managed by i915 when PXP is active */
>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>
> diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
> index 8df261c5ab9b..8cdcdb5fac26 100644
> --- a/tools/include/uapi/drm/i915_drm.h
> +++ b/tools/include/uapi/drm/i915_drm.h
> @@ -3607,9 +3607,13 @@ struct drm_i915_gem_create_ext {
>          *
>          * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
>          * struct drm_i915_gem_create_ext_protected_content.
> +        *
> +        * For I915_GEM_CREATE_EXT_SET_PAT usage see
> +        * struct drm_i915_gem_create_ext_set_pat.
>          */
>  #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>  #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
> +#define I915_GEM_CREATE_EXT_SET_PAT 2
>         __u64 extensions;
>  };
>
> @@ -3724,6 +3728,38 @@ struct drm_i915_gem_create_ext_protected_content {
>         __u32 flags;
>  };
>
> +/**
> + * struct drm_i915_gem_create_ext_set_pat - The
> + * I915_GEM_CREATE_EXT_SET_PAT extension.
> + *
> + * If this extension is provided, the specified caching policy (PAT index) is
> + * applied to the buffer object.
> + *
> + * Below is an example on how to create an object with specific caching policy:
> + *
> + * .. code-block:: C
> + *
> + *      struct drm_i915_gem_create_ext_set_pat set_pat_ext = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_SET_PAT },
> + *              .pat_index = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = PAGE_SIZE,
> + *              .extensions = (uintptr_t)&set_pat_ext,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + */
> +struct drm_i915_gem_create_ext_set_pat {
> +       /** @base: Extension link. See struct i915_user_extension. */
> +       struct i915_user_extension base;
> +       /** @pat_index: PAT index to be set */
> +       __u32 pat_index;
> +       /** @rsvd: reserved for future use */
> +       __u32 rsvd;
> +};
> +
>  /* ID of the protected content session managed by i915 when PXP is active */
>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-06  8:28   ` Das, Nirmoy
@ 2023-04-06 14:55     ` Yang, Fei
  2023-04-06 18:13       ` Das, Nirmoy
  0 siblings, 1 reply; 35+ messages in thread
From: Yang, Fei @ 2023-04-06 14:55 UTC (permalink / raw)
  To: Das, Nirmoy, intel-gfx; +Cc: Roper, Matthew D, De Marchi, Lucas, dri-devel

[-- Attachment #1: Type: text/plain, Size: 19845 bytes --]

> On 4/1/2023 8:38 AM, fei.yang@intel.com wrote:
>> From: Fei Yang <fei.yang@intel.com>
>>
>> On MTL, GT can no longer allocate on LLC - only the CPU can.
>> This, along with addition of support for ADM/L4 cache calls a
>> MOCS/PAT table update.
>> Also add PTE encode functions for MTL as it has different PAT
>> index definition than previous platforms.
>>
>> BSpec: 44509, 45101, 44235
>>
>> Cc: Matt Roper <matthew.d.roper@intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>> Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>> ---
>>   drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
>>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
>>   drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
>>   drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
>>   drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
>>   drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
>>   drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
>>   drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
>>   drivers/gpu/drm/i915/i915_pci.c          |  1 +
>>   9 files changed, 189 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
>> index b8027392144d..c5eacfdba1a5 100644
>> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
>> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
>> @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
>>        vm->vma_ops.bind_vma    = dpt_bind_vma;
>>        vm->vma_ops.unbind_vma  = dpt_unbind_vma;
>>
>> -     vm->pte_encode = gen8_ggtt_pte_encode;
>> +     vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
>>
>>        dpt->obj = dpt_obj;
>>        dpt->obj->is_dpt = true;
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> index 4daaa6f55668..4197b43150cc 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>>        return pte;
>>   }
>>
>> +static u64 mtl_pte_encode(dma_addr_t addr,
>> +                       enum i915_cache_level level,
>> +                       u32 flags)
>> +{
>> +     gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
>> +
>> +     if (unlikely(flags & PTE_READ_ONLY))
>> +             pte &= ~GEN8_PAGE_RW;
>> +
>> +     if (flags & PTE_LM)
>> +             pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
>> +
>> +     switch (level) {
>> +     case I915_CACHE_NONE:
>> +             pte |= GEN12_PPGTT_PTE_PAT1;
>> +             break;
>> +     case I915_CACHE_LLC:
>> +     case I915_CACHE_L3_LLC:
>> +             pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
>> +             break;
>> +     case I915_CACHE_WT:
>> +             pte |= GEN12_PPGTT_PTE_PAT0;
>> +             break;
>> +     }
>> +
>> +     return pte;
>> +}
>> +
>>   static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
>>   {
>>        struct drm_i915_private *i915 = ppgtt->vm.i915;
>> @@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>>                      u32 flags)
>>   {
>>        struct i915_page_directory *pd;
>> -     const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
>> +     const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
>>        gen8_pte_t *vaddr;
>>
>>        pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
>> @@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>>                                   enum i915_cache_level cache_level,
>>                                   u32 flags)
>>   {
>> -     const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
>> +     const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
>>        unsigned int rem = sg_dma_len(iter->sg);
>>        u64 start = vma_res->start;
>>
>> @@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>>        GEM_BUG_ON(pt->is_compact);
>>
>>        vaddr = px_vaddr(pt);
>> -     vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
>> +     vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
>>        drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
>>   }
>>
>> @@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>>        }
>>
>>        vaddr = px_vaddr(pt);
>> -     vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
>> +     vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
>>   }
>>
>>   static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
>> @@ -820,7 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>>                pte_flags |= PTE_LM;
>>
>>        vm->scratch[0]->encode =
>> -             gen8_pte_encode(px_dma(vm->scratch[0]),
>> +             vm->pte_encode(px_dma(vm->scratch[0]),
>>                                I915_CACHE_NONE, pte_flags);
>>
>>        for (i = 1; i <= vm->top; i++) {
>> @@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>>         */
>>        ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
>>
>> -     ppgtt->vm.pte_encode = gen8_pte_encode;
>> +     if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
>> +             ppgtt->vm.pte_encode = mtl_pte_encode;
>> +     else
>> +             ppgtt->vm.pte_encode = gen8_pte_encode;
>>
>>        ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
>>        ppgtt->vm.insert_entries = gen8_ppgtt_insert;
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> index f541d19264b4..6b8ce7f4d25a 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>>   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>>                         enum i915_cache_level level,
>>                         u32 flags);
>> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>> +                     unsigned int pat_index,
>> +                     u32 flags);
>>
>>   #endif
>> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> index 3c7f1ed92f5b..ba3109338aee 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>>        }
>>   }
>>
>> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>> +                     enum i915_cache_level level,
>> +                     u32 flags)
>> +{
>> +     gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
>> +
>> +     GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
>> +
>> +     if (flags & PTE_LM)
>> +             pte |= GEN12_GGTT_PTE_LM;
>> +
>> +     switch (level) {
>> +     case I915_CACHE_NONE:
>> +             pte |= MTL_GGTT_PTE_PAT1;
>> +             break;
>> +     case I915_CACHE_LLC:
>> +     case I915_CACHE_L3_LLC:
>> +             pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
>> +             break;
>> +     case I915_CACHE_WT:
>> +             pte |= MTL_GGTT_PTE_PAT0;
>> +             break;
>> +     }
>> +
>> +     return pte;
>> +}
>> +
>>   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>>                         enum i915_cache_level level,
>>                         u32 flags)
>> @@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>>        gen8_pte_t __iomem *pte =
>>                (gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>>
>> -     gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
>> +     gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
>>
>>        ggtt->invalidate(ggtt);
>>   }
>> @@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>>                                     enum i915_cache_level level,
>>                                     u32 flags)
>>   {
>> -     const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
>>        struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>> +     const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, flags);
>>        gen8_pte_t __iomem *gte;
>>        gen8_pte_t __iomem *end;
>>        struct sgt_iter iter;
>> @@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>>        ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
>>        ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
>>
>> -     ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>> +     if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
>> +             ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
>> +     else
>> +             ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>>
>>        return ggtt_probe_common(ggtt, size);
>>   }
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
>> index 4f436ba7a3c8..1e1b34e22cf5 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
>> @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
>>        }
>>   }
>>
>> +static void mtl_setup_private_ppat(struct intel_uncore *uncore)
>> +{
>> +     intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
>> +                        MTL_PPAT_L4_0_WB);
>> +     intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
>> +                        MTL_PPAT_L4_1_WT);
>> +     intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
>> +                        MTL_PPAT_L4_3_UC);
>> +     intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
>> +                        MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
>> +     intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
>> +                        MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
>> +
>> +     /*
>> +      * Remaining PAT entries are left at the hardware-default
>> +      * fully-cached setting
>> +      */
>> +}
>> +
>>   static void tgl_setup_private_ppat(struct intel_uncore *uncore)
>>   {
>>        /* TGL doesn't support LLC or AGE settings */
>> @@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
>>
>>        GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
>>
>> -     if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
>> +     if (IS_METEORLAKE(i915))
>> +             mtl_setup_private_ppat(uncore);
>
>
> Could you please sync this with DII. We should be programming PAT for
> media tile too.
>
> I have refactor this patch in DII along with taking care of media tile
> and I think we should
>
> get those changes here too.

I don't think the PAT index registers are multicasted for MTL. The registers
are at 0x4800 for the render tile and 0x384800 for the media tile, and they
get programmed in gt_init separately when iterating through each gt. No?

-Fei

> Regards,
>
> Nirmoy
>
>> +     else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
>>                xehp_setup_private_ppat(gt);
>>        else if (GRAPHICS_VER(i915) >= 12)
>>                tgl_setup_private_ppat(uncore);
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> index 69ce55f517f5..b632167eaf2e 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> @@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
>>   #define BYT_PTE_SNOOPED_BY_CPU_CACHES       REG_BIT(2)
>>   #define BYT_PTE_WRITEABLE           REG_BIT(1)
>>
>> +#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
>>   #define GEN12_PPGTT_PTE_LM  BIT_ULL(11)
>> +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
>> +#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
>> +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
>> +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
>>
>> -#define GEN12_GGTT_PTE_LM    BIT_ULL(1)
>> +#define GEN12_GGTT_PTE_LM            BIT_ULL(1)
>> +#define MTL_GGTT_PTE_PAT0            BIT_ULL(52)
>> +#define MTL_GGTT_PTE_PAT1            BIT_ULL(53)
>> +#define GEN12_GGTT_PTE_ADDR_MASK     GENMASK_ULL(45, 12)
>> +#define MTL_GGTT_PTE_PAT_MASK                GENMASK_ULL(53, 52)
>>
>>   #define GEN12_PDE_64K BIT(6)
>>   #define GEN12_PTE_PS64 BIT(8)
>> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
>>   #define GEN8_PDE_IPS_64K BIT(11)
>>   #define GEN8_PDE_PS_2M   BIT(7)
>>
>> +#define MTL_PPAT_L4_CACHE_POLICY_MASK        REG_GENMASK(3, 2)
>> +#define MTL_PAT_INDEX_COH_MODE_MASK  REG_GENMASK(1, 0)
>> +#define MTL_PPAT_L4_3_UC     REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
>> +#define MTL_PPAT_L4_1_WT     REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
>> +#define MTL_PPAT_L4_0_WB     REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
>> +#define MTL_3_COH_2W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
>> +#define MTL_2_COH_1W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
>> +#define MTL_0_COH_NON        REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
>> +
>>   enum i915_cache_level;
>>
>>   struct drm_i915_gem_object;
>> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
>> index 69b489e8dfed..89570f137b2c 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
>> @@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
>>   #define LE_COS(value)               ((value) << 15)
>>   #define LE_SSE(value)               ((value) << 17)
>>
>> +/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
>> +#define _L4_CACHEABILITY(value)      ((value) << 2)
>> +#define IG_PAT(value)                ((value) << 8)
>> +
>>   /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
>>   #define L3_ESC(value)               ((value) << 0)
>>   #define L3_SCC(value)               ((value) << 1)
>> @@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
>>   /* Helper defines */
>>   #define GEN9_NUM_MOCS_ENTRIES       64  /* 63-64 are reserved, but configured. */
>>   #define PVC_NUM_MOCS_ENTRIES        3
>> +#define MTL_NUM_MOCS_ENTRIES 16
>>
>>   /* (e)LLC caching options */
>>   /*
>> @@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
>>   #define L3_2_RESERVED               _L3_CACHEABILITY(2)
>>   #define L3_3_WB                     _L3_CACHEABILITY(3)
>>
>> +/* L4 caching options */
>> +#define L4_0_WB                      _L4_CACHEABILITY(0)
>> +#define L4_1_WT                      _L4_CACHEABILITY(1)
>> +#define L4_2_RESERVED                _L4_CACHEABILITY(2)
>> +#define L4_3_UC                      _L4_CACHEABILITY(3)
>> +
>>   #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
>>        [__idx] = { \
>>                .control_value = __control_value, \
>> @@ -416,6 +427,57 @@ static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
>>        MOCS_ENTRY(2, 0, L3_3_WB),
>>   };
>>
>> +static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
>> +     /* Error - Reserved for Non-Use */
>> +     MOCS_ENTRY(0,
>> +                IG_PAT(0),
>> +                L3_LKUP(1) | L3_3_WB),
>> +     /* Cached - L3 + L4 */
>> +     MOCS_ENTRY(1,
>> +                IG_PAT(1),
>> +                L3_LKUP(1) | L3_3_WB),
>> +     /* L4 - GO:L3 */
>> +     MOCS_ENTRY(2,
>> +                IG_PAT(1),
>> +                L3_LKUP(1) | L3_1_UC),
>> +     /* Uncached - GO:L3 */
>> +     MOCS_ENTRY(3,
>> +                IG_PAT(1) | L4_3_UC,
>> +                L3_LKUP(1) | L3_1_UC),
>> +     /* L4 - GO:Mem */
>> +     MOCS_ENTRY(4,
>> +                IG_PAT(1),
>> +                L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
>> +     /* Uncached - GO:Mem */
>> +     MOCS_ENTRY(5,
>> +                IG_PAT(1) | L4_3_UC,
>> +                L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
>> +     /* L4 - L3:NoLKUP; GO:L3 */
>> +     MOCS_ENTRY(6,
>> +                IG_PAT(1),
>> +                L3_1_UC),
>> +     /* Uncached - L3:NoLKUP; GO:L3 */
>> +     MOCS_ENTRY(7,
>> +                IG_PAT(1) | L4_3_UC,
>> +                L3_1_UC),
>> +     /* L4 - L3:NoLKUP; GO:Mem */
>> +     MOCS_ENTRY(8,
>> +                IG_PAT(1),
>> +                L3_GLBGO(1) | L3_1_UC),
>> +     /* Uncached - L3:NoLKUP; GO:Mem */
>> +     MOCS_ENTRY(9,
>> +                IG_PAT(1) | L4_3_UC,
>> +                L3_GLBGO(1) | L3_1_UC),
>> +     /* Display - L3; L4:WT */
>> +     MOCS_ENTRY(14,
>> +                IG_PAT(1) | L4_1_WT,
>> +                L3_LKUP(1) | L3_3_WB),
>> +     /* CCS - Non-Displayable */
>> +     MOCS_ENTRY(15,
>> +                IG_PAT(1),
>> +                L3_GLBGO(1) | L3_1_UC),
>> +};
>> +
>>   enum {
>>        HAS_GLOBAL_MOCS = BIT(0),
>>        HAS_ENGINE_MOCS = BIT(1),
>> @@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
>>        memset(table, 0, sizeof(struct drm_i915_mocs_table));
>>
>>        table->unused_entries_index = I915_MOCS_PTE;
>> -     if (IS_PONTEVECCHIO(i915)) {
>> +     if (IS_METEORLAKE(i915)) {
>> +             table->size = ARRAY_SIZE(mtl_mocs_table);
>> +             table->table = mtl_mocs_table;
>> +             table->n_entries = MTL_NUM_MOCS_ENTRIES;
>> +             table->uc_index = 9;
>> +             table->unused_entries_index = 1;
>> +     } else if (IS_PONTEVECCHIO(i915)) {
>>                table->size = ARRAY_SIZE(pvc_mocs_table);
>>                table->table = pvc_mocs_table;
>>                table->n_entries = PVC_NUM_MOCS_ENTRIES;
>> @@ -646,9 +714,9 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
>>                init_l3cc_table(engine->gt, &table);
>>   }
>>
>> -static u32 global_mocs_offset(void)
>> +static u32 global_mocs_offset(struct intel_gt *gt)
>>   {
>> -     return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
>> +     return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + gt->uncore->gsi_offset;
>>   }
>>
>>   void intel_set_mocs_index(struct intel_gt *gt)
>> @@ -671,7 +739,7 @@ void intel_mocs_init(struct intel_gt *gt)
>>         */
>>        flags = get_mocs_settings(gt->i915, &table);
>>        if (flags & HAS_GLOBAL_MOCS)
>> -             __init_mocs_table(gt->uncore, &table, global_mocs_offset());
>> +             __init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
>>
>>        /*
>>         * Initialize the L3CC table as part of mocs initalization to make
>> diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
>> index ca009a6a13bd..730796346514 100644
>> --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
>> +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
>> @@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
>>                return 0;
>>
>>        if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
>> -             addr = global_mocs_offset();
>> +             addr = global_mocs_offset(rq->engine->gt);
>>        else
>>                addr = mocs_offset(rq->engine);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>> index 621730b6551c..480b128499ae 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -1149,6 +1149,7 @@ static const struct intel_device_info mtl_info = {
>>        .has_flat_ccs = 0,
>>        .has_gmd_id = 1,
>>        .has_guc_deprivilege = 1,
>> +     .has_llc = 0,
>>        .has_mslice_steering = 0,
>>        .has_snoop = 1,
>>        .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,


[-- Attachment #2: Type: text/html, Size: 46624 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-06 14:55     ` Yang, Fei
@ 2023-04-06 18:13       ` Das, Nirmoy
  0 siblings, 0 replies; 35+ messages in thread
From: Das, Nirmoy @ 2023-04-06 18:13 UTC (permalink / raw)
  To: Yang, Fei, intel-gfx; +Cc: Roper, Matthew D, De Marchi, Lucas, dri-devel

[-- Attachment #1: Type: text/plain, Size: 20698 bytes --]

Hi Fei,

On 4/6/2023 4:55 PM, Yang, Fei wrote:
> > On 4/1/2023 8:38 AM, fei.yang@intel.com wrote:
> >> From: Fei Yang <fei.yang@intel.com>
> >>
> >> On MTL, GT can no longer allocate on LLC - only the CPU can.
> >> This, along with addition of support for ADM/L4 cache calls a
> >> MOCS/PAT table update.
> >> Also add PTE encode functions for MTL as it has different PAT
> >> index definition than previous platforms.
> >>
> >> BSpec: 44509, 45101, 44235
> >>
> >> Cc: Matt Roper <matthew.d.roper@intel.com>
> >> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> >> Signed-off-by: Madhumitha Tolakanahalli Pradeep 
> <madhumitha.tolakanahalli.pradeep@intel.com>
> >> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> >> Signed-off-by: Fei Yang <fei.yang@intel.com>
> >> ---
> >> drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
> >> drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
> >> drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
> >> drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
> >> drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
> >> drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
> >> drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
> >> drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
> >> drivers/gpu/drm/i915/i915_pci.c          |  1 +
> >>   9 files changed, 189 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
> b/drivers/gpu/drm/i915/display/intel_dpt.c
> >> index b8027392144d..c5eacfdba1a5 100644
> >> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> >> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> >> @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
> >>  vm->vma_ops.bind_vma    = dpt_bind_vma;
> >>  vm->vma_ops.unbind_vma  = dpt_unbind_vma;
> >>
> >> -     vm->pte_encode = gen8_ggtt_pte_encode;
> >> +     vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
> >>
> >>        dpt->obj = dpt_obj;
> >>  dpt->obj->is_dpt = true;
> >> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >> index 4daaa6f55668..4197b43150cc 100644
> >> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> >> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
> >>        return pte;
> >>   }
> >>
> >> +static u64 mtl_pte_encode(dma_addr_t addr,
> >> + enum i915_cache_level level,
> >> +                       u32 flags)
> >> +{
> >> +     gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
> >> +
> >> +     if (unlikely(flags & PTE_READ_ONLY))
> >> +             pte &= ~GEN8_PAGE_RW;
> >> +
> >> +     if (flags & PTE_LM)
> >> +             pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
> >> +
> >> +     switch (level) {
> >> +     case I915_CACHE_NONE:
> >> +             pte |= GEN12_PPGTT_PTE_PAT1;
> >> +             break;
> >> +     case I915_CACHE_LLC:
> >> +     case I915_CACHE_L3_LLC:
> >> +             pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
> >> +             break;
> >> +     case I915_CACHE_WT:
> >> +             pte |= GEN12_PPGTT_PTE_PAT0;
> >> +             break;
> >> +     }
> >> +
> >> +     return pte;
> >> +}
> >> +
> >>   static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool 
> create)
> >>   {
> >>        struct drm_i915_private *i915 = ppgtt->vm.i915;
> >> @@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
> >>                      u32 flags)
> >>   {
> >>        struct i915_page_directory *pd;
> >> -     const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, 
> flags);
> >> +     const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, 
> cache_level, flags);
> >>        gen8_pte_t *vaddr;
> >>
> >>        pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
> >> @@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct 
> i915_address_space *vm,
> >>       enum i915_cache_level cache_level,
> >>       u32 flags)
> >>   {
> >> -     const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, 
> flags);
> >> +     const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, 
> flags);
> >>        unsigned int rem = sg_dma_len(iter->sg);
> >>        u64 start = vma_res->start;
> >>
> >> @@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct 
> i915_address_space *vm,
> >>  GEM_BUG_ON(pt->is_compact);
> >>
> >>        vaddr = px_vaddr(pt);
> >> - vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
> >> + vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
> >>  drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
> >>   }
> >>
> >> @@ -773,7 +801,7 @@ static void 
> __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
> >>        }
> >>
> >>        vaddr = px_vaddr(pt);
> >> - vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, 
> flags);
> >> + vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, 
> flags);
> >>   }
> >>
> >>   static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
> >> @@ -820,7 +848,7 @@ static int gen8_init_scratch(struct 
> i915_address_space *vm)
> >>                pte_flags |= PTE_LM;
> >>
> >>  vm->scratch[0]->encode =
> >> - gen8_pte_encode(px_dma(vm->scratch[0]),
> >> + vm->pte_encode(px_dma(vm->scratch[0]),
> >>    I915_CACHE_NONE, pte_flags);
> >>
> >>        for (i = 1; i <= vm->top; i++) {
> >> @@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct 
> intel_gt *gt,
> >>         */
> >>  ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
> >>
> >> - ppgtt->vm.pte_encode = gen8_pte_encode;
> >> +     if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
> >> + ppgtt->vm.pte_encode = mtl_pte_encode;
> >> +     else
> >> + ppgtt->vm.pte_encode = gen8_pte_encode;
> >>
> >>  ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
> >>  ppgtt->vm.insert_entries = gen8_ppgtt_insert;
> >> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> >> index f541d19264b4..6b8ce7f4d25a 100644
> >> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> >> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> >> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct 
> intel_gt *gt,
> >>   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> >> enum i915_cache_level level,
> >>                         u32 flags);
> >> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> >> + unsigned int pat_index,
> >> +                     u32 flags);
> >>
> >>   #endif
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> >> index 3c7f1ed92f5b..ba3109338aee 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> >> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct 
> i915_ggtt *ggtt)
> >>        }
> >>   }
> >>
> >> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
> >> +                     enum i915_cache_level level,
> >> +                     u32 flags)
> >> +{
> >> +     gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
> >> +
> >> +     GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
> >> +
> >> +     if (flags & PTE_LM)
> >> +             pte |= GEN12_GGTT_PTE_LM;
> >> +
> >> +     switch (level) {
> >> +     case I915_CACHE_NONE:
> >> +             pte |= MTL_GGTT_PTE_PAT1;
> >> +             break;
> >> +     case I915_CACHE_LLC:
> >> +     case I915_CACHE_L3_LLC:
> >> +             pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
> >> +             break;
> >> +     case I915_CACHE_WT:
> >> +             pte |= MTL_GGTT_PTE_PAT0;
> >> +             break;
> >> +     }
> >> +
> >> +     return pte;
> >> +}
> >> +
> >>   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> >> enum i915_cache_level level,
> >>                         u32 flags)
> >> @@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct 
> i915_address_space *vm,
> >>        gen8_pte_t __iomem *pte =
> >>                (gen8_pte_t __iomem *)ggtt->gsm + offset / 
> I915_GTT_PAGE_SIZE;
> >>
> >> -     gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
> >> +     gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
> >>
> >>  ggtt->invalidate(ggtt);
> >>   }
> >> @@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct 
> i915_address_space *vm,
> >>         enum i915_cache_level level,
> >>         u32 flags)
> >>   {
> >> -     const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, 
> flags);
> >>        struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> >> +     const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, 
> flags);
> >>        gen8_pte_t __iomem *gte;
> >>        gen8_pte_t __iomem *end;
> >>        struct sgt_iter iter;
> >> @@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt 
> *ggtt)
> >>  ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
> >>  ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
> >>
> >> - ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
> >> +     if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
> >> + ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
> >> +     else
> >> + ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
> >>
> >>        return ggtt_probe_common(ggtt, size);
> >>   }
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
> b/drivers/gpu/drm/i915/gt/intel_gtt.c
> >> index 4f436ba7a3c8..1e1b34e22cf5 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> >> @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
> >>        }
> >>   }
> >>
> >> +static void mtl_setup_private_ppat(struct intel_uncore *uncore)
> >> +{
> >> + intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
> >> +  MTL_PPAT_L4_0_WB);
> >> + intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
> >> +  MTL_PPAT_L4_1_WT);
> >> + intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
> >> +  MTL_PPAT_L4_3_UC);
> >> + intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
> >> +  MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
> >> + intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
> >> +  MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
> >> +
> >> +     /*
> >> +      * Remaining PAT entries are left at the hardware-default
> >> +      * fully-cached setting
> >> +      */
> >> +}
> >> +
> >>   static void tgl_setup_private_ppat(struct intel_uncore *uncore)
> >>   {
> >>        /* TGL doesn't support LLC or AGE settings */
> >> @@ -603,7 +622,9 @@ void setup_private_pat(struct intel_gt *gt)
> >>
> >>  GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
> >>
> >> -     if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
> >> +     if (IS_METEORLAKE(i915))
> >> + mtl_setup_private_ppat(uncore);
> >
> >
> > Could you please sync this with DII. We should be programming PAT for
> > media tile too.
> >
> > I have refactor this patch in DII along with taking care of media tile
> > and I think we should
> >
> > get those changes here too.
>
> I don't think the PAT index registers are multicasted for MTL. The 
> registers
> are at 0x4800 for the render tile and 0x384800 for the media tile, and 
> they
> get programmed in gt_init separately when iterating through each gt. No?


Primary GT is multicasted but the media one is not for MTL.
Added you to a internal email thread where Matt clarified this.


Regards,

Nirmoy

>
> -Fei
>
> > Regards,
> >
> > Nirmoy
> >
> >> +     else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
> >>  xehp_setup_private_ppat(gt);
> >>        else if (GRAPHICS_VER(i915) >= 12)
> >>  tgl_setup_private_ppat(uncore);
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
> b/drivers/gpu/drm/i915/gt/intel_gtt.h
> >> index 69ce55f517f5..b632167eaf2e 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> >> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> >> @@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
> >>   #define BYT_PTE_SNOOPED_BY_CPU_CACHES       REG_BIT(2)
> >>   #define BYT_PTE_WRITEABLE           REG_BIT(1)
> >>
> >> +#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
> >>   #define GEN12_PPGTT_PTE_LM  BIT_ULL(11)
> >> +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
> >> +#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
> >> +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
> >> +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
> >>
> >> -#define GEN12_GGTT_PTE_LM    BIT_ULL(1)
> >> +#define GEN12_GGTT_PTE_LM            BIT_ULL(1)
> >> +#define MTL_GGTT_PTE_PAT0            BIT_ULL(52)
> >> +#define MTL_GGTT_PTE_PAT1            BIT_ULL(53)
> >> +#define GEN12_GGTT_PTE_ADDR_MASK     GENMASK_ULL(45, 12)
> >> +#define MTL_GGTT_PTE_PAT_MASK                GENMASK_ULL(53, 52)
> >>
> >>   #define GEN12_PDE_64K BIT(6)
> >>   #define GEN12_PTE_PS64 BIT(8)
> >> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
> >>   #define GEN8_PDE_IPS_64K BIT(11)
> >>   #define GEN8_PDE_PS_2M BIT(7)
> >>
> >> +#define MTL_PPAT_L4_CACHE_POLICY_MASK        REG_GENMASK(3, 2)
> >> +#define MTL_PAT_INDEX_COH_MODE_MASK  REG_GENMASK(1, 0)
> >> +#define MTL_PPAT_L4_3_UC   
> REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
> >> +#define MTL_PPAT_L4_1_WT   
> REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
> >> +#define MTL_PPAT_L4_0_WB   
> REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
> >> +#define MTL_3_COH_2W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
> >> +#define MTL_2_COH_1W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
> >> +#define MTL_0_COH_NON   
>  REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
> >> +
> >>   enum i915_cache_level;
> >>
> >>   struct drm_i915_gem_object;
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c 
> b/drivers/gpu/drm/i915/gt/intel_mocs.c
> >> index 69b489e8dfed..89570f137b2c 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> >> @@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
> >>   #define LE_COS(value)           ((value) << 15)
> >>   #define LE_SSE(value)           ((value) << 17)
> >>
> >> +/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
> >> +#define _L4_CACHEABILITY(value)      ((value) << 2)
> >> +#define IG_PAT(value)            ((value) << 8)
> >> +
> >>   /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries 
> per word */
> >>   #define L3_ESC(value)           ((value) << 0)
> >>   #define L3_SCC(value)           ((value) << 1)
> >> @@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
> >>   /* Helper defines */
> >>   #define GEN9_NUM_MOCS_ENTRIES       64  /* 63-64 are reserved, 
> but configured. */
> >>   #define PVC_NUM_MOCS_ENTRIES        3
> >> +#define MTL_NUM_MOCS_ENTRIES 16
> >>
> >>   /* (e)LLC caching options */
> >>   /*
> >> @@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
> >>   #define L3_2_RESERVED           _L3_CACHEABILITY(2)
> >>   #define L3_3_WB           _L3_CACHEABILITY(3)
> >>
> >> +/* L4 caching options */
> >> +#define L4_0_WB            _L4_CACHEABILITY(0)
> >> +#define L4_1_WT            _L4_CACHEABILITY(1)
> >> +#define L4_2_RESERVED            _L4_CACHEABILITY(2)
> >> +#define L4_3_UC            _L4_CACHEABILITY(3)
> >> +
> >>   #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
> >>        [__idx] = { \
> >>  .control_value = __control_value, \
> >> @@ -416,6 +427,57 @@ static const struct drm_i915_mocs_entry 
> pvc_mocs_table[] = {
> >>        MOCS_ENTRY(2, 0, L3_3_WB),
> >>   };
> >>
> >> +static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
> >> +     /* Error - Reserved for Non-Use */
> >> +     MOCS_ENTRY(0,
> >> +                IG_PAT(0),
> >> +                L3_LKUP(1) | L3_3_WB),
> >> +     /* Cached - L3 + L4 */
> >> +     MOCS_ENTRY(1,
> >> +                IG_PAT(1),
> >> +                L3_LKUP(1) | L3_3_WB),
> >> +     /* L4 - GO:L3 */
> >> +     MOCS_ENTRY(2,
> >> +                IG_PAT(1),
> >> +                L3_LKUP(1) | L3_1_UC),
> >> +     /* Uncached - GO:L3 */
> >> +     MOCS_ENTRY(3,
> >> +                IG_PAT(1) | L4_3_UC,
> >> +                L3_LKUP(1) | L3_1_UC),
> >> +     /* L4 - GO:Mem */
> >> +     MOCS_ENTRY(4,
> >> +                IG_PAT(1),
> >> +                L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> >> +     /* Uncached - GO:Mem */
> >> +     MOCS_ENTRY(5,
> >> +                IG_PAT(1) | L4_3_UC,
> >> +                L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
> >> +     /* L4 - L3:NoLKUP; GO:L3 */
> >> +     MOCS_ENTRY(6,
> >> +                IG_PAT(1),
> >> +                L3_1_UC),
> >> +     /* Uncached - L3:NoLKUP; GO:L3 */
> >> +     MOCS_ENTRY(7,
> >> +                IG_PAT(1) | L4_3_UC,
> >> +                L3_1_UC),
> >> +     /* L4 - L3:NoLKUP; GO:Mem */
> >> +     MOCS_ENTRY(8,
> >> +                IG_PAT(1),
> >> +  L3_GLBGO(1) | L3_1_UC),
> >> +     /* Uncached - L3:NoLKUP; GO:Mem */
> >> +     MOCS_ENTRY(9,
> >> +                IG_PAT(1) | L4_3_UC,
> >> +  L3_GLBGO(1) | L3_1_UC),
> >> +     /* Display - L3; L4:WT */
> >> +     MOCS_ENTRY(14,
> >> +                IG_PAT(1) | L4_1_WT,
> >> +                L3_LKUP(1) | L3_3_WB),
> >> +     /* CCS - Non-Displayable */
> >> +     MOCS_ENTRY(15,
> >> +                IG_PAT(1),
> >> +  L3_GLBGO(1) | L3_1_UC),
> >> +};
> >> +
> >>   enum {
> >>        HAS_GLOBAL_MOCS = BIT(0),
> >>        HAS_ENGINE_MOCS = BIT(1),
> >> @@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const 
> struct drm_i915_private *i915,
> >>        memset(table, 0, sizeof(struct drm_i915_mocs_table));
> >>
> >>  table->unused_entries_index = I915_MOCS_PTE;
> >> -     if (IS_PONTEVECCHIO(i915)) {
> >> +     if (IS_METEORLAKE(i915)) {
> >> + table->size = ARRAY_SIZE(mtl_mocs_table);
> >> + table->table = mtl_mocs_table;
> >> + table->n_entries = MTL_NUM_MOCS_ENTRIES;
> >> + table->uc_index = 9;
> >> + table->unused_entries_index = 1;
> >> +     } else if (IS_PONTEVECCHIO(i915)) {
> >>  table->size = ARRAY_SIZE(pvc_mocs_table);
> >>  table->table = pvc_mocs_table;
> >>  table->n_entries = PVC_NUM_MOCS_ENTRIES;
> >> @@ -646,9 +714,9 @@ void intel_mocs_init_engine(struct 
> intel_engine_cs *engine)
> >>  init_l3cc_table(engine->gt, &table);
> >>   }
> >>
> >> -static u32 global_mocs_offset(void)
> >> +static u32 global_mocs_offset(struct intel_gt *gt)
> >>   {
> >> -     return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
> >> +     return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + 
> gt->uncore->gsi_offset;
> >>   }
> >>
> >>   void intel_set_mocs_index(struct intel_gt *gt)
> >> @@ -671,7 +739,7 @@ void intel_mocs_init(struct intel_gt *gt)
> >>         */
> >>        flags = get_mocs_settings(gt->i915, &table);
> >>        if (flags & HAS_GLOBAL_MOCS)
> >> - __init_mocs_table(gt->uncore, &table, global_mocs_offset());
> >> + __init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
> >>
> >>        /*
> >>         * Initialize the L3CC table as part of mocs initalization 
> to make
> >> diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c 
> b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> >> index ca009a6a13bd..730796346514 100644
> >> --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
> >> +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
> >> @@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
> >>                return 0;
> >>
> >>        if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
> >> -             addr = global_mocs_offset();
> >> +             addr = global_mocs_offset(rq->engine->gt);
> >>        else
> >>                addr = mocs_offset(rq->engine);
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
> b/drivers/gpu/drm/i915/i915_pci.c
> >> index 621730b6551c..480b128499ae 100644
> >> --- a/drivers/gpu/drm/i915/i915_pci.c
> >> +++ b/drivers/gpu/drm/i915/i915_pci.c
> >> @@ -1149,6 +1149,7 @@ static const struct intel_device_info 
> mtl_info = {
> >>        .has_flat_ccs = 0,
> >>        .has_gmd_id = 1,
> >>        .has_guc_deprivilege = 1,
> >> +     .has_llc = 0,
> >>        .has_mslice_steering = 0,
> >>        .has_snoop = 1,
> >>  .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
>

[-- Attachment #2: Type: text/html, Size: 49103 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
  2023-04-06  8:16     ` Andi Shyti
@ 2023-04-06 18:22       ` Yang, Fei
  0 siblings, 0 replies; 35+ messages in thread
From: Yang, Fei @ 2023-04-06 18:22 UTC (permalink / raw)
  To: Andi Shyti, Jani Nikula
  Cc: De Marchi, Lucas, intel-gfx, Roper, Matthew D, dri-devel

>Subject: Re: [Intel-gfx] [PATCH 1/7] drm/i915/mtl: Define MOCS and PAT tables for MTL
>
> Hi Fei,
>
> On Mon, Apr 03, 2023 at 03:50:26PM +0300, Jani Nikula wrote:
>> On Fri, 31 Mar 2023, fei.yang@intel.com wrote:
>>> From: Fei Yang <fei.yang@intel.com>
>>>
>>> On MTL, GT can no longer allocate on LLC - only the CPU can.
>>> This, along with addition of support for ADM/L4 cache calls a 
>>> MOCS/PAT table update.
>>> Also add PTE encode functions for MTL as it has different PAT index 
>>> definition than previous platforms.
>> 
>> As a general observation, turning something into a function pointer 
>> and extending it to more platforms should be two separate changes.
>
> Agree with Jani. Fei, would you mind splitting this patch? It eases the review, as well.

Yes, I'm working on this. Still need to address another comment from Ville.
Will send an update soon.

>Thanks,
>Andi
>
>> BR,
>> Jani.
>> 
>>>
>>> BSpec: 44509, 45101, 44235
>>>
>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>> Signed-off-by: Madhumitha Tolakanahalli Pradeep 
>>> <madhumitha.tolakanahalli.pradeep@intel.com>
>>> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
>>> Signed-off-by: Fei Yang <fei.yang@intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
>>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c     | 43 ++++++++++++--
>>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h     |  3 +
>>>  drivers/gpu/drm/i915/gt/intel_ggtt.c     | 36 ++++++++++-
>>>  drivers/gpu/drm/i915/gt/intel_gtt.c      | 23 ++++++-
>>>  drivers/gpu/drm/i915/gt/intel_gtt.h      | 20 ++++++-
>>>  drivers/gpu/drm/i915/gt/intel_mocs.c     | 76 ++++++++++++++++++++++--
>>>  drivers/gpu/drm/i915/gt/selftest_mocs.c  |  2 +-
>>>  drivers/gpu/drm/i915/i915_pci.c          |  1 +
>>>  9 files changed, 189 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
>>> b/drivers/gpu/drm/i915/display/intel_dpt.c
>>> index b8027392144d..c5eacfdba1a5 100644
>>> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
>>> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
>>> @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
>>>  	vm->vma_ops.bind_vma    = dpt_bind_vma;
>>>  	vm->vma_ops.unbind_vma  = dpt_unbind_vma;
>>>  
>>> -	vm->pte_encode = gen8_ggtt_pte_encode;
>>> +	vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
>>>  
>>>  	dpt->obj = dpt_obj;
>>>  	dpt->obj->is_dpt = true;
>>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
>>> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>>> index 4daaa6f55668..4197b43150cc 100644
>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>>> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>>>  	return pte;
>>>  }
>>>  
>>> +static u64 mtl_pte_encode(dma_addr_t addr,
>>> +			  enum i915_cache_level level,
>>> +			  u32 flags)
>>> +{
>>> +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
>>> +
>>> +	if (unlikely(flags & PTE_READ_ONLY))
>>> +		pte &= ~GEN8_PAGE_RW;
>>> +
>>> +	if (flags & PTE_LM)
>>> +		pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
>>> +
>>> +	switch (level) {
>>> +	case I915_CACHE_NONE:
>>> +		pte |= GEN12_PPGTT_PTE_PAT1;
>>> +		break;
>>> +	case I915_CACHE_LLC:
>>> +	case I915_CACHE_L3_LLC:
>>> +		pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
>>> +		break;
>>> +	case I915_CACHE_WT:
>>> +		pte |= GEN12_PPGTT_PTE_PAT0;
>>> +		break;
>>> +	}
>>> +
>>> +	return pte;
>>> +}
>>> +
>>>  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool 
>>> create)  {
>>>  	struct drm_i915_private *i915 = ppgtt->vm.i915; @@ -427,7 +455,7 
>>> @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>>>  		      u32 flags)
>>>  {
>>>  	struct i915_page_directory *pd;
>>> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
>>> +	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, 
>>> +flags);
>>>  	gen8_pte_t *vaddr;
>>>  
>>>  	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2)); @@ -580,7 +608,7 
>>> @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
>>>  				   enum i915_cache_level cache_level,
>>>  				   u32 flags)
>>>  {
>>> -	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
>>> +	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, 
>>> +flags);
>>>  	unsigned int rem = sg_dma_len(iter->sg);
>>>  	u64 start = vma_res->start;
>>>  
>>> @@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
>>>  	GEM_BUG_ON(pt->is_compact);
>>>  
>>>  	vaddr = px_vaddr(pt);
>>> -	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
>>> +	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
>>>  	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], 
>>> sizeof(*vaddr));  }
>>>  
>>> @@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
>>>  	}
>>>  
>>>  	vaddr = px_vaddr(pt);
>>> -	vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
>>> +	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, 
>>> +flags);
>>>  }
>>>  
>>>  static void xehpsdv_ppgtt_insert_entry(struct i915_address_space 
>>> *vm, @@ -820,7 +848,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
>>>  		pte_flags |= PTE_LM;
>>>  
>>>  	vm->scratch[0]->encode =
>>> -		gen8_pte_encode(px_dma(vm->scratch[0]),
>>> +		vm->pte_encode(px_dma(vm->scratch[0]),
>>>  				I915_CACHE_NONE, pte_flags);
>>>  
>>>  	for (i = 1; i <= vm->top; i++) {
>>> @@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
>>>  	 */
>>>  	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
>>>  
>>> -	ppgtt->vm.pte_encode = gen8_pte_encode;
>>> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
>>> +		ppgtt->vm.pte_encode = mtl_pte_encode;
>>> +	else
>>> +		ppgtt->vm.pte_encode = gen8_pte_encode;
>>>  
>>>  	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
>>>  	ppgtt->vm.insert_entries = gen8_ppgtt_insert; diff --git 
>>> a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h 
>>> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>>> index f541d19264b4..6b8ce7f4d25a 100644
>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
>>> @@ -18,5 +18,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct 
>>> intel_gt *gt,
>>>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>>>  			 enum i915_cache_level level,
>>>  			 u32 flags);
>>> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>>> +			unsigned int pat_index,
>>> +			u32 flags);
>>>  
>>>  #endif
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
>>> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>>> index 3c7f1ed92f5b..ba3109338aee 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
>>> @@ -220,6 +220,33 @@ static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
>>>  	}
>>>  }
>>>  
>>> +u64 mtl_ggtt_pte_encode(dma_addr_t addr,
>>> +			enum i915_cache_level level,
>>> +			u32 flags)
>>> +{
>>> +	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
>>> +
>>> +	GEM_BUG_ON(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
>>> +
>>> +	if (flags & PTE_LM)
>>> +		pte |= GEN12_GGTT_PTE_LM;
>>> +
>>> +	switch (level) {
>>> +	case I915_CACHE_NONE:
>>> +		pte |= MTL_GGTT_PTE_PAT1;
>>> +		break;
>>> +	case I915_CACHE_LLC:
>>> +	case I915_CACHE_L3_LLC:
>>> +		pte |= MTL_GGTT_PTE_PAT0 | MTL_GGTT_PTE_PAT1;
>>> +		break;
>>> +	case I915_CACHE_WT:
>>> +		pte |= MTL_GGTT_PTE_PAT0;
>>> +		break;
>>> +	}
>>> +
>>> +	return pte;
>>> +}
>>> +
>>>  u64 gen8_ggtt_pte_encode(dma_addr_t addr,
>>>  			 enum i915_cache_level level,
>>>  			 u32 flags)
>>> @@ -247,7 +274,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>>>  	gen8_pte_t __iomem *pte =
>>>  		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
>>>  
>>> -	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
>>> +	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, level, flags));
>>>  
>>>  	ggtt->invalidate(ggtt);
>>>  }
>>> @@ -257,8 +284,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>>>  				     enum i915_cache_level level,
>>>  				     u32 flags)
>>>  {
>>> -	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
>>>  	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
>>> +	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, level, 
>>> +flags);
>>>  	gen8_pte_t __iomem *gte;
>>>  	gen8_pte_t __iomem *end;
>>>  	struct sgt_iter iter;
>>> @@ -981,7 +1008,10 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
>>>  	ggtt->vm.vma_ops.bind_vma    = intel_ggtt_bind_vma;
>>>  	ggtt->vm.vma_ops.unbind_vma  = intel_ggtt_unbind_vma;
>>>  
>>> -	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>>> +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
>>> +		ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
>>> +	else
>>> +		ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
>>>  
>>>  	return ggtt_probe_common(ggtt, size);  } diff --git 
>>> a/drivers/gpu/drm/i915/gt/intel_gtt.c 
>>> b/drivers/gpu/drm/i915/gt/intel_gtt.c
>>> index 4f436ba7a3c8..1e1b34e22cf5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
>>> @@ -468,6 +468,25 @@ void gtt_write_workarounds(struct intel_gt *gt)
>>>  	}
>>>  }
>>>  
>>> +static void mtl_setup_private_ppat(struct intel_uncore *uncore) {
>>> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(0),
>>> +			   MTL_PPAT_L4_0_WB);
>>> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(1),
>>> +			   MTL_PPAT_L4_1_WT);
>>> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(2),
>>> +			   MTL_PPAT_L4_3_UC);
>>> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(3),
>>> +			   MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
>>> +	intel_uncore_write(uncore, GEN12_PAT_INDEX(4),
>>> +			   MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
>>> +
>>> +	/*
>>> +	 * Remaining PAT entries are left at the hardware-default
>>> +	 * fully-cached setting
>>> +	 */
>>> +}
>>> +
>>>  static void tgl_setup_private_ppat(struct intel_uncore *uncore)  {
>>>  	/* TGL doesn't support LLC or AGE settings */ @@ -603,7 +622,9 @@ 
>>> void setup_private_pat(struct intel_gt *gt)
>>>  
>>>  	GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
>>>  
>>> -	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
>>> +	if (IS_METEORLAKE(i915))
>>> +		mtl_setup_private_ppat(uncore);
>>> +	else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
>>>  		xehp_setup_private_ppat(gt);
>>>  	else if (GRAPHICS_VER(i915) >= 12)
>>>  		tgl_setup_private_ppat(uncore);
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
>>> b/drivers/gpu/drm/i915/gt/intel_gtt.h
>>> index 69ce55f517f5..b632167eaf2e 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
>>> @@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
>>>  #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
>>>  #define BYT_PTE_WRITEABLE		REG_BIT(1)
>>>  
>>> +#define GEN12_PPGTT_PTE_PAT3    BIT_ULL(62)
>>>  #define GEN12_PPGTT_PTE_LM	BIT_ULL(11)
>>> +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
>>> +#define GEN12_PPGTT_PTE_NC      BIT_ULL(5)
>>> +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
>>> +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
>>>  
>>> -#define GEN12_GGTT_PTE_LM	BIT_ULL(1)
>>> +#define GEN12_GGTT_PTE_LM		BIT_ULL(1)
>>> +#define MTL_GGTT_PTE_PAT0		BIT_ULL(52)
>>> +#define MTL_GGTT_PTE_PAT1		BIT_ULL(53)
>>> +#define GEN12_GGTT_PTE_ADDR_MASK	GENMASK_ULL(45, 12)
>>> +#define MTL_GGTT_PTE_PAT_MASK		GENMASK_ULL(53, 52)
>>>  
>>>  #define GEN12_PDE_64K BIT(6)
>>>  #define GEN12_PTE_PS64 BIT(8)
>>> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;  #define 
>>> GEN8_PDE_IPS_64K BIT(11)
>>>  #define GEN8_PDE_PS_2M   BIT(7)
>>>  
>>> +#define MTL_PPAT_L4_CACHE_POLICY_MASK	REG_GENMASK(3, 2)
>>> +#define MTL_PAT_INDEX_COH_MODE_MASK	REG_GENMASK(1, 0)
>>> +#define MTL_PPAT_L4_3_UC	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
>>> +#define MTL_PPAT_L4_1_WT	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
>>> +#define MTL_PPAT_L4_0_WB	REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
>>> +#define MTL_3_COH_2W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
>>> +#define MTL_2_COH_1W	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
>>> +#define MTL_0_COH_NON	REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
>>> +
>>>  enum i915_cache_level;
>>>  
>>>  struct drm_i915_gem_object;
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c 
>>> b/drivers/gpu/drm/i915/gt/intel_mocs.c
>>> index 69b489e8dfed..89570f137b2c 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_mocs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
>>> @@ -40,6 +40,10 @@ struct drm_i915_mocs_table {
>>>  #define LE_COS(value)		((value) << 15)
>>>  #define LE_SSE(value)		((value) << 17)
>>>  
>>> +/* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */
>>> +#define _L4_CACHEABILITY(value)	((value) << 2)
>>> +#define IG_PAT(value)		((value) << 8)
>>> +
>>>  /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
>>>  #define L3_ESC(value)		((value) << 0)
>>>  #define L3_SCC(value)		((value) << 1)
>>> @@ -50,6 +54,7 @@ struct drm_i915_mocs_table {
>>>  /* Helper defines */
>>>  #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
>>>  #define PVC_NUM_MOCS_ENTRIES	3
>>> +#define MTL_NUM_MOCS_ENTRIES	16
>>>  
>>>  /* (e)LLC caching options */
>>>  /*
>>> @@ -73,6 +78,12 @@ struct drm_i915_mocs_table {
>>>  #define L3_2_RESERVED		_L3_CACHEABILITY(2)
>>>  #define L3_3_WB			_L3_CACHEABILITY(3)
>>>  
>>> +/* L4 caching options */
>>> +#define L4_0_WB			_L4_CACHEABILITY(0)
>>> +#define L4_1_WT			_L4_CACHEABILITY(1)
>>> +#define L4_2_RESERVED		_L4_CACHEABILITY(2)
>>> +#define L4_3_UC			_L4_CACHEABILITY(3)
>>> +
>>>  #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
>>>  	[__idx] = { \
>>>  		.control_value = __control_value, \ @@ -416,6 +427,57 @@ static 
>>> const struct drm_i915_mocs_entry pvc_mocs_table[] = {
>>>  	MOCS_ENTRY(2, 0, L3_3_WB),
>>>  };
>>>  
>>> +static const struct drm_i915_mocs_entry mtl_mocs_table[] = {
>>> +	/* Error - Reserved for Non-Use */
>>> +	MOCS_ENTRY(0,
>>> +		   IG_PAT(0),
>>> +		   L3_LKUP(1) | L3_3_WB),
>>> +	/* Cached - L3 + L4 */
>>> +	MOCS_ENTRY(1,
>>> +		   IG_PAT(1),
>>> +		   L3_LKUP(1) | L3_3_WB),
>>> +	/* L4 - GO:L3 */
>>> +	MOCS_ENTRY(2,
>>> +		   IG_PAT(1),
>>> +		   L3_LKUP(1) | L3_1_UC),
>>> +	/* Uncached - GO:L3 */
>>> +	MOCS_ENTRY(3,
>>> +		   IG_PAT(1) | L4_3_UC,
>>> +		   L3_LKUP(1) | L3_1_UC),
>>> +	/* L4 - GO:Mem */
>>> +	MOCS_ENTRY(4,
>>> +		   IG_PAT(1),
>>> +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
>>> +	/* Uncached - GO:Mem */
>>> +	MOCS_ENTRY(5,
>>> +		   IG_PAT(1) | L4_3_UC,
>>> +		   L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC),
>>> +	/* L4 - L3:NoLKUP; GO:L3 */
>>> +	MOCS_ENTRY(6,
>>> +		   IG_PAT(1),
>>> +		   L3_1_UC),
>>> +	/* Uncached - L3:NoLKUP; GO:L3 */
>>> +	MOCS_ENTRY(7,
>>> +		   IG_PAT(1) | L4_3_UC,
>>> +		   L3_1_UC),
>>> +	/* L4 - L3:NoLKUP; GO:Mem */
>>> +	MOCS_ENTRY(8,
>>> +		   IG_PAT(1),
>>> +		   L3_GLBGO(1) | L3_1_UC),
>>> +	/* Uncached - L3:NoLKUP; GO:Mem */
>>> +	MOCS_ENTRY(9,
>>> +		   IG_PAT(1) | L4_3_UC,
>>> +		   L3_GLBGO(1) | L3_1_UC),
>>> +	/* Display - L3; L4:WT */
>>> +	MOCS_ENTRY(14,
>>> +		   IG_PAT(1) | L4_1_WT,
>>> +		   L3_LKUP(1) | L3_3_WB),
>>> +	/* CCS - Non-Displayable */
>>> +	MOCS_ENTRY(15,
>>> +		   IG_PAT(1),
>>> +		   L3_GLBGO(1) | L3_1_UC),
>>> +};
>>> +
>>>  enum {
>>>  	HAS_GLOBAL_MOCS = BIT(0),
>>>  	HAS_ENGINE_MOCS = BIT(1),
>>> @@ -445,7 +507,13 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
>>>  	memset(table, 0, sizeof(struct drm_i915_mocs_table));
>>>  
>>>  	table->unused_entries_index = I915_MOCS_PTE;
>>> -	if (IS_PONTEVECCHIO(i915)) {
>>> +	if (IS_METEORLAKE(i915)) {
>>> +		table->size = ARRAY_SIZE(mtl_mocs_table);
>>> +		table->table = mtl_mocs_table;
>>> +		table->n_entries = MTL_NUM_MOCS_ENTRIES;
>>> +		table->uc_index = 9;
>>> +		table->unused_entries_index = 1;
>>> +	} else if (IS_PONTEVECCHIO(i915)) {
>>>  		table->size = ARRAY_SIZE(pvc_mocs_table);
>>>  		table->table = pvc_mocs_table;
>>>  		table->n_entries = PVC_NUM_MOCS_ENTRIES; @@ -646,9 +714,9 @@ void 
>>> intel_mocs_init_engine(struct intel_engine_cs *engine)
>>>  		init_l3cc_table(engine->gt, &table);  }
>>>  
>>> -static u32 global_mocs_offset(void)
>>> +static u32 global_mocs_offset(struct intel_gt *gt)
>>>  {
>>> -	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
>>> +	return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0)) + 
>>> +gt->uncore->gsi_offset;
>>>  }
>>>  
>>>  void intel_set_mocs_index(struct intel_gt *gt) @@ -671,7 +739,7 @@ 
>>> void intel_mocs_init(struct intel_gt *gt)
>>>  	 */
>>>  	flags = get_mocs_settings(gt->i915, &table);
>>>  	if (flags & HAS_GLOBAL_MOCS)
>>> -		__init_mocs_table(gt->uncore, &table, global_mocs_offset());
>>> +		__init_mocs_table(gt->uncore, &table, global_mocs_offset(gt));
>>>  
>>>  	/*
>>>  	 * Initialize the L3CC table as part of mocs initalization to make 
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c 
>>> b/drivers/gpu/drm/i915/gt/selftest_mocs.c
>>> index ca009a6a13bd..730796346514 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
>>> @@ -137,7 +137,7 @@ static int read_mocs_table(struct i915_request *rq,
>>>  		return 0;
>>>  
>>>  	if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
>>> -		addr = global_mocs_offset();
>>> +		addr = global_mocs_offset(rq->engine->gt);
>>>  	else
>>>  		addr = mocs_offset(rq->engine);
>>>  
>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>>> b/drivers/gpu/drm/i915/i915_pci.c index 621730b6551c..480b128499ae 
>>> 100644
>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>> @@ -1149,6 +1149,7 @@ static const struct intel_device_info mtl_info = {
>>>  	.has_flat_ccs = 0,
>>>  	.has_gmd_id = 1,
>>>  	.has_guc_deprivilege = 1,
>>> +	.has_llc = 0,
>>>  	.has_mslice_steering = 0,
>>>  	.has_snoop = 1,
>>>  	.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
>> 
>> --
>> Jani Nikula, Intel Open Source Graphics Center


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-05 20:26         ` Jordan Justen
@ 2023-04-10  8:23           ` Jordan Justen
  2023-04-13 20:49             ` Yang, Fei
  0 siblings, 1 reply; 35+ messages in thread
From: Jordan Justen @ 2023-04-10  8:23 UTC (permalink / raw)
  To: Yang, Fei, Lionel Landwerlin, Roper, Matthew D, Chris Wilson, intel-gfx
  Cc: Kenneth Graunke, dri-devel

On 2023-04-05 13:26:43, Jordan Justen wrote:
> On 2023-04-05 00:45:24, Lionel Landwerlin wrote:
> > On 04/04/2023 19:04, Yang, Fei wrote:
> > >> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
> > >>
> > >> Just like the protected content uAPI, there is no way for userspace to tell
> > >> this feature is available other than trying using it.
> > >>
> > >> Given the issues with protected content, is it not thing we could want to add?
> > > Sorry I'm not aware of the issues with protected content, could you elaborate?
> > > There was a long discussion on teams uAPI channel, could you comment there if
> > > any concerns?
> > >
> > 
> > We wanted to have a getparam to detect protected support and were told 
> > to detect it by trying to create a context with it.
> > 
> 
> An extensions system where the detection mechanism is "just try it",
> and assume it's not supported if it fails. ??
> 

I guess no one wants to discuss the issues with this so-called
detection mechanism for i915 extensions. (Just try it and if it fails,
it must not be supported.)

I wonder how many ioctls we will be making a couple years down the
road just to see what the kernel supports.

Maybe we'll get more fun 8 second timeouts to deal with. Maybe these
probing ioctls failing or succeeding will alter the kmd's state in
some unexpected way.

It'll also be fun to debug cases where the driver is not starting up
with the noise of a bunch of probing ioctls flying by.

I thought about suggesting at least something like
I915_PARAM_CMD_PARSER_VERSION, but I don't know if that could have
prevented this 8 second timeout for creating a protected content
context. Maybe it's better than nothing though.

Of course, there was also the vague idea I threw out below for getting
a list of supported extentions.

-Jordan

> 
> This seem likely to get more and more problematic as a detection
> mechanism as more extensions are added.
> 
> > 
> > Now it appears trying to create a protected context can block for 
> > several seconds.
> > 
> > Since we have to report capabilities to the user even before it creates 
> > protected contexts, any app is at risk of blocking.
> > 
> 
> This failure path is not causing any re-thinking about using this as
> the extension detection mechanism?
> 
> Doesn't the ioctl# + input-struct-size + u64-extension# identify the
> extension such that the kernel could indicate if it is supported or
> not. (Or, perhaps return an array of the supported extensions so the
> umd doesn't have to potentially make many ioctls for each extension of
> interest.)
> 
> -Jordan

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
  2023-04-10  8:23           ` Jordan Justen
@ 2023-04-13 20:49             ` Yang, Fei
  0 siblings, 0 replies; 35+ messages in thread
From: Yang, Fei @ 2023-04-13 20:49 UTC (permalink / raw)
  To: Justen, Jordan L, Landwerlin, Lionel G, Roper, Matthew D,
	Chris Wilson, intel-gfx
  Cc: Kenneth Graunke, dri-devel

> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation
>
> On 2023-04-05 13:26:43, Jordan Justen wrote:
>> On 2023-04-05 00:45:24, Lionel Landwerlin wrote:
>>> On 04/04/2023 19:04, Yang, Fei wrote:
>>>>> Subject: Re: [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set 
>>>>> cache at BO creation
>>>>>
>>>>> Just like the protected content uAPI, there is no way for 
>>>>> userspace to tell this feature is available other than trying using it.
>>>>>
>>>>> Given the issues with protected content, is it not thing we could want to add?
>>>> Sorry I'm not aware of the issues with protected content, could you elaborate?
>>>> There was a long discussion on teams uAPI channel, could you 
>>>> comment there if any concerns?
>>>>
>>> 
>>> We wanted to have a getparam to detect protected support and were 
>>> told to detect it by trying to create a context with it.
>>> 
>> 
>> An extensions system where the detection mechanism is "just try it", 
>> and assume it's not supported if it fails. ??
>> 
>
> I guess no one wants to discuss the issues with this so-called detection
> mechanism for i915 extensions. (Just try it and if it fails, it must not
 be supported.)
>
> I wonder how many ioctls we will be making a couple years down the road
> just to see what the kernel supports.
>
> Maybe we'll get more fun 8 second timeouts to deal with. Maybe these probing
> ioctls failing or succeeding will alter the kmd's state in some unexpected way.

For this SET_PAT extension, I can assure you there is no 8 second wait :)
This is definitely a non-blocking call.

> It'll also be fun to debug cases where the driver is not starting up with the
> noise of a bunch of probing ioctls flying by.
>
> I thought about suggesting at least something like I915_PARAM_CMD_PARSER_VERSION,
> but I don't know if that could have prevented this 8 second timeout for creating
> a protected content context. Maybe it's better than nothing though.
>
> Of course, there was also the vague idea I threw out below for getting a list of
> supported extentions.

The detection mechanism itself is an uAPI change, I don't think it's a good idea to
combine that with this SET_PAT extension patch.
I suggest we start a discussion in the "i915 uAPI changes" teams channel just like
how we sorted out a solution for this setting cache policy issue. Would that work?

https://teams.microsoft.com/l/channel/19%3af1767bda6734476ba0a9c7d147b928d1%40thread.skype/i915%2520uAPI%2520changes?groupId=379f3ae1-d138-4205-bb65-d4c7d38cb481&tenantId=46c98d88-e344-4ed4-8496-4ed7712e255d

-Fei

> -Jordan
>
>> 
>> This seem likely to get more and more problematic as a detection 
>> mechanism as more extensions are added.
>> 
>> > 
>> > Now it appears trying to create a protected context can block for 
>> > several seconds.
>> > 
>> > Since we have to report capabilities to the user even before it 
>> > creates protected contexts, any app is at risk of blocking.
>> > 
>> 
>> This failure path is not causing any re-thinking about using this as 
>> the extension detection mechanism?
>> 
>> Doesn't the ioctl# + input-struct-size + u64-extension# identify the 
>> extension such that the kernel could indicate if it is supported or 
>> not. (Or, perhaps return an array of the supported extensions so the 
>> umd doesn't have to potentially make many ioctls for each extension of
>> interest.)
>> 
>> -Jordan

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2023-04-13 20:49 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-01  6:38 [Intel-gfx] [PATCH 0/7] drm/i915/mtl: Define MOCS and PAT tables for MTL fei.yang
2023-04-01  6:38 ` [Intel-gfx] [PATCH 1/7] " fei.yang
2023-04-03 12:50   ` Jani Nikula
2023-04-06  8:16     ` Andi Shyti
2023-04-06 18:22       ` Yang, Fei
2023-04-06  8:28   ` Das, Nirmoy
2023-04-06 14:55     ` Yang, Fei
2023-04-06 18:13       ` Das, Nirmoy
2023-04-01  6:38 ` [Intel-gfx] [PATCH 2/7] drm/i915/mtl: workaround coherency issue for Media fei.yang
2023-04-01  6:38 ` [Intel-gfx] [PATCH 3/7] drm/i915/mtl: end support for set caching ioctl fei.yang
2023-04-01  6:38 ` [Intel-gfx] [PATCH 4/7] drm/i915: preparation for using PAT index fei.yang
2023-04-01  6:38 ` [Intel-gfx] [PATCH 5/7] drm/i915: use pat_index instead of cache_level fei.yang
2023-04-03 14:50   ` Ville Syrjälä
2023-04-03 16:57     ` Yang, Fei
2023-04-03 17:14       ` Ville Syrjälä
2023-04-03 19:39         ` Yang, Fei
2023-04-03 19:52           ` Ville Syrjälä
2023-04-06  6:28             ` Yang, Fei
2023-04-01  6:38 ` [Intel-gfx] [PATCH 6/7] drm/i915: make sure correct pte encode is used fei.yang
2023-04-01  6:38 ` [Intel-gfx] [PATCH 7/7] drm/i915: Allow user to set cache at BO creation fei.yang
2023-04-03 16:02   ` Ville Syrjälä
2023-04-03 16:35     ` Matt Roper
2023-04-03 16:48       ` Ville Syrjälä
2023-04-04 22:15         ` Kenneth Graunke
2023-04-04  7:29   ` Lionel Landwerlin
2023-04-04 16:04     ` Yang, Fei
2023-04-05  7:45       ` Lionel Landwerlin
2023-04-05 20:26         ` Jordan Justen
2023-04-10  8:23           ` Jordan Justen
2023-04-13 20:49             ` Yang, Fei
2023-04-05 23:06         ` Yang, Fei
2023-04-06  9:11   ` Matthew Auld
2023-04-01  7:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/mtl: Define MOCS and PAT tables for MTL Patchwork
2023-04-01  7:03 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-04-01  7:20 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).