intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms
@ 2021-07-29 16:59 Matt Roper
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options Matt Roper
                   ` (18 more replies)
  0 siblings, 19 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi

This series provides some of the initial enablement patches for two
upcoming discrete GPUs:
 * XeHP SDV:  Xe_HP (version 12.50) graphics IP, no display IP
 * DG2:  Xe_HPG (version 12.55) graphics IP, Xe_LPD (version 13) display IP

Both platforms will need additional enablement patches beyond what's
present in this series before they're truly usable, including various
LMEM and GuC work that's already happening separately.  The new
features/functionality that these platforms bring (such as multi-tile
support, dedicated compute engines, etc.) may be referenced in passing
in some of these patches but will be fully enabled in future series.

v2:
 - General rebase and incorporation of r-b's.
 - Re-order intel_gt_info and intel_device_info structures to eliminate
   some unnecessary padding after the size change of
   intel_engine_mask_t.  (Tvrtko)
 - Use 'intel_step' mechanisms for revid->stepping mapping.  (Jani)
 - Drop the DSC patches for now; they need some rework.  (Jani)

v3:
 - About 20 of the patches have landed upstream now.  Rebase and resend
   the rest.  Some of these are already reviewed, but have dependencies
   on other unreviewed patches (e.g., the new engine definitions, the
   initial SNPS PHY support, etc.).

v4:
 - Several more patches have landed upstream; rebase and re-send the
   rest.  Some of the remaining patches are reviewed but still have
   dependencies on non-reviewed patches, so the order is shuffled this
   time to group patches by dependency rather than by xehp vs xehpsdv vs
   dg2.
 - Minor cleanup to "drm/i915/xehp: handle new steering options"
   suggested by Caz.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>

Akeem G Abodunrin (1):
  drm/i915/dg2: Add new LRI reg offsets

Ankit Nautiyal (1):
  drm/i915/dg2: Configure PCON in DP pre-enable path

Daniele Ceraolo Spurio (1):
  drm/i915/xehp: handle new steering options

Lucas De Marchi (2):
  drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

Matt Roper (11):
  drm/i915/xehpsdv: Define steering tables
  drm/i915/dg2: Add forcewake table
  drm/i915/dg2: Update LNCF steering ranges
  drm/i915/dg2: Add SQIDI steering
  drm/i915/xehp: Loop over all gslices for INSTDONE processing
  drm/i915/dg2: Report INSTDONE_GEOM values in error state
  drm/i915/xehpsdv: Add maximum sseu limits
  drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV
  drm/i915/dg2: Define MOCS table for DG2
  drm/i915/xehpsdv: Read correct RP_STATE_CAP register
  drm/i915/dg2: Maintain backward-compatible nested batch behavior

Matthew Auld (1):
  drm/i915/xehp: Changes to ss/eu definitions

Stuart Summers (1):
  drm/i915/xehpsdv: Add compute DSS type

 drivers/gpu/drm/i915/display/intel_ddi.c     |   3 +
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c      |   8 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  55 ++--
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  15 +-
 drivers/gpu/drm/i915/gt/intel_gt.c           |  68 ++++-
 drivers/gpu/drm/i915/gt/intel_gt_types.h     |   7 +
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  85 +++++-
 drivers/gpu/drm/i915/gt/intel_mocs.c         |  66 +++-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c  |   1 +
 drivers/gpu/drm/i915/gt/intel_rps.c          |  19 +-
 drivers/gpu/drm/i915/gt/intel_rps.h          |   1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c         | 116 +++++--
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  20 +-
 drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  | 155 +++++++++-
 drivers/gpu/drm/i915/i915_debugfs.c          |   8 +-
 drivers/gpu/drm/i915/i915_drv.h              |   3 +
 drivers/gpu/drm/i915/i915_getparam.c         |   6 +-
 drivers/gpu/drm/i915/i915_gpu_error.c        |  36 ++-
 drivers/gpu/drm/i915/i915_pci.c              |   1 +
 drivers/gpu/drm/i915/i915_reg.h              |  15 +-
 drivers/gpu/drm/i915/intel_device_info.h     |   1 +
 drivers/gpu/drm/i915/intel_uncore.c          | 305 ++++++++++---------
 include/uapi/drm/i915_drm.h                  |   3 -
 24 files changed, 771 insertions(+), 228 deletions(-)

-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04 19:53   ` Lucas De Marchi
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 02/18] drm/i915/xehpsdv: Define steering tables Matt Roper
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Xe_HP is more modular than its predecessors and as a consequence it has
more types of replicated registers.  As with l3bank regions on previous
platforms, we may need to explicitly re-steer accesses to these new
types of ranges at runtime if we can't find a single default steering
value that satisfies the fusing of all types.

v2:
 - Add a local 'i915' variable to reduce gt->i915 usage.  (Caz)
 - Drop unused 'intel_gt_read_register' prototype.  (Caz)

Bspec: 66534
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Caz Yokoyama <caz.yokoyama@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c          | 42 +++++++++-
 drivers/gpu/drm/i915/gt/intel_gt_types.h    |  7 ++
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |  1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c        | 18 +++++
 drivers/gpu/drm/i915/gt/intel_sseu.h        |  6 ++
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 89 +++++++++++++++++++--
 drivers/gpu/drm/i915/i915_drv.h             |  3 +
 drivers/gpu/drm/i915/i915_pci.c             |  1 +
 drivers/gpu/drm/i915/i915_reg.h             |  4 +
 drivers/gpu/drm/i915/intel_device_info.h    |  1 +
 10 files changed, 166 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index a64aa43f7cd9..39b224600f38 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -89,18 +89,40 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
 	{},
 };
 
+static u16 slicemask(struct intel_gt *gt, int count)
+{
+	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
+
+	return intel_slicemask_from_dssmask(dss_mask, count);
+}
+
 int intel_gt_init_mmio(struct intel_gt *gt)
 {
+	struct drm_i915_private *i915 = gt->i915;
+
 	intel_gt_init_clock_frequency(gt);
 
 	intel_uc_init_mmio(&gt->uc);
 	intel_sseu_info_init(gt);
 
-	if (GRAPHICS_VER(gt->i915) >= 11) {
+	/*
+	 * An mslice is unavailable only if both the meml3 for the slice is
+	 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
+	 */
+	if (HAS_MSLICES(i915))
+		gt->info.mslice_mask =
+			slicemask(gt, GEN_DSS_PER_MSLICE) |
+			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
+			 GEN12_MEML3_EN_MASK);
+
+	if (GRAPHICS_VER(i915) >= 11 &&
+		   GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
 		gt->steering_table[L3BANK] = icl_l3bank_steering_table;
 		gt->info.l3bank_mask =
 			~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
 			GEN10_L3BANK_MASK;
+	} else if (HAS_MSLICES(i915)) {
+		MISSING_CASE(INTEL_INFO(i915)->platform);
 	}
 
 	return intel_engines_init_mmio(gt);
@@ -787,6 +809,24 @@ static void intel_gt_get_valid_steering(struct intel_gt *gt,
 		*sliceid = 0;		/* unused */
 		*subsliceid = __ffs(gt->info.l3bank_mask);
 		break;
+	case MSLICE:
+		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
+
+		*sliceid = __ffs(gt->info.mslice_mask);
+		*subsliceid = 0;	/* unused */
+		break;
+	case LNCF:
+		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
+
+		/*
+		 * 0xFDC[29:28] selects the mslice to steer to and 0xFDC[27]
+		 * selects which LNCF within the mslice to steer to.  An LNCF
+		 * is always present if its mslice is present, so we can safely
+		 * just steer to LNCF 0 in all cases.
+		 */
+		*sliceid = __ffs(gt->info.mslice_mask) << 1;
+		*subsliceid = 0;	/* unused */
+		break;
 	default:
 		MISSING_CASE(type);
 		*sliceid = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 97a5075288d2..a81e21bf1bd1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -47,9 +47,14 @@ struct intel_mmio_range {
  * of multicast registers.  If another type of steering does not have any
  * overlap in valid steering targets with 'subslice' style registers, we will
  * need to explicitly re-steer reads of registers of the other type.
+ *
+ * Only the replication types that may need additional non-default steering
+ * are listed here.
  */
 enum intel_steering_type {
 	L3BANK,
+	MSLICE,
+	LNCF,
 
 	NUM_STEERING_TYPES
 };
@@ -184,6 +189,8 @@ struct intel_gt {
 
 		/* Slice/subslice/EU info */
 		struct sseu_dev_info sseu;
+
+		unsigned long mslice_mask;
 	} info;
 };
 
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index e3a2a2fa5f94..a74b72f50cc9 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -10,6 +10,7 @@
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "gem/i915_gem_ttm.h"
+#include "gt/intel_gt.h"
 
 static int init_fake_lmem_bar(struct intel_memory_region *mem)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 367fd44b81c8..bbed8e8625e1 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -759,3 +759,21 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
 		}
 	}
 }
+
+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
+{
+	u16 slice_mask = 0;
+	int i;
+
+	WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask));
+
+	for (i = 0; dss_mask; i++) {
+		if (dss_mask & GENMASK(dss_per_slice - 1, 0))
+			slice_mask |= BIT(i);
+
+		dss_mask >>= dss_per_slice;
+	}
+
+	return slice_mask;
+}
+
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 4cd1a8a7298a..1073471d1980 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -22,6 +22,10 @@ struct drm_printer;
 #define GEN_MAX_EUS		(16) /* TGL upper bound */
 #define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
 
+#define GEN_DSS_PER_GSLICE	4
+#define GEN_DSS_PER_CSLICE	8
+#define GEN_DSS_PER_MSLICE	8
+
 struct sseu_dev_info {
 	u8 slice_mask;
 	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
@@ -104,4 +108,6 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p);
 void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
 			       struct drm_printer *p);
 
+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
+
 #endif /* __INTEL_SSEU_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9173df59821a..f2ceabb0e2ea 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -889,12 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
+			 unsigned slice, unsigned subslice)
+{
+	u32 mcr, mcr_mask;
+
+	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
+	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
+
+	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
+
+	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
+}
+
 static void
 icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
 	const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
 	unsigned int slice, subslice;
-	u32 mcr, mcr_mask;
 
 	GEM_BUG_ON(GRAPHICS_VER(i915) < 11);
 	GEM_BUG_ON(hweight8(sseu->slice_mask) > 1);
@@ -919,12 +931,79 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 	if (i915->gt.info.l3bank_mask & BIT(subslice))
 		i915->gt.steering_table[L3BANK] = NULL;
 
-	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
-	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
+	__add_mcr_wa(i915, wal, slice, subslice);
+}
 
-	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
+__maybe_unused
+static void
+xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+	struct drm_i915_private *i915 = gt->i915;
+	const struct sseu_dev_info *sseu = &gt->info.sseu;
+	unsigned long slice, subslice = 0, slice_mask = 0;
+	u64 dss_mask = 0;
+	u32 lncf_mask = 0;
+	int i;
 
-	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
+	/*
+	 * On Xe_HP the steering increases in complexity. There are now several
+	 * more units that require steering and we're not guaranteed to be able
+	 * to find a common setting for all of them. These are:
+	 * - GSLICE (fusable)
+	 * - DSS (sub-unit within gslice; fusable)
+	 * - L3 Bank (fusable)
+	 * - MSLICE (fusable)
+	 * - LNCF (sub-unit within mslice; always present if mslice is present)
+	 * - SQIDI (always on)
+	 *
+	 * We'll do our default/implicit steering based on GSLICE (in the
+	 * sliceid field) and DSS (in the subsliceid field).  If we can
+	 * find overlap between the valid MSLICE and/or LNCF values with
+	 * a suitable GSLICE, then we can just re-use the default value and
+	 * skip and explicit steering at runtime.
+	 *
+	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
+	 * a valid sliceid value.  DSS steering is the only type of steering
+	 * that utilizes the 'subsliceid' bits.
+	 *
+	 * Also note that, even though the steering domain is called "GSlice"
+	 * and it is encoded in the register using the gslice format, the spec
+	 * says that the combined (geometry | compute) fuse should be used to
+	 * select the steering.
+	 */
+
+	/* Find the potential gslice candidates */
+	dss_mask = intel_sseu_get_subslices(sseu, 0);
+	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
+
+	/*
+	 * Find the potential LNCF candidates.  Either LNCF within a valid
+	 * mslice is fine.
+	 */
+	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
+		lncf_mask |= (0x3 << (i * 2));
+
+	/*
+	 * Are there any sliceid values that work for both GSLICE and LNCF
+	 * steering?
+	 */
+	if (slice_mask & lncf_mask) {
+		slice_mask &= lncf_mask;
+		gt->steering_table[LNCF] = NULL;
+	}
+
+	/* How about sliceid values that also work for MSLICE steering? */
+	if (slice_mask & gt->info.mslice_mask) {
+		slice_mask &= gt->info.mslice_mask;
+		gt->steering_table[MSLICE] = NULL;
+	}
+
+	slice = __ffs(slice_mask);
+	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
+	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
+	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
+
+	__add_mcr_wa(i915, wal, slice, subslice);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 70bef688c6da..31b8bb43c581 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1694,6 +1694,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm)
 #define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc)
 
+#define HAS_MSLICES(dev_priv) \
+	(INTEL_INFO(dev_priv)->has_mslices)
+
 #define HAS_IPC(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_ipc)
 
 #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 08651ca03478..ed9edda727d2 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1007,6 +1007,7 @@ static const struct intel_device_info adl_p_info = {
 	.has_llc = 1, \
 	.has_logical_ring_contexts = 1, \
 	.has_logical_ring_elsq = 1, \
+	.has_mslices = 1, \
 	.has_rc6 = 1, \
 	.has_reset_engine = 1, \
 	.has_rps = 1, \
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1a164b90e9fc..f4113e7e8271 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2766,6 +2766,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN11_MCR_SLICE_MASK		GEN11_MCR_SLICE(0xf)
 #define   GEN11_MCR_SUBSLICE(subslice)	(((subslice) & 0x7) << 24)
 #define   GEN11_MCR_SUBSLICE_MASK	GEN11_MCR_SUBSLICE(0x7)
+#define   GEN11_MCR_MULTICAST		REG_BIT(31)
 #define RING_IPEIR(base)	_MMIO((base) + 0x64)
 #define RING_IPEHR(base)	_MMIO((base) + 0x68)
 #define RING_EIR(base)		_MMIO((base) + 0xb0)
@@ -3184,6 +3185,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define	GEN10_MIRROR_FUSE3		_MMIO(0x9118)
 #define GEN10_L3BANK_PAIR_COUNT     4
 #define GEN10_L3BANK_MASK   0x0F
+/* on Xe_HP the same fuses indicates mslices instead of L3 banks */
+#define GEN12_MAX_MSLICES 4
+#define GEN12_MEML3_EN_MASK 0x0F
 
 #define GEN8_EU_DISABLE0		_MMIO(0x9134)
 #define   GEN8_EU_DIS0_S0_MASK		0xffffff
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 121d6d9afd3a..2177372f9ac3 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -133,6 +133,7 @@ enum intel_ppgtt_type {
 	func(has_llc); \
 	func(has_logical_ring_contexts); \
 	func(has_logical_ring_elsq); \
+	func(has_mslices); \
 	func(has_pooled_eu); \
 	func(has_rc6); \
 	func(has_rc6p); \
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 02/18] drm/i915/xehpsdv: Define steering tables
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04 20:13   ` Lucas De Marchi
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 03/18] drm/i915/dg2: Add forcewake table Matt Roper
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

Define and initialize the MMIO ranges for which XeHP SDV requires MSLICE
and LNCF steering.

Bspec: 66534
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c          | 19 ++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 11 +++++++++--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 39b224600f38..8e13bfc81634 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -89,6 +89,20 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
 	{},
 };
 
+static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
+	{ 0x004000, 0x004AFF },
+	{ 0x00C800, 0x00CFFF },
+	{ 0x00DD00, 0x00DDFF },
+	{ 0x00E900, 0x00FFFF }, /* 0xEA00 - OxEFFF is unused */
+	{},
+};
+
+static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
+	{ 0x00B000, 0x00B0FF },
+	{ 0x00D800, 0x00D8FF },
+	{},
+};
+
 static u16 slicemask(struct intel_gt *gt, int count)
 {
 	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
@@ -115,7 +129,10 @@ int intel_gt_init_mmio(struct intel_gt *gt)
 			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
 			 GEN12_MEML3_EN_MASK);
 
-	if (GRAPHICS_VER(i915) >= 11 &&
+	if (IS_XEHPSDV(i915)) {
+		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
+	} else if (GRAPHICS_VER(i915) >= 11 &&
 		   GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
 		gt->steering_table[L3BANK] = icl_l3bank_steering_table;
 		gt->info.l3bank_mask =
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index f2ceabb0e2ea..8717337a6c81 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -934,7 +934,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 	__add_mcr_wa(i915, wal, slice, subslice);
 }
 
-__maybe_unused
 static void
 xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
 {
@@ -1136,10 +1135,18 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 			    VSUNIT_CLKGATE_DIS_TGL);
 }
 
+static void
+xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
+{
+	xehp_init_mcr(&i915->gt, wal);
+}
+
 static void
 gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	if (IS_DG1(i915))
+	if (IS_XEHPSDV(i915))
+		xehpsdv_gt_workarounds_init(i915, wal);
+	else if (IS_DG1(i915))
 		dg1_gt_workarounds_init(i915, wal);
 	else if (IS_TIGERLAKE(i915))
 		tgl_gt_workarounds_init(i915, wal);
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 03/18] drm/i915/dg2: Add forcewake table
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options Matt Roper
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 02/18] drm/i915/xehpsdv: Define steering tables Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04  0:15   ` Souza, Jose
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 04/18] drm/i915/dg2: Update LNCF steering ranges Matt Roper
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

The DG2 forcewake table is very similar to the one used by XeHP SDV (and
both platforms are even presented as a single table in the bspec).  For
the most part DG2 starts using a few additional ranges that were
'reserved' on XeHP SDV and stops using some others.  However there is a
single range (0xd800-0xd87f) that needs to be handled differently
between the two platforms (it needs GT wake on XeHP SDV, but render wake
on DG2) so unless we want to wake both domains (which could waste power)
or define new types of forcewake domains for this special case we need
to have separate tables for the two platforms.  Let's define the ranges
for both platforms with a parameterized macro so that we don't actually
need to duplicate everything in the code.

It should be fine for DG2 to re-use the Xe_HP shadow register list so we
can continue to use the 'xehpsdv' MMIO write functions and don't need to
spin up a separate DG2 instance.

Bspec: 66534
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 305 +++++++++++++++-------------
 1 file changed, 168 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8cf53f54559d..6b38bc2811c1 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1317,143 +1317,170 @@ static const struct intel_forcewake_range __gen12_fw_ranges[] = {
 		0x1d3f00 - 0x1d3fff: VD2 */
 };
 
-/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
-static const struct intel_forcewake_range __xehp_fw_ranges[] = {
-	GEN_FW_RANGE(0x0, 0x1fff, 0), /*
-		  0x0 -  0xaff: reserved
-		0xb00 - 0x1fff: always on */
-	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT),
-	GEN_FW_RANGE(0x4b00, 0x51ff, 0), /*
-		0x4b00 - 0x4fff: reserved
-		0x5000 - 0x51ff: always on */
-	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT),
-	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x8160, 0x81ff, 0), /*
-		0x8160 - 0x817f: reserved
-		0x8180 - 0x81ff: always on */
-	GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT),
-	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x8500, 0x94cf, FORCEWAKE_GT), /*
-		0x8500 - 0x87ff: gt
-		0x8800 - 0x8fff: reserved
-		0x9000 - 0x947f: gt
-		0x9480 - 0x94cf: reserved */
-	GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0x9560, 0x97ff, 0), /*
-		0x9560 - 0x95ff: always on
-		0x9600 - 0x97ff: reserved */
-	GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /*
-		0x9800 - 0xb4ff: gt
-		0xb500 - 0xbfff: reserved
-		0xc000 - 0xcfff: gt */
-	GEN_FW_RANGE(0xd000, 0xd7ff, 0),
-	GEN_FW_RANGE(0xd800, 0xdbff, FORCEWAKE_GT),
-	GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER),
-	GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /*
-		0xdd00 - 0xddff: gt
-		0xde00 - 0xde7f: reserved */
-	GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /*
-		0xde80 - 0xdfff: render
-		0xe000 - 0xe0ff: reserved
-		0xe100 - 0xe8ff: render */
-	GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /*
-		0xe900 - 0xe9ff: gt
-		0xea00 - 0xefff: reserved
-		0xf000 - 0xffff: gt */
-	GEN_FW_RANGE(0x10000, 0x13fff, 0), /*
-		0x10000 - 0x11fff: reserved
-		0x12000 - 0x127ff: always on
-		0x12800 - 0x13fff: reserved */
-	GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0),
-	GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2),
-	GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4),
-	GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6),
-	GEN_FW_RANGE(0x14800, 0x1ffff, FORCEWAKE_RENDER), /*
-		0x14800 - 0x14fff: render
-		0x15000 - 0x16dff: reserved
-		0x16e00 - 0x1ffff: render */
-	GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /*
-		0x20000 - 0x20fff: VD0
-		0x21000 - 0x21fff: reserved */
-	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT),
-	GEN_FW_RANGE(0x24000, 0x2417f, 0), /*
-		0x24000 - 0x2407f: always on
-		0x24080 - 0x2417f: reserved */
-	GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /*
-		0x24180 - 0x241ff: gt
-		0x24200 - 0x249ff: reserved */
-	GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /*
-		0x24a00 - 0x24a7f: render
-		0x24a80 - 0x251ff: reserved */
-	GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /*
-		0x25200 - 0x252ff: gt
-		0x25300 - 0x25fff: reserved */
-	GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /*
-		0x26000 - 0x27fff: render
-		0x28000 - 0x29fff: reserved
-		0x2a000 - 0x2ffff: undocumented */
-	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT),
-	GEN_FW_RANGE(0x40000, 0x1bffff, 0),
-	GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /*
-		0x1c0000 - 0x1c2bff: VD0
-		0x1c2c00 - 0x1c2cff: reserved
-		0x1c2d00 - 0x1c2dff: VD0
-		0x1c2e00 - 0x1c3eff: reserved
-		0x1c3f00 - 0x1c3fff: VD0 */
-	GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /*
-		0x1c4000 - 0x1c6bff: VD1
-		0x1c6c00 - 0x1c6cff: reserved
-		0x1c6d00 - 0x1c6dff: VD1
-		0x1c6e00 - 0x1c7fff: reserved */
-	GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /*
-		0x1c8000 - 0x1ca0ff: VE0
-		0x1ca100 - 0x1cbfff: reserved */
-	GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0),
-	GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2),
-	GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4),
-	GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6),
-	GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /*
-		0x1d0000 - 0x1d2bff: VD2
-		0x1d2c00 - 0x1d2cff: reserved
-		0x1d2d00 - 0x1d2dff: VD2
-		0x1d2e00 - 0x1d3eff: reserved
-		0x1d3f00 - 0x1d3fff: VD2 */
-	GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /*
-		0x1d4000 - 0x1d6bff: VD3
-		0x1d6c00 - 0x1d6cff: reserved
-		0x1d6d00 - 0x1d6dff: VD3
-		0x1d6e00 - 0x1d7fff: reserved */
-	GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /*
-		0x1d8000 - 0x1da0ff: VE1
-		0x1da100 - 0x1dffff: reserved */
-	GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /*
-		0x1e0000 - 0x1e2bff: VD4
-		0x1e2c00 - 0x1e2cff: reserved
-		0x1e2d00 - 0x1e2dff: VD4
-		0x1e2e00 - 0x1e3eff: reserved
-		0x1e3f00 - 0x1e3fff: VD4 */
-	GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /*
-		0x1e4000 - 0x1e6bff: VD5
-		0x1e6c00 - 0x1e6cff: reserved
-		0x1e6d00 - 0x1e6dff: VD5
-		0x1e6e00 - 0x1e7fff: reserved */
-	GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /*
-		0x1e8000 - 0x1ea0ff: VE2
-		0x1ea100 - 0x1effff: reserved */
-	GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /*
-		0x1f0000 - 0x1f2bff: VD6
-		0x1f2c00 - 0x1f2cff: reserved
-		0x1f2d00 - 0x1f2dff: VD6
-		0x1f2e00 - 0x1f3eff: reserved
-		0x1f3f00 - 0x1f3fff: VD6 */
-	GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /*
-		0x1f4000 - 0x1f6bff: VD7
-		0x1f6c00 - 0x1f6cff: reserved
-		0x1f6d00 - 0x1f6dff: VD7
-		0x1f6e00 - 0x1f7fff: reserved */
+/*
+ * Graphics IP version 12.55 brings a slight change to the 0xd800 range,
+ * switching it from the GT domain to the render domain.
+ *
+ * *Must* be sorted by offset ranges! See intel_fw_table_check().
+ */
+#define XEHP_FWRANGES(FW_RANGE_D800)					\
+	GEN_FW_RANGE(0x0, 0x1fff, 0), /*					\
+		  0x0 -  0xaff: reserved					\
+		0xb00 - 0x1fff: always on */					\
+	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),				\
+	GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT),				\
+	GEN_FW_RANGE(0x4b00, 0x51ff, 0), /*					\
+		0x4b00 - 0x4fff: reserved					\
+		0x5000 - 0x51ff: always on */					\
+	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),				\
+	GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT),				\
+	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),				\
+	GEN_FW_RANGE(0x8160, 0x81ff, 0), /*					\
+		0x8160 - 0x817f: reserved					\
+		0x8180 - 0x81ff: always on */					\
+	GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT),				\
+	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),				\
+	GEN_FW_RANGE(0x8500, 0x8cff, FORCEWAKE_GT), /*				\
+		0x8500 - 0x87ff: gt						\
+		0x8800 - 0x8c7f: reserved					\
+		0x8c80 - 0x8cff: gt (DG2 only) */				\
+	GEN_FW_RANGE(0x8d00, 0x8fff, FORCEWAKE_RENDER), /*			\
+		0x8d00 - 0x8dff: render (DG2 only)				\
+		0x8e00 - 0x8fff: reserved */					\
+	GEN_FW_RANGE(0x9000, 0x94cf, FORCEWAKE_GT), /*				\
+		0x9000 - 0x947f: gt						\
+		0x9480 - 0x94cf: reserved */					\
+	GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER),				\
+	GEN_FW_RANGE(0x9560, 0x967f, 0), /*					\
+		0x9560 - 0x95ff: always on					\
+		0x9600 - 0x967f: reserved */					\
+	GEN_FW_RANGE(0x9680, 0x97ff, FORCEWAKE_RENDER), /*			\
+		0x9680 - 0x96ff: render (DG2 only)				\
+		0x9700 - 0x97ff: reserved */					\
+	GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /*				\
+		0x9800 - 0xb4ff: gt						\
+		0xb500 - 0xbfff: reserved					\
+		0xc000 - 0xcfff: gt */						\
+	GEN_FW_RANGE(0xd000, 0xd7ff, 0),					\
+	GEN_FW_RANGE(0xd800, 0xd87f, FW_RANGE_D800),			\
+	GEN_FW_RANGE(0xd880, 0xdbff, FORCEWAKE_GT),				\
+	GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER),				\
+	GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /*				\
+		0xdd00 - 0xddff: gt						\
+		0xde00 - 0xde7f: reserved */					\
+	GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /*			\
+		0xde80 - 0xdfff: render						\
+		0xe000 - 0xe0ff: reserved					\
+		0xe100 - 0xe8ff: render */					\
+	GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /*				\
+		0xe900 - 0xe9ff: gt						\
+		0xea00 - 0xefff: reserved					\
+		0xf000 - 0xffff: gt */						\
+	GEN_FW_RANGE(0x10000, 0x12fff, 0), /*					\
+		0x10000 - 0x11fff: reserved					\
+		0x12000 - 0x127ff: always on					\
+		0x12800 - 0x12fff: reserved */					\
+	GEN_FW_RANGE(0x13000, 0x131ff, FORCEWAKE_MEDIA_VDBOX0), /* DG2 only */	\
+	GEN_FW_RANGE(0x13200, 0x13fff, FORCEWAKE_MEDIA_VDBOX2), /*		\
+		0x13200 - 0x133ff: VD2 (DG2 only)				\
+		0x13400 - 0x13fff: reserved */					\
+	GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0), /* XEHPSDV only */	\
+	GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2), /* XEHPSDV only */	\
+	GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4), /* XEHPSDV only */	\
+	GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6), /* XEHPSDV only */	\
+	GEN_FW_RANGE(0x14800, 0x14fff, FORCEWAKE_RENDER),			\
+	GEN_FW_RANGE(0x15000, 0x16dff, FORCEWAKE_GT), /*			\
+		0x15000 - 0x15fff: gt (DG2 only)				\
+		0x16000 - 0x16dff: reserved */					\
+	GEN_FW_RANGE(0x16e00, 0x1ffff, FORCEWAKE_RENDER),			\
+	GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /*		\
+		0x20000 - 0x20fff: VD0 (XEHPSDV only)				\
+		0x21000 - 0x21fff: reserved */					\
+	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT),				\
+	GEN_FW_RANGE(0x24000, 0x2417f, 0), /*					\
+		0x24000 - 0x2407f: always on					\
+		0x24080 - 0x2417f: reserved */					\
+	GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /*			\
+		0x24180 - 0x241ff: gt						\
+		0x24200 - 0x249ff: reserved */					\
+	GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /*			\
+		0x24a00 - 0x24a7f: render					\
+		0x24a80 - 0x251ff: reserved */					\
+	GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /*			\
+		0x25200 - 0x252ff: gt						\
+		0x25300 - 0x25fff: reserved */					\
+	GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /*			\
+		0x26000 - 0x27fff: render					\
+		0x28000 - 0x29fff: reserved					\
+		0x2a000 - 0x2ffff: undocumented */				\
+	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT),				\
+	GEN_FW_RANGE(0x40000, 0x1bffff, 0),					\
+	GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /*		\
+		0x1c0000 - 0x1c2bff: VD0					\
+		0x1c2c00 - 0x1c2cff: reserved					\
+		0x1c2d00 - 0x1c2dff: VD0					\
+		0x1c2e00 - 0x1c3eff: VD0 (DG2 only)				\
+		0x1c3f00 - 0x1c3fff: VD0 */					\
+	GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /*		\
+		0x1c4000 - 0x1c6bff: VD1					\
+		0x1c6c00 - 0x1c6cff: reserved					\
+		0x1c6d00 - 0x1c6dff: VD1					\
+		0x1c6e00 - 0x1c7fff: reserved */				\
+	GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /*		\
+		0x1c8000 - 0x1ca0ff: VE0					\
+		0x1ca100 - 0x1cbfff: reserved */				\
+	GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0),		\
+	GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2),		\
+	GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4),		\
+	GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6),		\
+	GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /*		\
+		0x1d0000 - 0x1d2bff: VD2					\
+		0x1d2c00 - 0x1d2cff: reserved					\
+		0x1d2d00 - 0x1d2dff: VD2					\
+		0x1d2e00 - 0x1d3dff: VD2 (DG2 only)				\
+		0x1d3e00 - 0x1d3eff: reserved					\
+		0x1d3f00 - 0x1d3fff: VD2 */					\
+	GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /*		\
+		0x1d4000 - 0x1d6bff: VD3					\
+		0x1d6c00 - 0x1d6cff: reserved					\
+		0x1d6d00 - 0x1d6dff: VD3					\
+		0x1d6e00 - 0x1d7fff: reserved */				\
+	GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /*		\
+		0x1d8000 - 0x1da0ff: VE1					\
+		0x1da100 - 0x1dffff: reserved */				\
+	GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /*		\
+		0x1e0000 - 0x1e2bff: VD4					\
+		0x1e2c00 - 0x1e2cff: reserved					\
+		0x1e2d00 - 0x1e2dff: VD4					\
+		0x1e2e00 - 0x1e3eff: reserved					\
+		0x1e3f00 - 0x1e3fff: VD4 */					\
+	GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /*		\
+		0x1e4000 - 0x1e6bff: VD5					\
+		0x1e6c00 - 0x1e6cff: reserved					\
+		0x1e6d00 - 0x1e6dff: VD5					\
+		0x1e6e00 - 0x1e7fff: reserved */				\
+	GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /*		\
+		0x1e8000 - 0x1ea0ff: VE2					\
+		0x1ea100 - 0x1effff: reserved */				\
+	GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /*		\
+		0x1f0000 - 0x1f2bff: VD6					\
+		0x1f2c00 - 0x1f2cff: reserved					\
+		0x1f2d00 - 0x1f2dff: VD6					\
+		0x1f2e00 - 0x1f3eff: reserved					\
+		0x1f3f00 - 0x1f3fff: VD6 */					\
+	GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /*		\
+		0x1f4000 - 0x1f6bff: VD7					\
+		0x1f6c00 - 0x1f6cff: reserved					\
+		0x1f6d00 - 0x1f6dff: VD7					\
+		0x1f6e00 - 0x1f7fff: reserved */				\
 	GEN_FW_RANGE(0x1f8000, 0x1fa0ff, FORCEWAKE_MEDIA_VEBOX3),
+
+static const struct intel_forcewake_range __xehp_fw_ranges[] = {
+	XEHP_FWRANGES(FORCEWAKE_GT)
+};
+
+static const struct intel_forcewake_range __dg2_fw_ranges[] = {
+	XEHP_FWRANGES(FORCEWAKE_RENDER)
 };
 
 static void
@@ -2084,7 +2111,11 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
 		return ret;
 	forcewake_early_sanitize(uncore, 0);
 
-	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
+		ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
+		ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable);
+		ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+	} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
 		ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
 		ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable);
 		ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 04/18] drm/i915/dg2: Update LNCF steering ranges
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (2 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 03/18] drm/i915/dg2: Add forcewake table Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04 20:16   ` Lucas De Marchi
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering Matt Roper
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

DG2's replicated register ranges are almost the same at XeHP SDV with
the exception of one LNCF sub-range that switches to gslice steering.
We can re-use the XeHP SDV mslice steering table and just provide a
DG2-specific LNCF steering table.

Bspec: 66534
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 8e13bfc81634..1971e34da254 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -103,6 +103,12 @@ static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
 	{},
 };
 
+static const struct intel_mmio_range dg2_lncf_steering_table[] = {
+	{ 0x00B000, 0x00B0FF },
+	{ 0x00D880, 0x00D8FF },
+	{},
+};
+
 static u16 slicemask(struct intel_gt *gt, int count)
 {
 	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
@@ -129,7 +135,10 @@ int intel_gt_init_mmio(struct intel_gt *gt)
 			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
 			 GEN12_MEML3_EN_MASK);
 
-	if (IS_XEHPSDV(i915)) {
+	if (IS_DG2(i915)) {
+		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+		gt->steering_table[LNCF] = dg2_lncf_steering_table;
+	} else if (IS_XEHPSDV(i915)) {
 		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
 		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
 	} else if (GRAPHICS_VER(i915) >= 11 &&
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (3 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 04/18] drm/i915/dg2: Update LNCF steering ranges Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04 20:22   ` Lucas De Marchi
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 06/18] drm/i915/xehp: Loop over all gslices for INSTDONE processing Matt Roper
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

Although DG2_G10 platforms will always have all SQIDI's present and
don't need steering for registers in a SQIDI MMIO range, this isn't true
for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.

We handle SQIDI ranges a bit differently from other types of explicit
steering.  The SQIDI ranges belong to either the MCFG unit or the SF
unit, both of which have their own dedicated steering registers and do
not use the typical 0xFDC steering control that all other types of
ranges use.  Thus we only need to worry about picking a valid initial
value for the MCFG and SF steering registers (0xFD0 and 0xFD8
resepectively) at driver init; they won't change after we set them up so
we don't need to worry about re-steering them explicitly at runtime.

Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV,
while only values of 2 and 3 are valid for DG2-G11, we'll just
initialize the MCFG and SF steering registers to a constant value of "2"
for all XeHP-based platforms for simplicity --- that will work in all
cases.

Bspec: 66534
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +++++++++++++++++----
 drivers/gpu/drm/i915/i915_reg.h             |  2 ++
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 8717337a6c81..6895b083523d 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -889,17 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
-			 unsigned slice, unsigned subslice)
+static void __set_mcr_steering(struct i915_wa_list *wal,
+			       i915_reg_t steering_reg,
+			       unsigned int slice, unsigned int subslice)
 {
 	u32 mcr, mcr_mask;
 
 	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
 	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
 
-	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
+	wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
+}
+
+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
+			 unsigned int slice, unsigned int subslice)
+{
+	drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
 
-	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
+	__set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
 }
 
 static void
@@ -953,7 +960,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
 	 * - L3 Bank (fusable)
 	 * - MSLICE (fusable)
 	 * - LNCF (sub-unit within mslice; always present if mslice is present)
-	 * - SQIDI (always on)
 	 *
 	 * We'll do our default/implicit steering based on GSLICE (in the
 	 * sliceid field) and DSS (in the subsliceid field).  If we can
@@ -1003,6 +1009,18 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
 	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
 
 	__add_mcr_wa(i915, wal, slice, subslice);
+
+	/*
+	 * SQIDI ranges are special because they use different steering
+	 * registers than everything else we work with.  On XeHP SDV and
+	 * DG2-G10, any value in the steering registers will work fine since
+	 * all instances are present, but DG2-G11 only has SQIDI instances at
+	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
+	 * we'll just steer to a hardcoded "2" since that value will work
+	 * everywhere.
+	 */
+	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
+	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f4113e7e8271..39ce6befff52 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2757,6 +2757,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_SC_INSTDONE_EXTRA2	_MMIO(0x7108)
 #define GEN7_SAMPLER_INSTDONE	_MMIO(0xe160)
 #define GEN7_ROW_INSTDONE	_MMIO(0xe164)
+#define MCFG_MCR_SELECTOR		_MMIO(0xfd0)
+#define SF_MCR_SELECTOR			_MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR		_MMIO(0xfdc)
 #define   GEN8_MCR_SLICE(slice)		(((slice) & 3) << 26)
 #define   GEN8_MCR_SLICE_MASK		GEN8_MCR_SLICE(3)
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 06/18] drm/i915/xehp: Loop over all gslices for INSTDONE processing
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (4 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 07/18] drm/i915/dg2: Report INSTDONE_GEOM values in error state Matt Roper
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

We no longer have traditional slices on Xe_HP platforms, but the
INSTDONE registers are replicated according to gslice representation
which is similar.  We can mostly re-use the existing instdone code with
just a few modifications:

 * Create an alternate instdone loop macro that will iterate over the
   flat DSS space, but still provide the gslice/dss steering values for
   compatibility with the legacy code.

 * We should allocate INSTDONE storage space according to the maximum
   number of gslices rather than the maximum number of legacy slices to
   ensure we have enough storage space to hold all of the values.  XeHP
   design has 8 gslices, whereas older platforms never had more than 3
   slices.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    | 48 +++++++++++---------
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 ++++-
 drivers/gpu/drm/i915/gt/intel_sseu.h         |  7 +++
 drivers/gpu/drm/i915/i915_gpu_error.c        | 32 +++++++++----
 4 files changed, 66 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index dea0e522c5c7..06733e86e88b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1166,16 +1166,16 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 	u32 mmio_base = engine->mmio_base;
 	int slice;
 	int subslice;
+	int iter;
 
 	memset(instdone, 0, sizeof(*instdone));
 
-	switch (GRAPHICS_VER(i915)) {
-	default:
+	if (GRAPHICS_VER(i915) >= 8) {
 		instdone->instdone =
 			intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
 		if (engine->id != RCS0)
-			break;
+			return;
 
 		instdone->slice_common =
 			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1185,21 +1185,32 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 			instdone->slice_common_extra[1] =
 				intel_uncore_read(uncore, GEN12_SC_INSTDONE_EXTRA2);
 		}
-		for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
-			instdone->sampler[slice][subslice] =
-				read_subslice_reg(engine, slice, subslice,
-						  GEN7_SAMPLER_INSTDONE);
-			instdone->row[slice][subslice] =
-				read_subslice_reg(engine, slice, subslice,
-						  GEN7_ROW_INSTDONE);
+
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+			for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
+				instdone->sampler[slice][subslice] =
+					read_subslice_reg(engine, slice, subslice,
+							  GEN7_SAMPLER_INSTDONE);
+				instdone->row[slice][subslice] =
+					read_subslice_reg(engine, slice, subslice,
+							  GEN7_ROW_INSTDONE);
+			}
+		} else {
+			for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
+				instdone->sampler[slice][subslice] =
+					read_subslice_reg(engine, slice, subslice,
+							  GEN7_SAMPLER_INSTDONE);
+				instdone->row[slice][subslice] =
+					read_subslice_reg(engine, slice, subslice,
+							  GEN7_ROW_INSTDONE);
+			}
 		}
-		break;
-	case 7:
+	} else if (GRAPHICS_VER(i915) >= 7) {
 		instdone->instdone =
 			intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
 		if (engine->id != RCS0)
-			break;
+			return;
 
 		instdone->slice_common =
 			intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1207,22 +1218,15 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 			intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE);
 		instdone->row[0][0] =
 			intel_uncore_read(uncore, GEN7_ROW_INSTDONE);
-
-		break;
-	case 6:
-	case 5:
-	case 4:
+	} else if (GRAPHICS_VER(i915) >= 4) {
 		instdone->instdone =
 			intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 		if (engine->id == RCS0)
 			/* HACK: Using the wrong struct member */
 			instdone->slice_common =
 				intel_uncore_read(uncore, GEN4_INSTDONE1);
-		break;
-	case 3:
-	case 2:
+	} else {
 		instdone->instdone = intel_uncore_read(uncore, GEN2_INSTDONE);
-		break;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ed91bcff20eb..0b4846b01626 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -67,8 +67,8 @@ struct intel_instdone {
 	/* The following exist only in the RCS engine */
 	u32 slice_common;
 	u32 slice_common_extra[2];
-	u32 sampler[I915_MAX_SLICES][I915_MAX_SUBSLICES];
-	u32 row[I915_MAX_SLICES][I915_MAX_SUBSLICES];
+	u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
+	u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
 };
 
 /*
@@ -578,4 +578,12 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine)
 		for_each_if((instdone_has_slice(dev_priv_, sseu_, slice_)) && \
 			    (instdone_has_subslice(dev_priv_, sseu_, slice_, \
 						    subslice_)))
+
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, dss_) \
+	for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+	     (iter_) < GEN_MAX_SUBSLICES; \
+	     (iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+	     (dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+		for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))
+
 #endif /* __INTEL_ENGINE_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 1073471d1980..74487650b08f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -26,6 +26,9 @@ struct drm_printer;
 #define GEN_DSS_PER_CSLICE	8
 #define GEN_DSS_PER_MSLICE	8
 
+#define GEN_MAX_GSLICES		(GEN_MAX_SUBSLICES / GEN_DSS_PER_GSLICE)
+#define GEN_MAX_CSLICES		(GEN_MAX_SUBSLICES / GEN_DSS_PER_CSLICE)
+
 struct sseu_dev_info {
 	u8 slice_mask;
 	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
@@ -78,6 +81,10 @@ intel_sseu_has_subslice(const struct sseu_dev_info *sseu, int slice,
 	u8 mask;
 	int ss_idx = subslice / BITS_PER_BYTE;
 
+	if (slice >= sseu->max_slices ||
+	    subslice >= sseu->max_subslices)
+		return false;
+
 	GEM_BUG_ON(ss_idx >= sseu->ss_stride);
 
 	mask = sseu->subslice_mask[slice * sseu->ss_stride + ss_idx];
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0f08bcfbe964..8230bc3ac8a9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -444,15 +444,29 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
 	if (GRAPHICS_VER(m->i915) <= 6)
 		return;
 
-	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
-		err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
-			   slice, subslice,
-			   ee->instdone.sampler[slice][subslice]);
-
-	for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
-		err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
-			   slice, subslice,
-			   ee->instdone.row[slice][subslice]);
+	if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
+		int iter;
+
+		for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
+			err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
+				   slice, subslice,
+				   ee->instdone.sampler[slice][subslice]);
+
+		for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
+			err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
+				   slice, subslice,
+				   ee->instdone.row[slice][subslice]);
+	} else {
+		for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
+			err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
+				   slice, subslice,
+				   ee->instdone.sampler[slice][subslice]);
+
+		for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
+			err_printf(m, "  ROW_INSTDONE[%d][%d]: 0x%08x\n",
+				   slice, subslice,
+				   ee->instdone.row[slice][subslice]);
+	}
 
 	if (GRAPHICS_VER(m->i915) < 12)
 		return;
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 07/18] drm/i915/dg2: Report INSTDONE_GEOM values in error state
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (5 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 06/18] drm/i915/xehp: Loop over all gslices for INSTDONE processing Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 08/18] drm/i915/xehp: Changes to ss/eu definitions Matt Roper
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team
has indicated that having these reported in the error state would be
useful for debugging GPU hangs.  These registers are replicated per-DSS
with gslice steering.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  7 +++++++
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
 drivers/gpu/drm/i915/i915_gpu_error.c        | 10 ++++++++--
 drivers/gpu/drm/i915/i915_reg.h              |  1 +
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 06733e86e88b..0a70a3baef9d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1205,6 +1205,13 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
 							  GEN7_ROW_INSTDONE);
 			}
 		}
+
+		if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
+			for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
+				instdone->geom_svg[slice][subslice] =
+					read_subslice_reg(engine, slice, subslice,
+							  XEHPG_INSTDONE_GEOM_SVG);
+		}
 	} else if (GRAPHICS_VER(i915) >= 7) {
 		instdone->instdone =
 			intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 0b4846b01626..bfbfe53c23dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -69,6 +69,9 @@ struct intel_instdone {
 	u32 slice_common_extra[2];
 	u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
 	u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
+
+	/* Added in XeHPG */
+	u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 8230bc3ac8a9..91d5da7b0a2b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -431,6 +431,7 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
 	const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu;
 	int slice;
 	int subslice;
+	int iter;
 
 	err_printf(m, "  INSTDONE: 0x%08x\n",
 		   ee->instdone.instdone);
@@ -445,8 +446,6 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
 		return;
 
 	if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
-		int iter;
-
 		for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
 			err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
 				   slice, subslice,
@@ -471,6 +470,13 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
 	if (GRAPHICS_VER(m->i915) < 12)
 		return;
 
+	if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) {
+		for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
+			err_printf(m, "  GEOM_SVGUNIT_INSTDONE[%d][%d]: 0x%08x\n",
+				   slice, subslice,
+				   ee->instdone.geom_svg[slice][subslice]);
+	}
+
 	err_printf(m, "  SC_INSTDONE_EXTRA: 0x%08x\n",
 		   ee->instdone.slice_common_extra[0]);
 	err_printf(m, "  SC_INSTDONE_EXTRA2: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 39ce6befff52..62af453c8c54 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2757,6 +2757,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_SC_INSTDONE_EXTRA2	_MMIO(0x7108)
 #define GEN7_SAMPLER_INSTDONE	_MMIO(0xe160)
 #define GEN7_ROW_INSTDONE	_MMIO(0xe164)
+#define XEHPG_INSTDONE_GEOM_SVG		_MMIO(0x666c)
 #define MCFG_MCR_SELECTOR		_MMIO(0xfd0)
 #define SF_MCR_SELECTOR			_MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR		_MMIO(0xfdc)
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 08/18] drm/i915/xehp: Changes to ss/eu definitions
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (6 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 07/18] drm/i915/dg2: Report INSTDONE_GEOM values in error state Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04  0:17   ` Souza, Jose
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 09/18] drm/i915/xehpsdv: Add maximum sseu limits Matt Roper
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi, Matthew Auld

From: Matthew Auld <matthew.auld@intel.com>

Xe_HP no longer has "slices" in the same way that old platforms did.
There are new concepts (gslices, cslices, mslices) that apply in various
contexts, but for the purposes of fusing slices no longer exist and we
just have one large pool of dual-subslices (DSS) to work with.
Furthermore, the meaning of the DSS fuse is inverted compared to past
platforms --- it now specifies which DSS are enabled rather than which
ones are disabled.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Prasad Nallani <prasad.nallani@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 24 ++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_getparam.c |  6 ++++--
 drivers/gpu/drm/i915/i915_reg.h      |  3 +++
 3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbed8e8625e1..5d1b7d06c96b 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -139,17 +139,33 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 	 * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS.
 	 * Instead of splitting these, provide userspace with an array
 	 * of DSS to more closely represent the hardware resource.
+	 *
+	 * In addition, the concept of slice has been removed in Xe_HP.
+	 * To be compatible with prior generations, assume a single slice
+	 * across the entire device. Then calculate out the DSS for each
+	 * workload type within that software slice.
 	 */
 	intel_sseu_set_info(sseu, 1, 6, 16);
 
-	s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
-		GEN11_GT_S_ENA_MASK;
+	/*
+	 * As mentioned above, Xe_HP does not have the concept of a slice.
+	 * Enable one for software backwards compatibility.
+	 */
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+		s_en = 0x1;
+	else
+		s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
+		       GEN11_GT_S_ENA_MASK;
 
 	dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
 
 	/* one bit per pair of EUs */
-	eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
-		       GEN11_EU_DIS_MASK);
+	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+		eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK;
+	else
+		eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
+			       GEN11_EU_DIS_MASK);
+
 	for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
 		if (eu_en_fuse & BIT(eu))
 			eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index 24e18219eb50..e289397d9178 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -15,7 +15,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
 	drm_i915_getparam_t *param = data;
-	int value;
+	int value = 0;
 
 	switch (param->param) {
 	case I915_PARAM_IRQ_ACTIVE:
@@ -150,7 +150,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = sseu->subslice_mask[0];
+		/* Only copy bits from the first slice */
+		memcpy(&value, sseu->subslice_mask,
+		       min(sseu->ss_stride, (u8)sizeof(value)));
 		if (!value)
 			return -ENODEV;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 62af453c8c54..99858bc593f0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3225,6 +3225,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 
 #define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
 
+#define XEHP_EU_ENABLE			_MMIO(0x9134)
+#define XEHP_EU_ENA_MASK		0xFF
+
 #define GEN6_BSD_SLEEP_PSMI_CONTROL	_MMIO(0x12050)
 #define   GEN6_BSD_SLEEP_MSG_DISABLE	(1 << 0)
 #define   GEN6_BSD_SLEEP_FLUSH_DISABLE	(1 << 2)
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 09/18] drm/i915/xehpsdv: Add maximum sseu limits
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (7 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 08/18] drm/i915/xehp: Changes to ss/eu definitions Matt Roper
@ 2021-07-29 16:59 ` Matt Roper
  2021-08-04 20:26   ` Lucas De Marchi
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type Matt Roper
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 16:59 UTC (permalink / raw)
  To: intel-gfx

Due to the removal of legacy slices and the transition to a
gslice/cslice/mslice/etc. design, we'll internally store all DSS under
"slice0."

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Caz Yokoyama <caz.yokoyama@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.c         | 5 ++++-
 drivers/gpu/drm/i915/gt/intel_sseu.h         | 2 +-
 drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 5d1b7d06c96b..16c0552fcd1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -145,7 +145,10 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 	 * across the entire device. Then calculate out the DSS for each
 	 * workload type within that software slice.
 	 */
-	intel_sseu_set_info(sseu, 1, 6, 16);
+	if (IS_XEHPSDV(gt->i915))
+		intel_sseu_set_info(sseu, 1, 32, 16);
+	else
+		intel_sseu_set_info(sseu, 1, 6, 16);
 
 	/*
 	 * As mentioned above, Xe_HP does not have the concept of a slice.
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 74487650b08f..204ea6709460 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -16,7 +16,7 @@ struct intel_gt;
 struct drm_printer;
 
 #define GEN_MAX_SLICES		(6) /* CNL upper bound */
-#define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
+#define GEN_MAX_SUBSLICES	(32) /* XEHPSDV upper bound */
 #define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
 #define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
 #define GEN_MAX_EUS		(16) /* TGL upper bound */
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
index 714fe8495775..a424150b052e 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
@@ -53,7 +53,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
 static void gen10_sseu_device_status(struct intel_gt *gt,
 				     struct sseu_dev_info *sseu)
 {
-#define SS_MAX 6
+#define SS_MAX 8
 	struct intel_uncore *uncore = gt->uncore;
 	const struct intel_gt_info *info = &gt->info;
 	u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (8 preceding siblings ...)
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 09/18] drm/i915/xehpsdv: Add maximum sseu limits Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-08-04 20:36   ` Lucas De Marchi
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 11/18] drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV Matt Roper
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx

From: Stuart Summers <stuart.summers@intel.com>

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the amount of impact to prior
generations while still giving the user maximum flexibility.

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Steve Hampson <steven.t.hampson@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 73 ++++++++++++++++++++--------
 drivers/gpu/drm/i915/gt/intel_sseu.h |  5 +-
 drivers/gpu/drm/i915/i915_reg.h      |  3 +-
 include/uapi/drm/i915_drm.h          |  3 --
 4 files changed, 59 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 16c0552fcd1d..5d3b8dff464c 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
 }
 
 void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
-			      u32 ss_mask)
+			      u8 *subslice_mask, u32 ss_mask)
 {
 	int offset = slice * sseu->ss_stride;
 
-	memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+	memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
 }
 
 unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu)
 	return total;
 }
 
-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-				    u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+	u32 ss_mask;
+
+	ss_mask = ss_en >> (s * sseu->max_subslices);
+	ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+	return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+				    u32 g_ss_en, u32 c_ss_en, u16 eu_en)
 {
 	int s, ss;
 
-	/* ss_en represents entire subslice mask across all slices */
+	/* g_ss_en/c_ss_en represent entire subslice mask across all slices */
 	GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-		   sizeof(ss_en) * BITS_PER_BYTE);
+		   sizeof(g_ss_en) * BITS_PER_BYTE);
 
 	for (s = 0; s < sseu->max_slices; s++) {
 		if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,23 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
 
 		sseu->slice_mask |= BIT(s);
 
-		intel_sseu_set_subslices(sseu, s, ss_en);
+		/*
+		 * XeHP introduces the concept of compute vs
+		 * geometry DSS. To reduce variation between GENs
+		 * around subslice usage, store a mask for both the
+		 * geometry and compute enabled masks, to provide
+		 * to user space later in QUERY_TOPOLOGY_INFO, and
+		 * compute a total enabled subslice count for the
+		 * purposes of selecting subslices to use in a
+		 * particular GEM context.
+		 */
+		intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+					 get_ss_stride_mask(sseu, s, c_ss_en));
+		intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+					 get_ss_stride_mask(sseu, s, g_ss_en));
+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+					 get_ss_stride_mask(sseu, s,
+							    g_ss_en | c_ss_en));
 
 		for (ss = 0; ss < sseu->max_subslices; ss++)
 			if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +155,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 {
 	struct sseu_dev_info *sseu = &gt->info.sseu;
 	struct intel_uncore *uncore = gt->uncore;
-	u32 dss_en;
+	u32 g_dss_en, c_dss_en = 0;
 	u16 eu_en = 0;
 	u8 eu_en_fuse;
 	u8 s_en;
@@ -145,10 +171,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 	 * across the entire device. Then calculate out the DSS for each
 	 * workload type within that software slice.
 	 */
-	if (IS_XEHPSDV(gt->i915))
+	if (IS_XEHPSDV(gt->i915)) {
 		intel_sseu_set_info(sseu, 1, 32, 16);
-	else
+		sseu->has_compute_dss = 1;
+	} else {
 		intel_sseu_set_info(sseu, 1, 6, 16);
+	}
 
 	/*
 	 * As mentioned above, Xe_HP does not have the concept of a slice.
@@ -160,7 +188,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 		s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
 		       GEN11_GT_S_ENA_MASK;
 
-	dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
+	g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE);
+	if (sseu->has_compute_dss)
+		c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE);
 
 	/* one bit per pair of EUs */
 	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
@@ -173,7 +203,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 		if (eu_en_fuse & BIT(eu))
 			eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
 
-	gen11_compute_sseu_info(sseu, s_en, dss_en, eu_en);
+	gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en);
 
 	/* TGL only supports slice-level power gating */
 	sseu->has_slice_pg = 1;
@@ -199,7 +229,7 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
 	eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
 		  GEN11_EU_DIS_MASK);
 
-	gen11_compute_sseu_info(sseu, s_en, ss_en, eu_en);
+	gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en);
 
 	/* ICL has no power gating restrictions. */
 	sseu->has_slice_pg = 1;
@@ -260,9 +290,9 @@ static void gen10_sseu_info_init(struct intel_gt *gt)
 		 * Slice0 can have up to 3 subslices, but there are only 2 in
 		 * slice1/2.
 		 */
-		intel_sseu_set_subslices(sseu, s, s == 0 ?
-					 subslice_mask_with_eus :
-					 subslice_mask_with_eus & 0x3);
+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+					 s == 0 ? subslice_mask_with_eus :
+						  subslice_mask_with_eus & 0x3);
 	}
 
 	sseu->eu_total = compute_eu_total(sseu);
@@ -317,7 +347,7 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
 		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
 	}
 
-	intel_sseu_set_subslices(sseu, 0, subslice_mask);
+	intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask);
 
 	sseu->eu_total = compute_eu_total(sseu);
 
@@ -373,7 +403,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
 			/* skip disabled slice */
 			continue;
 
-		intel_sseu_set_subslices(sseu, s, subslice_mask);
+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+					 subslice_mask);
 
 		eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s));
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
@@ -485,7 +516,8 @@ static void bdw_sseu_info_init(struct intel_gt *gt)
 			/* skip disabled slice */
 			continue;
 
-		intel_sseu_set_subslices(sseu, s, subslice_mask);
+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+					 subslice_mask);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			u8 eu_disabled_mask;
@@ -583,7 +615,8 @@ static void hsw_sseu_info_init(struct intel_gt *gt)
 			    sseu->eu_per_subslice);
 
 	for (s = 0; s < sseu->max_slices; s++) {
-		intel_sseu_set_subslices(sseu, s, subslice_mask);
+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+					 subslice_mask);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++) {
 			sseu_set_eus(sseu, s, ss,
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 204ea6709460..b383e7d97554 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -32,6 +32,8 @@ struct drm_printer;
 struct sseu_dev_info {
 	u8 slice_mask;
 	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
+	u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
+	u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
 	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE];
 	u16 eu_total;
 	u8 eu_per_subslice;
@@ -41,6 +43,7 @@ struct sseu_dev_info {
 	u8 has_slice_pg:1;
 	u8 has_subslice_pg:1;
 	u8 has_eu_pg:1;
+	u8 has_compute_dss:1;
 
 	/* Topology fields */
 	u8 max_slices;
@@ -104,7 +107,7 @@ intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
 u32  intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
 
 void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
-			      u32 ss_mask);
+			      u8 *subslice_mask, u32 ss_mask);
 
 void intel_sseu_info_init(struct intel_gt *gt);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 99858bc593f0..d7e4418955f7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3223,7 +3223,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 
 #define GEN11_GT_SUBSLICE_DISABLE _MMIO(0x913C)
 
-#define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
+#define GEN12_GT_GEOMETRY_DSS_ENABLE _MMIO(0x913C)
+#define GEN12_GT_COMPUTE_DSS_ENABLE _MMIO(0x9144)
 
 #define XEHP_EU_ENABLE			_MMIO(0x9134)
 #define XEHP_EU_ENA_MASK		0xFF
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7f13d241417f..aef15542a95b 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2589,9 +2589,6 @@ struct drm_i915_query {
  *                 Z / 8] >> (Z % 8)) & 1
  */
 struct drm_i915_query_topology_info {
-	/*
-	 * Unused for now. Must be cleared to zero.
-	 */
 	__u16 flags;
 
 	__u16 max_slices;
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 11/18] drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (9 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for " Matt Roper
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx

DG2 supports compute DSS and has the same maximum number of DSS and EU
as XeHP SDV.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Caz Yokoyama <caz.yokoyama@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 5d3b8dff464c..eaff221db5b0 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -171,7 +171,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 	 * across the entire device. Then calculate out the DSS for each
 	 * workload type within that software slice.
 	 */
-	if (IS_XEHPSDV(gt->i915)) {
+	if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) {
 		intel_sseu_set_info(sseu, 1, 32, 16);
 		sseu->has_compute_dss = 1;
 	} else {
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (10 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 11/18] drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:31   ` Lucas De Marchi
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 13/18] drm/i915/dg2: Define MOCS table for DG2 Matt Roper
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi

From: Lucas De Marchi <lucas.demarchi@intel.com>

Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
memory for L3 destined transaction" and L3_LKP to "enable Lookup for
uncacheable accesses".

Bspec: 45101
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_mocs.c | 33 +++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 17848807f111..0c9d0b936c20 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -40,6 +40,8 @@ struct drm_i915_mocs_table {
 #define L3_ESC(value)		((value) << 0)
 #define L3_SCC(value)		((value) << 1)
 #define _L3_CACHEABILITY(value)	((value) << 4)
+#define L3_GLBGO(value)		((value) << 6)
+#define L3_LKUP(value)		((value) << 7)
 
 /* Helper defines */
 #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
@@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry dg1_mocs_table[] = {
 	MOCS_ENTRY(63, 0, L3_1_UC),
 };
 
+static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
+	/* wa_1608975824 */
+	MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)),
+
+	/* UC - Coherent; GO:L3 */
+	MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)),
+	/* UC - Coherent; GO:Memory */
+	MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+	/* UC - Non-Coherent; GO:Memory */
+	MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)),
+	/* UC - Non-Coherent; GO:L3 */
+	MOCS_ENTRY(4, 0, L3_1_UC),
+
+	/* WB */
+	MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)),
+
+	/* HW Reserved - SW program but never use. */
+	MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)),
+	MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)),
+	MOCS_ENTRY(60, 0, L3_1_UC),
+	MOCS_ENTRY(61, 0, L3_1_UC),
+	MOCS_ENTRY(62, 0, L3_1_UC),
+	MOCS_ENTRY(63, 0, L3_1_UC),
+};
+
 enum {
 	HAS_GLOBAL_MOCS = BIT(0),
 	HAS_ENGINE_MOCS = BIT(1),
@@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
 {
 	unsigned int flags;
 
-	if (IS_DG1(i915)) {
+	if (IS_XEHPSDV(i915)) {
+		table->size = ARRAY_SIZE(xehpsdv_mocs_table);
+		table->table = xehpsdv_mocs_table;
+		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
+	} else if (IS_DG1(i915)) {
 		table->size = ARRAY_SIZE(dg1_mocs_table);
 		table->table = dg1_mocs_table;
 		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 13/18] drm/i915/dg2: Define MOCS table for DG2
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (11 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for " Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 14/18] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP Matt Roper
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx

Bspec: 45101, 45427
Cc: Ramalingam C <ramalingam.c@intel.com>(v5)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_mocs.c | 35 +++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 0c9d0b936c20..ad021337225d 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -341,6 +341,30 @@ static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
 	MOCS_ENTRY(63, 0, L3_1_UC),
 };
 
+static const struct drm_i915_mocs_entry dg2_mocs_table[] = {
+	/* UC - Coherent; GO:L3 */
+	MOCS_ENTRY(0, 0, L3_1_UC | L3_LKUP(1)),
+	/* UC - Coherent; GO:Memory */
+	MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+	/* UC - Non-Coherent; GO:Memory */
+	MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)),
+
+	/* WB - LC */
+	MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
+};
+
+static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = {
+	/* Wa_14011441408: Set Go to Memory for MOCS#0 */
+	MOCS_ENTRY(0, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+	/* UC - Coherent; GO:Memory */
+	MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+	/* UC - Non-Coherent; GO:Memory */
+	MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)),
+
+	/* WB - LC */
+	MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
+};
+
 enum {
 	HAS_GLOBAL_MOCS = BIT(0),
 	HAS_ENGINE_MOCS = BIT(1),
@@ -367,7 +391,16 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
 {
 	unsigned int flags;
 
-	if (IS_XEHPSDV(i915)) {
+	if (IS_DG2(i915)) {
+		if (IS_DG2_GT_STEP(i915, G10, STEP_A0, STEP_B0)) {
+			table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax);
+			table->table = dg2_mocs_table_g10_ax;
+		} else {
+			table->size = ARRAY_SIZE(dg2_mocs_table);
+			table->table = dg2_mocs_table;
+		}
+		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
+	} else if (IS_XEHPSDV(i915)) {
 		table->size = ARRAY_SIZE(xehpsdv_mocs_table);
 		table->table = xehpsdv_mocs_table;
 		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 14/18] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (12 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 13/18] drm/i915/dg2: Define MOCS table for DG2 Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 15/18] drm/i915/xehpsdv: Read correct RP_STATE_CAP register Matt Roper
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi

From: Lucas De Marchi <lucas.demarchi@intel.com>

Instead of maintaining the same if ladder in 3 different places, add a
function to read RP_STATE_CAP.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c |  8 +++-----
 drivers/gpu/drm/i915/gt/intel_rps.c     | 17 ++++++++++++-----
 drivers/gpu/drm/i915/gt/intel_rps.h     |  1 +
 drivers/gpu/drm/i915/i915_debugfs.c     |  8 +++-----
 4 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
index 4270b5a34a83..1061a62bdfce 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
@@ -309,13 +309,11 @@ static int frequency_show(struct seq_file *m, void *unused)
 		int max_freq;
 
 		rp_state_limits = intel_uncore_read(uncore, GEN6_RP_STATE_LIMITS);
-		if (IS_GEN9_LP(i915)) {
-			rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
+		rp_state_cap = intel_rps_read_state_cap(rps);
+		if (IS_GEN9_LP(i915))
 			gt_perf_status = intel_uncore_read(uncore, BXT_GT_PERF_STATUS);
-		} else {
-			rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
+		else
 			gt_perf_status = intel_uncore_read(uncore, GEN6_GT_PERF_STATUS);
-		}
 
 		/* RPSTAT1 is in the GT power well */
 		intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 0c8e7f2b06f0..8b0f429ba5be 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -975,20 +975,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val)
 static void gen6_rps_init(struct intel_rps *rps)
 {
 	struct drm_i915_private *i915 = rps_to_i915(rps);
-	struct intel_uncore *uncore = rps_to_uncore(rps);
+	u32 rp_state_cap = intel_rps_read_state_cap(rps);
 
 	/* All of these values are in units of 50MHz */
 
 	/* static values from HW: RP0 > RP1 > RPn (min_freq) */
 	if (IS_GEN9_LP(i915)) {
-		u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
-
 		rps->rp0_freq = (rp_state_cap >> 16) & 0xff;
 		rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
 		rps->min_freq = (rp_state_cap >>  0) & 0xff;
 	} else {
-		u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
-
 		rps->rp0_freq = (rp_state_cap >>  0) & 0xff;
 		rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
 		rps->min_freq = (rp_state_cap >> 16) & 0xff;
@@ -1940,6 +1936,17 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
 	return freq;
 }
 
+u32 intel_rps_read_state_cap(struct intel_rps *rps)
+{
+	struct drm_i915_private *i915 = rps_to_i915(rps);
+	struct intel_uncore *uncore = rps_to_uncore(rps);
+
+	if (IS_GEN9_LP(i915))
+		return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
+	else
+		return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
+}
+
 /* External interface for intel_ips.ko */
 
 static struct drm_i915_private __rcu *ips_mchdev;
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h
index 1d2cfc98b510..6e06dd61f818 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -31,6 +31,7 @@ int intel_gpu_freq(struct intel_rps *rps, int val);
 int intel_freq_opcode(struct intel_rps *rps, int val);
 u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
 u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
+u32 intel_rps_read_state_cap(struct intel_rps *rps);
 
 void gen5_rps_irq_handler(struct intel_rps *rps);
 void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0529576f069c..37056b2c044a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -420,13 +420,11 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 		int max_freq;
 
 		rp_state_limits = intel_uncore_read(&dev_priv->uncore, GEN6_RP_STATE_LIMITS);
-		if (IS_GEN9_LP(dev_priv)) {
-			rp_state_cap = intel_uncore_read(&dev_priv->uncore, BXT_RP_STATE_CAP);
+		rp_state_cap = intel_rps_read_state_cap(rps);
+		if (IS_GEN9_LP(dev_priv))
 			gt_perf_status = intel_uncore_read(&dev_priv->uncore, BXT_GT_PERF_STATUS);
-		} else {
-			rp_state_cap = intel_uncore_read(&dev_priv->uncore, GEN6_RP_STATE_CAP);
+		else
 			gt_perf_status = intel_uncore_read(&dev_priv->uncore, GEN6_GT_PERF_STATUS);
-		}
 
 		/* RPSTAT1 is in the GT power well */
 		intel_uncore_forcewake_get(&dev_priv->uncore, FORCEWAKE_ALL);
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 15/18] drm/i915/xehpsdv: Read correct RP_STATE_CAP register
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (13 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 14/18] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 16/18] drm/i915/dg2: Add new LRI reg offsets Matt Roper
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi

The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this
register is now a per-tile register at GTTMMADDR offset 0x250014.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++-
 drivers/gpu/drm/i915/i915_reg.h     | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 8b0f429ba5be..10bb172a3826 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1941,7 +1941,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps)
 	struct drm_i915_private *i915 = rps_to_i915(rps);
 	struct intel_uncore *uncore = rps_to_uncore(rps);
 
-	if (IS_GEN9_LP(i915))
+	if (IS_XEHPSDV(i915))
+		return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
+	else if (IS_GEN9_LP(i915))
 		return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
 	else
 		return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d7e4418955f7..ed85f3ec1727 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4184,6 +4184,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN6_RP_STATE_CAP	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998)
 #define BXT_RP_STATE_CAP        _MMIO(0x138170)
 #define GEN9_RP_STATE_LIMITS	_MMIO(0x138148)
+#define XEHPSDV_RP_STATE_CAP	_MMIO(0x250014)
 
 /*
  * Logical Context regs
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 16/18] drm/i915/dg2: Add new LRI reg offsets
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (14 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 15/18] drm/i915/xehpsdv: Read correct RP_STATE_CAP register Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 17/18] drm/i915/dg2: Maintain backward-compatible nested batch behavior Matt Roper
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris P Wilson

From: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>

New LRI register offsets were introduced for DG2, this patch adds
those extra registers, and create new register table for setting offsets
to compare with HW generated context image - especially for gt_lrc test.
Also updates general purpose register with scratch offset for DG2, in
order to use it for live_lrc_fixed selftest.

Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 85 ++++++++++++++++++++++++++++-
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index c3f5bec8ae15..1b7e75e4c011 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = {
 	END
 };
 
+static const u8 dg2_xcs_offsets[] = {
+	NOP(1),
+	LRI(15, POSTED),
+	REG16(0x244),
+	REG(0x034),
+	REG(0x030),
+	REG(0x038),
+	REG(0x03c),
+	REG(0x168),
+	REG(0x140),
+	REG(0x110),
+	REG(0x1c0),
+	REG(0x1c4),
+	REG(0x1c8),
+	REG(0x180),
+	REG16(0x2b4),
+	REG(0x120),
+	REG(0x124),
+
+	NOP(1),
+	LRI(9, POSTED),
+	REG16(0x3a8),
+	REG16(0x28c),
+	REG16(0x288),
+	REG16(0x284),
+	REG16(0x280),
+	REG16(0x27c),
+	REG16(0x278),
+	REG16(0x274),
+	REG16(0x270),
+
+	END
+};
+
 static const u8 gen8_rcs_offsets[] = {
 	NOP(1),
 	LRI(14, POSTED),
@@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = {
 	END
 };
 
+static const u8 dg2_rcs_offsets[] = {
+	NOP(1),
+	LRI(15, POSTED),
+	REG16(0x244),
+	REG(0x034),
+	REG(0x030),
+	REG(0x038),
+	REG(0x03c),
+	REG(0x168),
+	REG(0x140),
+	REG(0x110),
+	REG(0x1c0),
+	REG(0x1c4),
+	REG(0x1c8),
+	REG(0x180),
+	REG16(0x2b4),
+	REG(0x120),
+	REG(0x124),
+
+	NOP(1),
+	LRI(9, POSTED),
+	REG16(0x3a8),
+	REG16(0x28c),
+	REG16(0x288),
+	REG16(0x284),
+	REG16(0x280),
+	REG16(0x27c),
+	REG16(0x278),
+	REG16(0x274),
+	REG16(0x270),
+
+	LRI(3, POSTED),
+	REG(0x1b0),
+	REG16(0x5a8),
+	REG16(0x5ac),
+
+	NOP(6),
+	LRI(1, 0),
+	REG(0x0c8),
+
+	END
+};
+
 #undef END
 #undef REG16
 #undef REG
@@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
 		   !intel_engine_has_relative_mmio(engine));
 
 	if (engine->class == RENDER_CLASS) {
-		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+			return dg2_rcs_offsets;
+		else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
 			return xehp_rcs_offsets;
 		else if (GRAPHICS_VER(engine->i915) >= 12)
 			return gen12_rcs_offsets;
@@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
 		else
 			return gen8_rcs_offsets;
 	} else {
-		if (GRAPHICS_VER(engine->i915) >= 12)
+		if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+			return dg2_xcs_offsets;
+		else if (GRAPHICS_VER(engine->i915) >= 12)
 			return gen12_xcs_offsets;
 		else if (GRAPHICS_VER(engine->i915) >= 9)
 			return gen9_xcs_offsets;
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 17/18] drm/i915/dg2: Maintain backward-compatible nested batch behavior
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (15 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 16/18] drm/i915/dg2: Add new LRI reg offsets Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 18/18] drm/i915/dg2: Configure PCON in DP pre-enable path Matt Roper
  2021-07-29 20:59 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Begin enabling Xe_HP SDV and DG2 platforms (rev8) Patchwork
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx

For tgl+, the per-context setting of MI_MODE[12] determines whether
the bits of a nested MI_BATCH_BUFFER_START instruction should be
interpreted in the traditional manner or whether they should
instead use a new tgl+ meaning that breaks backward compatibility, but
allows nesting into 3rd-level batchbuffers.  For previous platforms,
the hardware default for this register bit is to maintain
backward-compatible behavior unless a context intentionally opts into
the new behavior; however Xe_HPG flips the hardware default behavior.

From a SW perspective, we want to maintain the backward-compatible
behavior for userspace, so we'll apply a fake workaround to set it back
to the legacy behavior on platforms where the hardware default is to
break compatibility.  At the moment there is no Linux userspace that
utilizes third-level batchbuffers, so this will avoid userspace from
needing to make any changes.  using the legacy meaning is the correct
thing to do.  If/when we have userspace consumers that want to utilize
third-level batch nesting, we can provide a context parameter to allow
them to opt-in.

Bspec: 45974, 45718
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h             |  1 +
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 6895b083523d..3e756b761526 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -644,6 +644,37 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
 		     DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
 }
 
+static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
+					 struct i915_wa_list *wal)
+{
+	/*
+	 * This is a "fake" workaround defined by software to ensure we
+	 * maintain reliable, backward-compatible behavior for userspace with
+	 * regards to how nested MI_BATCH_BUFFER_START commands are handled.
+	 *
+	 * The per-context setting of MI_MODE[12] determines whether the bits
+	 * of a nested MI_BATCH_BUFFER_START instruction should be interpreted
+	 * in the traditional manner or whether they should instead use a new
+	 * tgl+ meaning that breaks backward compatibility, but allows nesting
+	 * into 3rd-level batchbuffers.  When this new capability was first
+	 * added in TGL, it remained off by default unless a context
+	 * intentionally opted in to the new behavior.  However Xe_HPG now
+	 * flips this on by default and requires that we explicitly opt out if
+	 * we don't want the new behavior.
+	 *
+	 * From a SW perspective, we want to maintain the backward-compatible
+	 * behavior for userspace, so we'll apply a fake workaround to set it
+	 * back to the legacy behavior on platforms where the hardware default
+	 * is to break compatibility.  At the moment there is no Linux
+	 * userspace that utilizes third-level batchbuffers, so this will avoid
+	 * userspace from needing to make any changes.  using the legacy
+	 * meaning is the correct thing to do.  If/when we have userspace
+	 * consumers that want to utilize third-level batch nesting, we can
+	 * provide a context parameter to allow them to opt-in.
+	 */
+	wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN);
+}
+
 static void
 __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
 			   struct i915_wa_list *wal,
@@ -651,11 +682,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
 {
 	struct drm_i915_private *i915 = engine->i915;
 
+	wa_init_start(wal, name, engine->name);
+
+	/* Applies to all engines */
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
+		fakewa_disable_nestedbb_mode(engine, wal);
+
 	if (engine->class != RENDER_CLASS)
 		return;
 
-	wa_init_start(wal, name, engine->name);
-
 	if (IS_DG1(i915))
 		dg1_ctx_workarounds_init(engine, wal);
 	else if (GRAPHICS_VER(i915) == 12)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ed85f3ec1727..c192c1e096d7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2893,6 +2893,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define MI_MODE		_MMIO(0x209c)
 # define VS_TIMER_DISPATCH				(1 << 6)
 # define MI_FLUSH_ENABLE				(1 << 12)
+# define TGL_NESTED_BB_EN				(1 << 12)
 # define ASYNC_FLIP_PERF_DISABLE			(1 << 14)
 # define MODE_IDLE					(1 << 9)
 # define STOP_RING					(1 << 8)
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] [PATCH v4 18/18] drm/i915/dg2: Configure PCON in DP pre-enable path
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (16 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 17/18] drm/i915/dg2: Maintain backward-compatible nested batch behavior Matt Roper
@ 2021-07-29 17:00 ` Matt Roper
  2021-07-29 20:59 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Begin enabling Xe_HP SDV and DG2 platforms (rev8) Patchwork
  18 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-29 17:00 UTC (permalink / raw)
  To: intel-gfx

From: Ankit Nautiyal <ankit.k.nautiyal@intel.com>

Add the functions to configure HDMI2.1 pcon for DG2, before DP link
training.

Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 85856d040637..233f02c617a4 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2576,6 +2576,7 @@ static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state,
 	if (!is_mst)
 		intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
 
+	intel_dp_configure_protocol_converter(intel_dp, crtc_state);
 	intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true);
 	/*
 	 * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit
@@ -2583,6 +2584,8 @@ static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state,
 	 * training
 	 */
 	intel_dp_sink_set_fec_ready(intel_dp, crtc_state);
+	intel_dp_check_frl_training(intel_dp);
+	intel_dp_pcon_dsc_configure(intel_dp, crtc_state);
 
 	/*
 	 * 5.h Follow DisplayPort specification training sequence (see notes for
-- 
2.25.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for " Matt Roper
@ 2021-07-29 17:31   ` Lucas De Marchi
  2021-07-30  7:16     ` Siddiqui, Ayaz A
  0 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2021-07-29 17:31 UTC (permalink / raw)
  To: Matt Roper; +Cc: Daniel Vetter, intel-gfx

On Thu, Jul 29, 2021 at 10:00:02AM -0700, Matt Roper wrote:
>From: Lucas De Marchi <lucas.demarchi@intel.com>
>
>Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
>dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
>memory for L3 destined transaction" and L3_LKP to "enable Lookup for
>uncacheable accesses".
>
>Bspec: 45101
>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Signed-off-by: Stuart Summers <stuart.summers@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

+Ayaz,  +Daniel

I think this can't land as is since we risk forgetting additional
changes that we will have to do. We already made the mistake once of
forgetting MOCS changes.

There are some patches to initialize unused MOCS entries and similar
that should have been sent already to upstream. Ayaz, what's the state
of those patches?

Lucas De Marchi

>---
> drivers/gpu/drm/i915/gt/intel_mocs.c | 33 +++++++++++++++++++++++++++-
> 1 file changed, 32 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
>index 17848807f111..0c9d0b936c20 100644
>--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
>+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
>@@ -40,6 +40,8 @@ struct drm_i915_mocs_table {
> #define L3_ESC(value)		((value) << 0)
> #define L3_SCC(value)		((value) << 1)
> #define _L3_CACHEABILITY(value)	((value) << 4)
>+#define L3_GLBGO(value)		((value) << 6)
>+#define L3_LKUP(value)		((value) << 7)
>
> /* Helper defines */
> #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
>@@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry dg1_mocs_table[] = {
> 	MOCS_ENTRY(63, 0, L3_1_UC),
> };
>
>+static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
>+	/* wa_1608975824 */
>+	MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)),
>+
>+	/* UC - Coherent; GO:L3 */
>+	MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)),
>+	/* UC - Coherent; GO:Memory */
>+	MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
>+	/* UC - Non-Coherent; GO:Memory */
>+	MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)),
>+	/* UC - Non-Coherent; GO:L3 */
>+	MOCS_ENTRY(4, 0, L3_1_UC),
>+
>+	/* WB */
>+	MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)),
>+
>+	/* HW Reserved - SW program but never use. */
>+	MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)),
>+	MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)),
>+	MOCS_ENTRY(60, 0, L3_1_UC),
>+	MOCS_ENTRY(61, 0, L3_1_UC),
>+	MOCS_ENTRY(62, 0, L3_1_UC),
>+	MOCS_ENTRY(63, 0, L3_1_UC),
>+};
>+
> enum {
> 	HAS_GLOBAL_MOCS = BIT(0),
> 	HAS_ENGINE_MOCS = BIT(1),
>@@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
> {
> 	unsigned int flags;
>
>-	if (IS_DG1(i915)) {
>+	if (IS_XEHPSDV(i915)) {
>+		table->size = ARRAY_SIZE(xehpsdv_mocs_table);
>+		table->table = xehpsdv_mocs_table;
>+		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
>+	} else if (IS_DG1(i915)) {
> 		table->size = ARRAY_SIZE(dg1_mocs_table);
> 		table->table = dg1_mocs_table;
> 		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Begin enabling Xe_HP SDV and DG2 platforms (rev8)
  2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
                   ` (17 preceding siblings ...)
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 18/18] drm/i915/dg2: Configure PCON in DP pre-enable path Matt Roper
@ 2021-07-29 20:59 ` Patchwork
  18 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2021-07-29 20:59 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev8)
URL   : https://patchwork.freedesktop.org/series/92135/
State : failure

== Summary ==

Applying: drm/i915/xehp: handle new steering options
Applying: drm/i915/xehpsdv: Define steering tables
Applying: drm/i915/dg2: Add forcewake table
Applying: drm/i915/dg2: Update LNCF steering ranges
Applying: drm/i915/dg2: Add SQIDI steering
Applying: drm/i915/xehp: Loop over all gslices for INSTDONE processing
Applying: drm/i915/dg2: Report INSTDONE_GEOM values in error state
Applying: drm/i915/xehp: Changes to ss/eu definitions
Applying: drm/i915/xehpsdv: Add maximum sseu limits
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gt/intel_sseu.c
M	drivers/gpu/drm/i915/gt/intel_sseu.h
M	drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
Auto-merging drivers/gpu/drm/i915/gt/intel_sseu.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gt/intel_sseu.h
Auto-merging drivers/gpu/drm/i915/gt/intel_sseu.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0009 drm/i915/xehpsdv: Add maximum sseu limits
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  2021-07-29 17:31   ` Lucas De Marchi
@ 2021-07-30  7:16     ` Siddiqui, Ayaz A
  2021-07-30 17:01       ` Matt Roper
  0 siblings, 1 reply; 33+ messages in thread
From: Siddiqui, Ayaz A @ 2021-07-30  7:16 UTC (permalink / raw)
  To: De Marchi, Lucas, Roper, Matthew D
  Cc: Daniel Vetter, intel-gfx, Lahtinen, Joonas



> -----Original Message-----
> From: De Marchi, Lucas <lucas.demarchi@intel.com>
> Sent: Thursday, July 29, 2021 11:01 PM
> To: Roper, Matthew D <matthew.d.roper@intel.com>
> Cc: intel-gfx@lists.freedesktop.org; Siddiqui, Ayaz A
> <ayaz.siddiqui@intel.com>; Daniel Vetter <daniel.vetter@ffwll.ch>
> Subject: Re: [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS
> table for XeHP SDV
> 
> On Thu, Jul 29, 2021 at 10:00:02AM -0700, Matt Roper wrote:
> >From: Lucas De Marchi <lucas.demarchi@intel.com>
> >
> >Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
> >dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
> >memory for L3 destined transaction" and L3_LKP to "enable Lookup for
> >uncacheable accesses".
> >
> >Bspec: 45101
> >Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> >Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> >Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> 
> +Ayaz,  +Daniel
> 
> I think this can't land as is since we risk forgetting additional changes that we
> will have to do. We already made the mistake once of forgetting MOCS
> changes.
> 
> There are some patches to initialize unused MOCS entries and similar that
> should have been sent already to upstream. Ayaz, what's the state of those
> patches?
There are two section of patches .
1. Programming to  CMD_CCTL for with UC this patch should sent for upstream.
I'll share the patches for CMD_CCTL programming.  
2. Programming of unused/unspecified MOCS index to WB, we observed some
regression on media use cases. So we decided to delay  upstreaming of
those patches till  MOCS programming are fixed in UMDs.
Regards
-Ayaz
 
> 
> Lucas De Marchi
> 
> >---
> > drivers/gpu/drm/i915/gt/intel_mocs.c | 33
> +++++++++++++++++++++++++++-
> > 1 file changed, 32 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c
> >b/drivers/gpu/drm/i915/gt/intel_mocs.c
> >index 17848807f111..0c9d0b936c20 100644
> >--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> >+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> >@@ -40,6 +40,8 @@ struct drm_i915_mocs_table {
> > #define L3_ESC(value)		((value) << 0)
> > #define L3_SCC(value)		((value) << 1)
> > #define _L3_CACHEABILITY(value)	((value) << 4)
> >+#define L3_GLBGO(value)		((value) << 6)
> >+#define L3_LKUP(value)		((value) << 7)
> >
> > /* Helper defines */
> > #define GEN9_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but
> configured. */
> >@@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry
> dg1_mocs_table[] = {
> > 	MOCS_ENTRY(63, 0, L3_1_UC),
> > };
> >
> >+static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
> >+	/* wa_1608975824 */
> >+	MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)),
> >+
> >+	/* UC - Coherent; GO:L3 */
> >+	MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)),
> >+	/* UC - Coherent; GO:Memory */
> >+	MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
> >+	/* UC - Non-Coherent; GO:Memory */
> >+	MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)),
> >+	/* UC - Non-Coherent; GO:L3 */
> >+	MOCS_ENTRY(4, 0, L3_1_UC),
> >+
> >+	/* WB */
> >+	MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)),
> >+
> >+	/* HW Reserved - SW program but never use. */
> >+	MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)),
> >+	MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)),
> >+	MOCS_ENTRY(60, 0, L3_1_UC),
> >+	MOCS_ENTRY(61, 0, L3_1_UC),
> >+	MOCS_ENTRY(62, 0, L3_1_UC),
> >+	MOCS_ENTRY(63, 0, L3_1_UC),
> >+};
> >+
> > enum {
> > 	HAS_GLOBAL_MOCS = BIT(0),
> > 	HAS_ENGINE_MOCS = BIT(1),
> >@@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct
> >drm_i915_private *i915,  {
> > 	unsigned int flags;
> >
> >-	if (IS_DG1(i915)) {
> >+	if (IS_XEHPSDV(i915)) {
> >+		table->size = ARRAY_SIZE(xehpsdv_mocs_table);
> >+		table->table = xehpsdv_mocs_table;
> >+		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
> >+	} else if (IS_DG1(i915)) {
> > 		table->size = ARRAY_SIZE(dg1_mocs_table);
> > 		table->table = dg1_mocs_table;
> > 		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
> >--
> >2.25.4
> >
> >_______________________________________________
> >Intel-gfx mailing list
> >Intel-gfx@lists.freedesktop.org
> >https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  2021-07-30  7:16     ` Siddiqui, Ayaz A
@ 2021-07-30 17:01       ` Matt Roper
  0 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-07-30 17:01 UTC (permalink / raw)
  To: Siddiqui, Ayaz A
  Cc: De Marchi, Lucas, intel-gfx, Daniel Vetter, Lahtinen, Joonas

On Fri, Jul 30, 2021 at 12:16:14AM -0700, Siddiqui, Ayaz A wrote:
> 
> 
> > -----Original Message-----
> > From: De Marchi, Lucas <lucas.demarchi@intel.com>
> > Sent: Thursday, July 29, 2021 11:01 PM
> > To: Roper, Matthew D <matthew.d.roper@intel.com>
> > Cc: intel-gfx@lists.freedesktop.org; Siddiqui, Ayaz A
> > <ayaz.siddiqui@intel.com>; Daniel Vetter <daniel.vetter@ffwll.ch>
> > Subject: Re: [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS
> > table for XeHP SDV
> >
> > On Thu, Jul 29, 2021 at 10:00:02AM -0700, Matt Roper wrote:
> > >From: Lucas De Marchi <lucas.demarchi@intel.com>
> > >
> > >Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
> > >dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
> > >memory for L3 destined transaction" and L3_LKP to "enable Lookup for
> > >uncacheable accesses".
> > >
> > >Bspec: 45101
> > >Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > >Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > >Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > >Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> >
> > +Ayaz,  +Daniel
> >
> > I think this can't land as is since we risk forgetting additional changes that we
> > will have to do. We already made the mistake once of forgetting MOCS
> > changes.
> >
> > There are some patches to initialize unused MOCS entries and similar that
> > should have been sent already to upstream. Ayaz, what's the state of those
> > patches?
> There are two section of patches .
> 1. Programming to  CMD_CCTL for with UC this patch should sent for upstream.
> I'll share the patches for CMD_CCTL programming.
> 2. Programming of unused/unspecified MOCS index to WB, we observed some
> regression on media use cases. So we decided to delay  upstreaming of
> those patches till  MOCS programming are fixed in UMDs.

I don't follow on #2.  At the moment we don't have a complete XeHP SDV
driver upstream (the device IDs aren't even there and won't be for a
while) so there's nothing that can regress yet from an upstream point of
view --- you can't run a UMD at all today.  If internal builds of the
UMDs run against our internal staging trees are still buggy and won't
work properly with the proper MOCS table settings, then that just makes
it more important to upstream these patches now so that the UMD's don't
come to rely on the incorrect table values.  If we wait until later
after the driver is "complete" and we've already enabled first contact
of the UMDs in general, then we won't be able to update the MOCS
programming to the proper values anymore without breaking ABI.

Unless you're talking about regressing older platforms?  In that case we
should be able to restrict the WB programming to just Xe_HP SDV for now;
ultimately we'll probably need to figure out some kind of opt-in
mechanism for the older platforms, but that can be separate from doing
it unconditionally on Xe_HP SDV and beyond I think.


Matt

> Regards
> -Ayaz
> 
> >
> > Lucas De Marchi
> >
> > >---
> > > drivers/gpu/drm/i915/gt/intel_mocs.c | 33
> > +++++++++++++++++++++++++++-
> > > 1 file changed, 32 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c
> > >b/drivers/gpu/drm/i915/gt/intel_mocs.c
> > >index 17848807f111..0c9d0b936c20 100644
> > >--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
> > >+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
> > >@@ -40,6 +40,8 @@ struct drm_i915_mocs_table {
> > > #define L3_ESC(value)               ((value) << 0)
> > > #define L3_SCC(value)               ((value) << 1)
> > > #define _L3_CACHEABILITY(value)     ((value) << 4)
> > >+#define L3_GLBGO(value)             ((value) << 6)
> > >+#define L3_LKUP(value)              ((value) << 7)
> > >
> > > /* Helper defines */
> > > #define GEN9_NUM_MOCS_ENTRIES       64  /* 63-64 are reserved, but
> > configured. */
> > >@@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry
> > dg1_mocs_table[] = {
> > >     MOCS_ENTRY(63, 0, L3_1_UC),
> > > };
> > >
> > >+static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
> > >+    /* wa_1608975824 */
> > >+    MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)),
> > >+
> > >+    /* UC - Coherent; GO:L3 */
> > >+    MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)),
> > >+    /* UC - Coherent; GO:Memory */
> > >+    MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
> > >+    /* UC - Non-Coherent; GO:Memory */
> > >+    MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)),
> > >+    /* UC - Non-Coherent; GO:L3 */
> > >+    MOCS_ENTRY(4, 0, L3_1_UC),
> > >+
> > >+    /* WB */
> > >+    MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)),
> > >+
> > >+    /* HW Reserved - SW program but never use. */
> > >+    MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)),
> > >+    MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)),
> > >+    MOCS_ENTRY(60, 0, L3_1_UC),
> > >+    MOCS_ENTRY(61, 0, L3_1_UC),
> > >+    MOCS_ENTRY(62, 0, L3_1_UC),
> > >+    MOCS_ENTRY(63, 0, L3_1_UC),
> > >+};
> > >+
> > > enum {
> > >     HAS_GLOBAL_MOCS = BIT(0),
> > >     HAS_ENGINE_MOCS = BIT(1),
> > >@@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct
> > >drm_i915_private *i915,  {
> > >     unsigned int flags;
> > >
> > >-    if (IS_DG1(i915)) {
> > >+    if (IS_XEHPSDV(i915)) {
> > >+            table->size = ARRAY_SIZE(xehpsdv_mocs_table);
> > >+            table->table = xehpsdv_mocs_table;
> > >+            table->n_entries = GEN9_NUM_MOCS_ENTRIES;
> > >+    } else if (IS_DG1(i915)) {
> > >             table->size = ARRAY_SIZE(dg1_mocs_table);
> > >             table->table = dg1_mocs_table;
> > >             table->n_entries = GEN9_NUM_MOCS_ENTRIES;
> > >--
> > >2.25.4
> > >
> > >_______________________________________________
> > >Intel-gfx mailing list
> > >Intel-gfx@lists.freedesktop.org
> > >https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 03/18] drm/i915/dg2: Add forcewake table
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 03/18] drm/i915/dg2: Add forcewake table Matt Roper
@ 2021-08-04  0:15   ` Souza, Jose
  0 siblings, 0 replies; 33+ messages in thread
From: Souza, Jose @ 2021-08-04  0:15 UTC (permalink / raw)
  To: Roper, Matthew D, intel-gfx

On Thu, 2021-07-29 at 09:59 -0700, Matt Roper wrote:
> The DG2 forcewake table is very similar to the one used by XeHP SDV (and
> both platforms are even presented as a single table in the bspec).  For
> the most part DG2 starts using a few additional ranges that were
> 'reserved' on XeHP SDV and stops using some others.  However there is a
> single range (0xd800-0xd87f) that needs to be handled differently
> between the two platforms (it needs GT wake on XeHP SDV, but render wake
> on DG2) so unless we want to wake both domains (which could waste power)
> or define new types of forcewake domains for this special case we need
> to have separate tables for the two platforms.  Let's define the ranges
> for both platforms with a parameterized macro so that we don't actually
> need to duplicate everything in the code.
> 
> It should be fine for DG2 to re-use the Xe_HP shadow register list so we
> can continue to use the 'xehpsdv' MMIO write functions and don't need to
> spin up a separate DG2 instance.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

> 
> Bspec: 66534
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 305 +++++++++++++++-------------
>  1 file changed, 168 insertions(+), 137 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8cf53f54559d..6b38bc2811c1 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1317,143 +1317,170 @@ static const struct intel_forcewake_range __gen12_fw_ranges[] = {
>  		0x1d3f00 - 0x1d3fff: VD2 */
>  };
>  
> -/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
> -static const struct intel_forcewake_range __xehp_fw_ranges[] = {
> -	GEN_FW_RANGE(0x0, 0x1fff, 0), /*
> -		  0x0 -  0xaff: reserved
> -		0xb00 - 0x1fff: always on */
> -	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
> -	GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT),
> -	GEN_FW_RANGE(0x4b00, 0x51ff, 0), /*
> -		0x4b00 - 0x4fff: reserved
> -		0x5000 - 0x51ff: always on */
> -	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
> -	GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT),
> -	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
> -	GEN_FW_RANGE(0x8160, 0x81ff, 0), /*
> -		0x8160 - 0x817f: reserved
> -		0x8180 - 0x81ff: always on */
> -	GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT),
> -	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
> -	GEN_FW_RANGE(0x8500, 0x94cf, FORCEWAKE_GT), /*
> -		0x8500 - 0x87ff: gt
> -		0x8800 - 0x8fff: reserved
> -		0x9000 - 0x947f: gt
> -		0x9480 - 0x94cf: reserved */
> -	GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER),
> -	GEN_FW_RANGE(0x9560, 0x97ff, 0), /*
> -		0x9560 - 0x95ff: always on
> -		0x9600 - 0x97ff: reserved */
> -	GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /*
> -		0x9800 - 0xb4ff: gt
> -		0xb500 - 0xbfff: reserved
> -		0xc000 - 0xcfff: gt */
> -	GEN_FW_RANGE(0xd000, 0xd7ff, 0),
> -	GEN_FW_RANGE(0xd800, 0xdbff, FORCEWAKE_GT),
> -	GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER),
> -	GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /*
> -		0xdd00 - 0xddff: gt
> -		0xde00 - 0xde7f: reserved */
> -	GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /*
> -		0xde80 - 0xdfff: render
> -		0xe000 - 0xe0ff: reserved
> -		0xe100 - 0xe8ff: render */
> -	GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /*
> -		0xe900 - 0xe9ff: gt
> -		0xea00 - 0xefff: reserved
> -		0xf000 - 0xffff: gt */
> -	GEN_FW_RANGE(0x10000, 0x13fff, 0), /*
> -		0x10000 - 0x11fff: reserved
> -		0x12000 - 0x127ff: always on
> -		0x12800 - 0x13fff: reserved */
> -	GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0),
> -	GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2),
> -	GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4),
> -	GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6),
> -	GEN_FW_RANGE(0x14800, 0x1ffff, FORCEWAKE_RENDER), /*
> -		0x14800 - 0x14fff: render
> -		0x15000 - 0x16dff: reserved
> -		0x16e00 - 0x1ffff: render */
> -	GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /*
> -		0x20000 - 0x20fff: VD0
> -		0x21000 - 0x21fff: reserved */
> -	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT),
> -	GEN_FW_RANGE(0x24000, 0x2417f, 0), /*
> -		0x24000 - 0x2407f: always on
> -		0x24080 - 0x2417f: reserved */
> -	GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /*
> -		0x24180 - 0x241ff: gt
> -		0x24200 - 0x249ff: reserved */
> -	GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /*
> -		0x24a00 - 0x24a7f: render
> -		0x24a80 - 0x251ff: reserved */
> -	GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /*
> -		0x25200 - 0x252ff: gt
> -		0x25300 - 0x25fff: reserved */
> -	GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /*
> -		0x26000 - 0x27fff: render
> -		0x28000 - 0x29fff: reserved
> -		0x2a000 - 0x2ffff: undocumented */
> -	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT),
> -	GEN_FW_RANGE(0x40000, 0x1bffff, 0),
> -	GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /*
> -		0x1c0000 - 0x1c2bff: VD0
> -		0x1c2c00 - 0x1c2cff: reserved
> -		0x1c2d00 - 0x1c2dff: VD0
> -		0x1c2e00 - 0x1c3eff: reserved
> -		0x1c3f00 - 0x1c3fff: VD0 */
> -	GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /*
> -		0x1c4000 - 0x1c6bff: VD1
> -		0x1c6c00 - 0x1c6cff: reserved
> -		0x1c6d00 - 0x1c6dff: VD1
> -		0x1c6e00 - 0x1c7fff: reserved */
> -	GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /*
> -		0x1c8000 - 0x1ca0ff: VE0
> -		0x1ca100 - 0x1cbfff: reserved */
> -	GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0),
> -	GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2),
> -	GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4),
> -	GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6),
> -	GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /*
> -		0x1d0000 - 0x1d2bff: VD2
> -		0x1d2c00 - 0x1d2cff: reserved
> -		0x1d2d00 - 0x1d2dff: VD2
> -		0x1d2e00 - 0x1d3eff: reserved
> -		0x1d3f00 - 0x1d3fff: VD2 */
> -	GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /*
> -		0x1d4000 - 0x1d6bff: VD3
> -		0x1d6c00 - 0x1d6cff: reserved
> -		0x1d6d00 - 0x1d6dff: VD3
> -		0x1d6e00 - 0x1d7fff: reserved */
> -	GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /*
> -		0x1d8000 - 0x1da0ff: VE1
> -		0x1da100 - 0x1dffff: reserved */
> -	GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /*
> -		0x1e0000 - 0x1e2bff: VD4
> -		0x1e2c00 - 0x1e2cff: reserved
> -		0x1e2d00 - 0x1e2dff: VD4
> -		0x1e2e00 - 0x1e3eff: reserved
> -		0x1e3f00 - 0x1e3fff: VD4 */
> -	GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /*
> -		0x1e4000 - 0x1e6bff: VD5
> -		0x1e6c00 - 0x1e6cff: reserved
> -		0x1e6d00 - 0x1e6dff: VD5
> -		0x1e6e00 - 0x1e7fff: reserved */
> -	GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /*
> -		0x1e8000 - 0x1ea0ff: VE2
> -		0x1ea100 - 0x1effff: reserved */
> -	GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /*
> -		0x1f0000 - 0x1f2bff: VD6
> -		0x1f2c00 - 0x1f2cff: reserved
> -		0x1f2d00 - 0x1f2dff: VD6
> -		0x1f2e00 - 0x1f3eff: reserved
> -		0x1f3f00 - 0x1f3fff: VD6 */
> -	GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /*
> -		0x1f4000 - 0x1f6bff: VD7
> -		0x1f6c00 - 0x1f6cff: reserved
> -		0x1f6d00 - 0x1f6dff: VD7
> -		0x1f6e00 - 0x1f7fff: reserved */
> +/*
> + * Graphics IP version 12.55 brings a slight change to the 0xd800 range,
> + * switching it from the GT domain to the render domain.
> + *
> + * *Must* be sorted by offset ranges! See intel_fw_table_check().
> + */
> +#define XEHP_FWRANGES(FW_RANGE_D800)					\
> +	GEN_FW_RANGE(0x0, 0x1fff, 0), /*					\
> +		  0x0 -  0xaff: reserved					\
> +		0xb00 - 0x1fff: always on */					\
> +	GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),				\
> +	GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT),				\
> +	GEN_FW_RANGE(0x4b00, 0x51ff, 0), /*					\
> +		0x4b00 - 0x4fff: reserved					\
> +		0x5000 - 0x51ff: always on */					\
> +	GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),				\
> +	GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT),				\
> +	GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),				\
> +	GEN_FW_RANGE(0x8160, 0x81ff, 0), /*					\
> +		0x8160 - 0x817f: reserved					\
> +		0x8180 - 0x81ff: always on */					\
> +	GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT),				\
> +	GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),				\
> +	GEN_FW_RANGE(0x8500, 0x8cff, FORCEWAKE_GT), /*				\
> +		0x8500 - 0x87ff: gt						\
> +		0x8800 - 0x8c7f: reserved					\
> +		0x8c80 - 0x8cff: gt (DG2 only) */				\
> +	GEN_FW_RANGE(0x8d00, 0x8fff, FORCEWAKE_RENDER), /*			\
> +		0x8d00 - 0x8dff: render (DG2 only)				\
> +		0x8e00 - 0x8fff: reserved */					\
> +	GEN_FW_RANGE(0x9000, 0x94cf, FORCEWAKE_GT), /*				\
> +		0x9000 - 0x947f: gt						\
> +		0x9480 - 0x94cf: reserved */					\
> +	GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER),				\
> +	GEN_FW_RANGE(0x9560, 0x967f, 0), /*					\
> +		0x9560 - 0x95ff: always on					\
> +		0x9600 - 0x967f: reserved */					\
> +	GEN_FW_RANGE(0x9680, 0x97ff, FORCEWAKE_RENDER), /*			\
> +		0x9680 - 0x96ff: render (DG2 only)				\
> +		0x9700 - 0x97ff: reserved */					\
> +	GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /*				\
> +		0x9800 - 0xb4ff: gt						\
> +		0xb500 - 0xbfff: reserved					\
> +		0xc000 - 0xcfff: gt */						\
> +	GEN_FW_RANGE(0xd000, 0xd7ff, 0),					\
> +	GEN_FW_RANGE(0xd800, 0xd87f, FW_RANGE_D800),			\
> +	GEN_FW_RANGE(0xd880, 0xdbff, FORCEWAKE_GT),				\
> +	GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER),				\
> +	GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /*				\
> +		0xdd00 - 0xddff: gt						\
> +		0xde00 - 0xde7f: reserved */					\
> +	GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /*			\
> +		0xde80 - 0xdfff: render						\
> +		0xe000 - 0xe0ff: reserved					\
> +		0xe100 - 0xe8ff: render */					\
> +	GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /*				\
> +		0xe900 - 0xe9ff: gt						\
> +		0xea00 - 0xefff: reserved					\
> +		0xf000 - 0xffff: gt */						\
> +	GEN_FW_RANGE(0x10000, 0x12fff, 0), /*					\
> +		0x10000 - 0x11fff: reserved					\
> +		0x12000 - 0x127ff: always on					\
> +		0x12800 - 0x12fff: reserved */					\
> +	GEN_FW_RANGE(0x13000, 0x131ff, FORCEWAKE_MEDIA_VDBOX0), /* DG2 only */	\
> +	GEN_FW_RANGE(0x13200, 0x13fff, FORCEWAKE_MEDIA_VDBOX2), /*		\
> +		0x13200 - 0x133ff: VD2 (DG2 only)				\
> +		0x13400 - 0x13fff: reserved */					\
> +	GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0), /* XEHPSDV only */	\
> +	GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2), /* XEHPSDV only */	\
> +	GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4), /* XEHPSDV only */	\
> +	GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6), /* XEHPSDV only */	\
> +	GEN_FW_RANGE(0x14800, 0x14fff, FORCEWAKE_RENDER),			\
> +	GEN_FW_RANGE(0x15000, 0x16dff, FORCEWAKE_GT), /*			\
> +		0x15000 - 0x15fff: gt (DG2 only)				\
> +		0x16000 - 0x16dff: reserved */					\
> +	GEN_FW_RANGE(0x16e00, 0x1ffff, FORCEWAKE_RENDER),			\
> +	GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /*		\
> +		0x20000 - 0x20fff: VD0 (XEHPSDV only)				\
> +		0x21000 - 0x21fff: reserved */					\
> +	GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT),				\
> +	GEN_FW_RANGE(0x24000, 0x2417f, 0), /*					\
> +		0x24000 - 0x2407f: always on					\
> +		0x24080 - 0x2417f: reserved */					\
> +	GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /*			\
> +		0x24180 - 0x241ff: gt						\
> +		0x24200 - 0x249ff: reserved */					\
> +	GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /*			\
> +		0x24a00 - 0x24a7f: render					\
> +		0x24a80 - 0x251ff: reserved */					\
> +	GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /*			\
> +		0x25200 - 0x252ff: gt						\
> +		0x25300 - 0x25fff: reserved */					\
> +	GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /*			\
> +		0x26000 - 0x27fff: render					\
> +		0x28000 - 0x29fff: reserved					\
> +		0x2a000 - 0x2ffff: undocumented */				\
> +	GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT),				\
> +	GEN_FW_RANGE(0x40000, 0x1bffff, 0),					\
> +	GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /*		\
> +		0x1c0000 - 0x1c2bff: VD0					\
> +		0x1c2c00 - 0x1c2cff: reserved					\
> +		0x1c2d00 - 0x1c2dff: VD0					\
> +		0x1c2e00 - 0x1c3eff: VD0 (DG2 only)				\
> +		0x1c3f00 - 0x1c3fff: VD0 */					\
> +	GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /*		\
> +		0x1c4000 - 0x1c6bff: VD1					\
> +		0x1c6c00 - 0x1c6cff: reserved					\
> +		0x1c6d00 - 0x1c6dff: VD1					\
> +		0x1c6e00 - 0x1c7fff: reserved */				\
> +	GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /*		\
> +		0x1c8000 - 0x1ca0ff: VE0					\
> +		0x1ca100 - 0x1cbfff: reserved */				\
> +	GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0),		\
> +	GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2),		\
> +	GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4),		\
> +	GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6),		\
> +	GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /*		\
> +		0x1d0000 - 0x1d2bff: VD2					\
> +		0x1d2c00 - 0x1d2cff: reserved					\
> +		0x1d2d00 - 0x1d2dff: VD2					\
> +		0x1d2e00 - 0x1d3dff: VD2 (DG2 only)				\
> +		0x1d3e00 - 0x1d3eff: reserved					\
> +		0x1d3f00 - 0x1d3fff: VD2 */					\
> +	GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /*		\
> +		0x1d4000 - 0x1d6bff: VD3					\
> +		0x1d6c00 - 0x1d6cff: reserved					\
> +		0x1d6d00 - 0x1d6dff: VD3					\
> +		0x1d6e00 - 0x1d7fff: reserved */				\
> +	GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /*		\
> +		0x1d8000 - 0x1da0ff: VE1					\
> +		0x1da100 - 0x1dffff: reserved */				\
> +	GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /*		\
> +		0x1e0000 - 0x1e2bff: VD4					\
> +		0x1e2c00 - 0x1e2cff: reserved					\
> +		0x1e2d00 - 0x1e2dff: VD4					\
> +		0x1e2e00 - 0x1e3eff: reserved					\
> +		0x1e3f00 - 0x1e3fff: VD4 */					\
> +	GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /*		\
> +		0x1e4000 - 0x1e6bff: VD5					\
> +		0x1e6c00 - 0x1e6cff: reserved					\
> +		0x1e6d00 - 0x1e6dff: VD5					\
> +		0x1e6e00 - 0x1e7fff: reserved */				\
> +	GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /*		\
> +		0x1e8000 - 0x1ea0ff: VE2					\
> +		0x1ea100 - 0x1effff: reserved */				\
> +	GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /*		\
> +		0x1f0000 - 0x1f2bff: VD6					\
> +		0x1f2c00 - 0x1f2cff: reserved					\
> +		0x1f2d00 - 0x1f2dff: VD6					\
> +		0x1f2e00 - 0x1f3eff: reserved					\
> +		0x1f3f00 - 0x1f3fff: VD6 */					\
> +	GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /*		\
> +		0x1f4000 - 0x1f6bff: VD7					\
> +		0x1f6c00 - 0x1f6cff: reserved					\
> +		0x1f6d00 - 0x1f6dff: VD7					\
> +		0x1f6e00 - 0x1f7fff: reserved */				\
>  	GEN_FW_RANGE(0x1f8000, 0x1fa0ff, FORCEWAKE_MEDIA_VEBOX3),
> +
> +static const struct intel_forcewake_range __xehp_fw_ranges[] = {
> +	XEHP_FWRANGES(FORCEWAKE_GT)
> +};
> +
> +static const struct intel_forcewake_range __dg2_fw_ranges[] = {
> +	XEHP_FWRANGES(FORCEWAKE_RENDER)
>  };
>  
>  static void
> @@ -2084,7 +2111,11 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
>  		return ret;
>  	forcewake_early_sanitize(uncore, 0);
>  
> -	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
> +	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
> +		ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
> +		ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable);
> +		ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
> +	} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
>  		ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
>  		ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable);
>  		ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 08/18] drm/i915/xehp: Changes to ss/eu definitions
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 08/18] drm/i915/xehp: Changes to ss/eu definitions Matt Roper
@ 2021-08-04  0:17   ` Souza, Jose
  0 siblings, 0 replies; 33+ messages in thread
From: Souza, Jose @ 2021-08-04  0:17 UTC (permalink / raw)
  To: Roper, Matthew D, intel-gfx; +Cc: De Marchi, Lucas, Auld, Matthew

On Thu, 2021-07-29 at 09:59 -0700, Matt Roper wrote:
> From: Matthew Auld <matthew.auld@intel.com>
> 
> Xe_HP no longer has "slices" in the same way that old platforms did.
> There are new concepts (gslices, cslices, mslices) that apply in various
> contexts, but for the purposes of fusing slices no longer exist and we
> just have one large pool of dual-subslices (DSS) to work with.
> Furthermore, the meaning of the DSS fuse is inverted compared to past
> platforms --- it now specifies which DSS are enabled rather than which
> ones are disabled.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

> 
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> Signed-off-by: Prasad Nallani <prasad.nallani@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_sseu.c | 24 ++++++++++++++++++++----
>  drivers/gpu/drm/i915/i915_getparam.c |  6 ++++--
>  drivers/gpu/drm/i915/i915_reg.h      |  3 +++
>  3 files changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
> index bbed8e8625e1..5d1b7d06c96b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> @@ -139,17 +139,33 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
>  	 * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS.
>  	 * Instead of splitting these, provide userspace with an array
>  	 * of DSS to more closely represent the hardware resource.
> +	 *
> +	 * In addition, the concept of slice has been removed in Xe_HP.
> +	 * To be compatible with prior generations, assume a single slice
> +	 * across the entire device. Then calculate out the DSS for each
> +	 * workload type within that software slice.
>  	 */
>  	intel_sseu_set_info(sseu, 1, 6, 16);
>  
> -	s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
> -		GEN11_GT_S_ENA_MASK;
> +	/*
> +	 * As mentioned above, Xe_HP does not have the concept of a slice.
> +	 * Enable one for software backwards compatibility.
> +	 */
> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> +		s_en = 0x1;
> +	else
> +		s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
> +		       GEN11_GT_S_ENA_MASK;
>  
>  	dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
>  
>  	/* one bit per pair of EUs */
> -	eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
> -		       GEN11_EU_DIS_MASK);
> +	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> +		eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK;
> +	else
> +		eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
> +			       GEN11_EU_DIS_MASK);
> +
>  	for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
>  		if (eu_en_fuse & BIT(eu))
>  			eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
> diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
> index 24e18219eb50..e289397d9178 100644
> --- a/drivers/gpu/drm/i915/i915_getparam.c
> +++ b/drivers/gpu/drm/i915/i915_getparam.c
> @@ -15,7 +15,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
>  	struct pci_dev *pdev = to_pci_dev(dev->dev);
>  	const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
>  	drm_i915_getparam_t *param = data;
> -	int value;
> +	int value = 0;
>  
>  	switch (param->param) {
>  	case I915_PARAM_IRQ_ACTIVE:
> @@ -150,7 +150,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
>  			return -ENODEV;
>  		break;
>  	case I915_PARAM_SUBSLICE_MASK:
> -		value = sseu->subslice_mask[0];
> +		/* Only copy bits from the first slice */
> +		memcpy(&value, sseu->subslice_mask,
> +		       min(sseu->ss_stride, (u8)sizeof(value)));
>  		if (!value)
>  			return -ENODEV;
>  		break;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 62af453c8c54..99858bc593f0 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3225,6 +3225,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>  
>  #define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
>  
> +#define XEHP_EU_ENABLE			_MMIO(0x9134)
> +#define XEHP_EU_ENA_MASK		0xFF
> +
>  #define GEN6_BSD_SLEEP_PSMI_CONTROL	_MMIO(0x12050)
>  #define   GEN6_BSD_SLEEP_MSG_DISABLE	(1 << 0)
>  #define   GEN6_BSD_SLEEP_FLUSH_DISABLE	(1 << 2)


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options Matt Roper
@ 2021-08-04 19:53   ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2021-08-04 19:53 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

On Thu, Jul 29, 2021 at 09:59:51AM -0700, Matt Roper wrote:
>From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>
>Xe_HP is more modular than its predecessors and as a consequence it has
>more types of replicated registers.  As with l3bank regions on previous
>platforms, we may need to explicitly re-steer accesses to these new
>types of ranges at runtime if we can't find a single default steering
>value that satisfies the fusing of all types.
>
>v2:
> - Add a local 'i915' variable to reduce gt->i915 usage.  (Caz)
> - Drop unused 'intel_gt_read_register' prototype.  (Caz)
>
>Bspec: 66534
>Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>Cc: Caz Yokoyama <caz.yokoyama@intel.com>
>Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/i915/gt/intel_gt.c          | 42 +++++++++-
> drivers/gpu/drm/i915/gt/intel_gt_types.h    |  7 ++
> drivers/gpu/drm/i915/gt/intel_region_lmem.c |  1 +
> drivers/gpu/drm/i915/gt/intel_sseu.c        | 18 +++++
> drivers/gpu/drm/i915/gt/intel_sseu.h        |  6 ++
> drivers/gpu/drm/i915/gt/intel_workarounds.c | 89 +++++++++++++++++++--
> drivers/gpu/drm/i915/i915_drv.h             |  3 +
> drivers/gpu/drm/i915/i915_pci.c             |  1 +
> drivers/gpu/drm/i915/i915_reg.h             |  4 +
> drivers/gpu/drm/i915/intel_device_info.h    |  1 +
> 10 files changed, 166 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
>index a64aa43f7cd9..39b224600f38 100644
>--- a/drivers/gpu/drm/i915/gt/intel_gt.c
>+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
>@@ -89,18 +89,40 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
> 	{},
> };
>
>+static u16 slicemask(struct intel_gt *gt, int count)
>+{
>+	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
>+
>+	return intel_slicemask_from_dssmask(dss_mask, count);
>+}
>+
> int intel_gt_init_mmio(struct intel_gt *gt)
> {
>+	struct drm_i915_private *i915 = gt->i915;
>+
> 	intel_gt_init_clock_frequency(gt);
>
> 	intel_uc_init_mmio(&gt->uc);
> 	intel_sseu_info_init(gt);
>
>-	if (GRAPHICS_VER(gt->i915) >= 11) {
>+	/*
>+	 * An mslice is unavailable only if both the meml3 for the slice is
>+	 * disabled *and* all of the DSS in the slice (quadrant) are disabled.
>+	 */
>+	if (HAS_MSLICES(i915))
>+		gt->info.mslice_mask =
>+			slicemask(gt, GEN_DSS_PER_MSLICE) |
>+			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
>+			 GEN12_MEML3_EN_MASK);
>+
>+	if (GRAPHICS_VER(i915) >= 11 &&
>+		   GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
> 		gt->steering_table[L3BANK] = icl_l3bank_steering_table;
> 		gt->info.l3bank_mask =
> 			~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
> 			GEN10_L3BANK_MASK;
>+	} else if (HAS_MSLICES(i915)) {
>+		MISSING_CASE(INTEL_INFO(i915)->platform);
> 	}
>
> 	return intel_engines_init_mmio(gt);
>@@ -787,6 +809,24 @@ static void intel_gt_get_valid_steering(struct intel_gt *gt,
> 		*sliceid = 0;		/* unused */
> 		*subsliceid = __ffs(gt->info.l3bank_mask);
> 		break;
>+	case MSLICE:
>+		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
>+
>+		*sliceid = __ffs(gt->info.mslice_mask);
>+		*subsliceid = 0;	/* unused */
>+		break;
>+	case LNCF:
>+		GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
>+
>+		/*
>+		 * 0xFDC[29:28] selects the mslice to steer to and 0xFDC[27]


not sure if this clarifies, since this wording does not come from
bspec... bspec rather says (0, 1) -> LNCFS in MSLICE0, (2, 3) -> LNCFS
in MSLICE1, etc. Yes, they are equivalent, but I'd be ok with just the
second part (below) of this comment.... Unless we have this wording
somewhere in bspec I'm not seeing.


>+		 * selects which LNCF within the mslice to steer to.  An LNCF
>+		 * is always present if its mslice is present, so we can safely
>+		 * just steer to LNCF 0 in all cases.
>+		 */
>+		*sliceid = __ffs(gt->info.mslice_mask) << 1;
>+		*subsliceid = 0;	/* unused */
>+		break;
> 	default:
> 		MISSING_CASE(type);
> 		*sliceid = 0;
>diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>index 97a5075288d2..a81e21bf1bd1 100644
>--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
>+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
>@@ -47,9 +47,14 @@ struct intel_mmio_range {
>  * of multicast registers.  If another type of steering does not have any
>  * overlap in valid steering targets with 'subslice' style registers, we will
>  * need to explicitly re-steer reads of registers of the other type.
>+ *
>+ * Only the replication types that may need additional non-default steering
>+ * are listed here.
>  */
> enum intel_steering_type {
> 	L3BANK,
>+	MSLICE,
>+	LNCF,
>
> 	NUM_STEERING_TYPES
> };
>@@ -184,6 +189,8 @@ struct intel_gt {
>
> 		/* Slice/subslice/EU info */
> 		struct sseu_dev_info sseu;
>+
>+		unsigned long mslice_mask;
> 	} info;
> };
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
>index e3a2a2fa5f94..a74b72f50cc9 100644
>--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
>+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
>@@ -10,6 +10,7 @@
> #include "gem/i915_gem_lmem.h"
> #include "gem/i915_gem_region.h"
> #include "gem/i915_gem_ttm.h"
>+#include "gt/intel_gt.h"
>
> static int init_fake_lmem_bar(struct intel_memory_region *mem)
> {
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
>index 367fd44b81c8..bbed8e8625e1 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
>@@ -759,3 +759,21 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
> 		}
> 	}
> }
>+
>+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
>+{
>+	u16 slice_mask = 0;
>+	int i;
>+
>+	WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask));
>+
>+	for (i = 0; dss_mask; i++) {
>+		if (dss_mask & GENMASK(dss_per_slice - 1, 0))
>+			slice_mask |= BIT(i);
>+
>+		dss_mask >>= dss_per_slice;
>+	}
>+
>+	return slice_mask;
>+}
>+
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
>index 4cd1a8a7298a..1073471d1980 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
>@@ -22,6 +22,10 @@ struct drm_printer;
> #define GEN_MAX_EUS		(16) /* TGL upper bound */
> #define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
>
>+#define GEN_DSS_PER_GSLICE	4
>+#define GEN_DSS_PER_CSLICE	8
>+#define GEN_DSS_PER_MSLICE	8
>+
> struct sseu_dev_info {
> 	u8 slice_mask;
> 	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
>@@ -104,4 +108,6 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p);
> void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
> 			       struct drm_printer *p);
>
>+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
>+
> #endif /* __INTEL_SSEU_H__ */
>diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>index 9173df59821a..f2ceabb0e2ea 100644
>--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>@@ -889,12 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
> 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
> }
>
>+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
>+			 unsigned slice, unsigned subslice)
>+{
>+	u32 mcr, mcr_mask;
>+
>+	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
>+	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
>+
>+	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
>+
>+	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
>+}
>+
> static void
> icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
> {
> 	const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
> 	unsigned int slice, subslice;
>-	u32 mcr, mcr_mask;
>
> 	GEM_BUG_ON(GRAPHICS_VER(i915) < 11);
> 	GEM_BUG_ON(hweight8(sseu->slice_mask) > 1);
>@@ -919,12 +931,79 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
> 	if (i915->gt.info.l3bank_mask & BIT(subslice))
> 		i915->gt.steering_table[L3BANK] = NULL;
>
>-	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
>-	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
>+	__add_mcr_wa(i915, wal, slice, subslice);
>+}
>
>-	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
>+__maybe_unused
>+static void
>+xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
>+{
>+	struct drm_i915_private *i915 = gt->i915;
>+	const struct sseu_dev_info *sseu = &gt->info.sseu;
>+	unsigned long slice, subslice = 0, slice_mask = 0;
>+	u64 dss_mask = 0;
>+	u32 lncf_mask = 0;
>+	int i;
>
>-	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
>+	/*
>+	 * On Xe_HP the steering increases in complexity. There are now several
>+	 * more units that require steering and we're not guaranteed to be able
>+	 * to find a common setting for all of them. These are:
>+	 * - GSLICE (fusable)
>+	 * - DSS (sub-unit within gslice; fusable)
>+	 * - L3 Bank (fusable)
>+	 * - MSLICE (fusable)
>+	 * - LNCF (sub-unit within mslice; always present if mslice is present)
>+	 * - SQIDI (always on)
>+	 *
>+	 * We'll do our default/implicit steering based on GSLICE (in the
>+	 * sliceid field) and DSS (in the subsliceid field).  If we can
>+	 * find overlap between the valid MSLICE and/or LNCF values with
>+	 * a suitable GSLICE, then we can just re-use the default value and
>+	 * skip and explicit steering at runtime.
>+	 *
>+	 * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
>+	 * a valid sliceid value.  DSS steering is the only type of steering
>+	 * that utilizes the 'subsliceid' bits.
>+	 *
>+	 * Also note that, even though the steering domain is called "GSlice"
>+	 * and it is encoded in the register using the gslice format, the spec
>+	 * says that the combined (geometry | compute) fuse should be used to
>+	 * select the steering.
>+	 */
>+
>+	/* Find the potential gslice candidates */
>+	dss_mask = intel_sseu_get_subslices(sseu, 0);
>+	slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
>+
>+	/*
>+	 * Find the potential LNCF candidates.  Either LNCF within a valid
>+	 * mslice is fine.
>+	 */
>+	for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
>+		lncf_mask |= (0x3 << (i * 2));
>+
>+	/*
>+	 * Are there any sliceid values that work for both GSLICE and LNCF
>+	 * steering?
>+	 */
>+	if (slice_mask & lncf_mask) {
>+		slice_mask &= lncf_mask;
>+		gt->steering_table[LNCF] = NULL;
>+	}
>+
>+	/* How about sliceid values that also work for MSLICE steering? */
>+	if (slice_mask & gt->info.mslice_mask) {
>+		slice_mask &= gt->info.mslice_mask;
>+		gt->steering_table[MSLICE] = NULL;
>+	}
>+
>+	slice = __ffs(slice_mask);
>+	subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
>+	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
>+	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
>+
>+	__add_mcr_wa(i915, wal, slice, subslice);
> }
>
> static void
>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>index 70bef688c6da..31b8bb43c581 100644
>--- a/drivers/gpu/drm/i915/i915_drv.h
>+++ b/drivers/gpu/drm/i915/i915_drv.h
>@@ -1694,6 +1694,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
> #define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm)
> #define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc)
>
>+#define HAS_MSLICES(dev_priv) \
>+	(INTEL_INFO(dev_priv)->has_mslices)
>+
> #define HAS_IPC(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_ipc)
>
> #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
>diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
>index 08651ca03478..ed9edda727d2 100644
>--- a/drivers/gpu/drm/i915/i915_pci.c
>+++ b/drivers/gpu/drm/i915/i915_pci.c
>@@ -1007,6 +1007,7 @@ static const struct intel_device_info adl_p_info = {
> 	.has_llc = 1, \
> 	.has_logical_ring_contexts = 1, \
> 	.has_logical_ring_elsq = 1, \
>+	.has_mslices = 1, \
> 	.has_rc6 = 1, \
> 	.has_reset_engine = 1, \
> 	.has_rps = 1, \
>diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>index 1a164b90e9fc..f4113e7e8271 100644
>--- a/drivers/gpu/drm/i915/i915_reg.h
>+++ b/drivers/gpu/drm/i915/i915_reg.h
>@@ -2766,6 +2766,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> #define   GEN11_MCR_SLICE_MASK		GEN11_MCR_SLICE(0xf)
> #define   GEN11_MCR_SUBSLICE(subslice)	(((subslice) & 0x7) << 24)
> #define   GEN11_MCR_SUBSLICE_MASK	GEN11_MCR_SUBSLICE(0x7)
>+#define   GEN11_MCR_MULTICAST		REG_BIT(31)


this is not used anywhere so we could just omit it. If we are adding to
have the complete definition for the register, this needs to be before
GEN11_MCR_SLICE() to adhere to the rules of definiting the most
significant bit first.

other than that, patch looks fine to me. It took me some time to review
and understand the table, the steering rules, etc. But AFAICS this seems
to follow them


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>


Lucas De Marchi


> #define RING_IPEIR(base)	_MMIO((base) + 0x64)
> #define RING_IPEHR(base)	_MMIO((base) + 0x68)
> #define RING_EIR(base)		_MMIO((base) + 0xb0)
>@@ -3184,6 +3185,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> #define	GEN10_MIRROR_FUSE3		_MMIO(0x9118)
> #define GEN10_L3BANK_PAIR_COUNT     4
> #define GEN10_L3BANK_MASK   0x0F
>+/* on Xe_HP the same fuses indicates mslices instead of L3 banks */
>+#define GEN12_MAX_MSLICES 4
>+#define GEN12_MEML3_EN_MASK 0x0F
>
> #define GEN8_EU_DISABLE0		_MMIO(0x9134)
> #define   GEN8_EU_DIS0_S0_MASK		0xffffff
>diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
>index 121d6d9afd3a..2177372f9ac3 100644
>--- a/drivers/gpu/drm/i915/intel_device_info.h
>+++ b/drivers/gpu/drm/i915/intel_device_info.h
>@@ -133,6 +133,7 @@ enum intel_ppgtt_type {
> 	func(has_llc); \
> 	func(has_logical_ring_contexts); \
> 	func(has_logical_ring_elsq); \
>+	func(has_mslices); \
> 	func(has_pooled_eu); \
> 	func(has_rc6); \
> 	func(has_rc6p); \
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 02/18] drm/i915/xehpsdv: Define steering tables
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 02/18] drm/i915/xehpsdv: Define steering tables Matt Roper
@ 2021-08-04 20:13   ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2021-08-04 20:13 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

On Thu, Jul 29, 2021 at 09:59:52AM -0700, Matt Roper wrote:
>Define and initialize the MMIO ranges for which XeHP SDV requires MSLICE
>and LNCF steering.
>
>Bspec: 66534
>Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>---
> drivers/gpu/drm/i915/gt/intel_gt.c          | 19 ++++++++++++++++++-
> drivers/gpu/drm/i915/gt/intel_workarounds.c | 11 +++++++++--
> 2 files changed, 27 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
>index 39b224600f38..8e13bfc81634 100644
>--- a/drivers/gpu/drm/i915/gt/intel_gt.c
>+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
>@@ -89,6 +89,20 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = {
> 	{},
> };
>
>+static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
>+	{ 0x004000, 0x004AFF },
>+	{ 0x00C800, 0x00CFFF },
>+	{ 0x00DD00, 0x00DDFF },
>+	{ 0x00E900, 0x00FFFF }, /* 0xEA00 - OxEFFF is unused */
>+	{},
>+};
>+
>+static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
>+	{ 0x00B000, 0x00B0FF },
>+	{ 0x00D800, 0x00D8FF },
>+	{},
>+};
>+
> static u16 slicemask(struct intel_gt *gt, int count)
> {
> 	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
>@@ -115,7 +129,10 @@ int intel_gt_init_mmio(struct intel_gt *gt)
> 			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
> 			 GEN12_MEML3_EN_MASK);
>
>-	if (GRAPHICS_VER(i915) >= 11 &&
>+	if (IS_XEHPSDV(i915)) {
>+		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
>+		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
>+	} else if (GRAPHICS_VER(i915) >= 11 &&
> 		   GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
> 		gt->steering_table[L3BANK] = icl_l3bank_steering_table;
> 		gt->info.l3bank_mask =
>diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>index f2ceabb0e2ea..8717337a6c81 100644
>--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>@@ -934,7 +934,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
> 	__add_mcr_wa(i915, wal, slice, subslice);
> }
>
>-__maybe_unused
> static void
> xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> {
>@@ -1136,10 +1135,18 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
> 			    VSUNIT_CLKGATE_DIS_TGL);
> }
>
>+static void
>+xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
>+{
>+	xehp_init_mcr(&i915->gt, wal);
>+}
>+
> static void
> gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
> {
>-	if (IS_DG1(i915))
>+	if (IS_XEHPSDV(i915))
>+		xehpsdv_gt_workarounds_init(i915, wal);
>+	else if (IS_DG1(i915))
> 		dg1_gt_workarounds_init(i915, wal);
> 	else if (IS_TIGERLAKE(i915))
> 		tgl_gt_workarounds_init(i915, wal);
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 04/18] drm/i915/dg2: Update LNCF steering ranges
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 04/18] drm/i915/dg2: Update LNCF steering ranges Matt Roper
@ 2021-08-04 20:16   ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2021-08-04 20:16 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

On Thu, Jul 29, 2021 at 09:59:54AM -0700, Matt Roper wrote:
>DG2's replicated register ranges are almost the same at XeHP SDV with
>the exception of one LNCF sub-range that switches to gslice steering.
>We can re-use the XeHP SDV mslice steering table and just provide a
>DG2-specific LNCF steering table.
>
>Bspec: 66534
>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/i915/gt/intel_gt.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
>index 8e13bfc81634..1971e34da254 100644
>--- a/drivers/gpu/drm/i915/gt/intel_gt.c
>+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
>@@ -103,6 +103,12 @@ static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
> 	{},
> };
>
>+static const struct intel_mmio_range dg2_lncf_steering_table[] = {
>+	{ 0x00B000, 0x00B0FF },
>+	{ 0x00D880, 0x00D8FF },
>+	{},
>+};
>+
> static u16 slicemask(struct intel_gt *gt, int count)
> {
> 	u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
>@@ -129,7 +135,10 @@ int intel_gt_init_mmio(struct intel_gt *gt)
> 			(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
> 			 GEN12_MEML3_EN_MASK);
>
>-	if (IS_XEHPSDV(i915)) {
>+	if (IS_DG2(i915)) {
>+		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;


a title "Update steering tables" would be more appropriate as this is
also setting the mslice table, even if just re-using from xehpsdv.


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>


Lucas De Marchi

>+		gt->steering_table[LNCF] = dg2_lncf_steering_table;
>+	} else if (IS_XEHPSDV(i915)) {
> 		gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
> 		gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
> 	} else if (GRAPHICS_VER(i915) >= 11 &&
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering Matt Roper
@ 2021-08-04 20:22   ` Lucas De Marchi
  2021-08-05 15:11     ` Matt Roper
  0 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2021-08-04 20:22 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

On Thu, Jul 29, 2021 at 09:59:55AM -0700, Matt Roper wrote:
>Although DG2_G10 platforms will always have all SQIDI's present and
>don't need steering for registers in a SQIDI MMIO range, this isn't true
>for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.
>
>We handle SQIDI ranges a bit differently from other types of explicit
>steering.  The SQIDI ranges belong to either the MCFG unit or the SF
>unit, both of which have their own dedicated steering registers and do
>not use the typical 0xFDC steering control that all other types of
>ranges use.  Thus we only need to worry about picking a valid initial
>value for the MCFG and SF steering registers (0xFD0 and 0xFD8
>resepectively) at driver init; they won't change after we set them up so

respectively

>we don't need to worry about re-steering them explicitly at runtime.
>
>Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV,
>while only values of 2 and 3 are valid for DG2-G11, we'll just
>initialize the MCFG and SF steering registers to a constant value of "2"
>for all XeHP-based platforms for simplicity --- that will work in all
>cases.
>
>Bspec: 66534
>Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +++++++++++++++++----
> drivers/gpu/drm/i915/i915_reg.h             |  2 ++
> 2 files changed, 25 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>index 8717337a6c81..6895b083523d 100644
>--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>@@ -889,17 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
> 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
> }
>
>-static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
>-			 unsigned slice, unsigned subslice)
>+static void __set_mcr_steering(struct i915_wa_list *wal,
>+			       i915_reg_t steering_reg,
>+			       unsigned int slice, unsigned int subslice)
> {
> 	u32 mcr, mcr_mask;
>
> 	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
> 	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
>
>-	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
>+	wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
>+}
>+
>+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
>+			 unsigned int slice, unsigned int subslice)
>+{
>+	drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);

maybe we could leave the debug message in __set_mcr_steering() and add
what steering register we are setting? Up to you.


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>


>
>-	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
>+	__set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
> }
>
> static void
>@@ -953,7 +960,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> 	 * - L3 Bank (fusable)
> 	 * - MSLICE (fusable)
> 	 * - LNCF (sub-unit within mslice; always present if mslice is present)
>-	 * - SQIDI (always on)
> 	 *
> 	 * We'll do our default/implicit steering based on GSLICE (in the
> 	 * sliceid field) and DSS (in the subsliceid field).  If we can
>@@ -1003,6 +1009,18 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> 	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
>
> 	__add_mcr_wa(i915, wal, slice, subslice);
>+
>+	/*
>+	 * SQIDI ranges are special because they use different steering
>+	 * registers than everything else we work with.  On XeHP SDV and
>+	 * DG2-G10, any value in the steering registers will work fine since
>+	 * all instances are present, but DG2-G11 only has SQIDI instances at
>+	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
>+	 * we'll just steer to a hardcoded "2" since that value will work
>+	 * everywhere.
>+	 */
>+	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
>+	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> }
>
> static void
>diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>index f4113e7e8271..39ce6befff52 100644
>--- a/drivers/gpu/drm/i915/i915_reg.h
>+++ b/drivers/gpu/drm/i915/i915_reg.h
>@@ -2757,6 +2757,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> #define GEN12_SC_INSTDONE_EXTRA2	_MMIO(0x7108)
> #define GEN7_SAMPLER_INSTDONE	_MMIO(0xe160)
> #define GEN7_ROW_INSTDONE	_MMIO(0xe164)
>+#define MCFG_MCR_SELECTOR		_MMIO(0xfd0)
>+#define SF_MCR_SELECTOR			_MMIO(0xfd8)
> #define GEN8_MCR_SELECTOR		_MMIO(0xfdc)
> #define   GEN8_MCR_SLICE(slice)		(((slice) & 3) << 26)
> #define   GEN8_MCR_SLICE_MASK		GEN8_MCR_SLICE(3)
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 09/18] drm/i915/xehpsdv: Add maximum sseu limits
  2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 09/18] drm/i915/xehpsdv: Add maximum sseu limits Matt Roper
@ 2021-08-04 20:26   ` Lucas De Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Lucas De Marchi @ 2021-08-04 20:26 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx

On Thu, Jul 29, 2021 at 09:59:59AM -0700, Matt Roper wrote:
>Due to the removal of legacy slices and the transition to a
>gslice/cslice/mslice/etc. design, we'll internally store all DSS under
>"slice0."
>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>Reviewed-by: Caz Yokoyama <caz.yokoyama@intel.com>
>---
> drivers/gpu/drm/i915/gt/intel_sseu.c         | 5 ++++-
> drivers/gpu/drm/i915/gt/intel_sseu.h         | 2 +-
> drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 2 +-
> 3 files changed, 6 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
>index 5d1b7d06c96b..16c0552fcd1d 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
>@@ -145,7 +145,10 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> 	 * across the entire device. Then calculate out the DSS for each
> 	 * workload type within that software slice.
> 	 */
>-	intel_sseu_set_info(sseu, 1, 6, 16);
>+	if (IS_XEHPSDV(gt->i915))
>+		intel_sseu_set_info(sseu, 1, 32, 16);
>+	else
>+		intel_sseu_set_info(sseu, 1, 6, 16);
>
> 	/*
> 	 * As mentioned above, Xe_HP does not have the concept of a slice.
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
>index 74487650b08f..204ea6709460 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
>@@ -16,7 +16,7 @@ struct intel_gt;
> struct drm_printer;
>
> #define GEN_MAX_SLICES		(6) /* CNL upper bound */

needs a rebase :-/


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Lucas De Marchi

>-#define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
>+#define GEN_MAX_SUBSLICES	(32) /* XEHPSDV upper bound */
> #define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
> #define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
> #define GEN_MAX_EUS		(16) /* TGL upper bound */
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
>index 714fe8495775..a424150b052e 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
>@@ -53,7 +53,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
> static void gen10_sseu_device_status(struct intel_gt *gt,
> 				     struct sseu_dev_info *sseu)
> {
>-#define SS_MAX 6
>+#define SS_MAX 8
> 	struct intel_uncore *uncore = gt->uncore;
> 	const struct intel_gt_info *info = &gt->info;
> 	u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type
  2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type Matt Roper
@ 2021-08-04 20:36   ` Lucas De Marchi
  2021-08-04 21:00     ` Matt Roper
  0 siblings, 1 reply; 33+ messages in thread
From: Lucas De Marchi @ 2021-08-04 20:36 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, Stuart Summers, matthew.s.atwood

On Thu, Jul 29, 2021 at 10:00:00AM -0700, Matt Roper wrote:
>From: Stuart Summers <stuart.summers@intel.com>
>
>Starting in XeHP, the concept of slice has been removed in favor of
>DSS (Dual-Subslice) masks for various workload types. These workloads have
>been divided into those enabled for geometry and those enabled for compute.
>
>i915 currently maintains a single set of S/SS/EU masks for the device.
>The goal of this patch set is to minimize the amount of impact to prior
>generations while still giving the user maximum flexibility.
>
>Bspec: 33117, 33118, 20376
>Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>Cc: Matt Roper <matthew.d.roper@intel.com>
>Signed-off-by: Stuart Summers <stuart.summers@intel.com>
>Signed-off-by: Steve Hampson <steven.t.hampson@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/i915/gt/intel_sseu.c | 73 ++++++++++++++++++++--------
> drivers/gpu/drm/i915/gt/intel_sseu.h |  5 +-
> drivers/gpu/drm/i915/i915_reg.h      |  3 +-
> include/uapi/drm/i915_drm.h          |  3 --
> 4 files changed, 59 insertions(+), 25 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
>index 16c0552fcd1d..5d3b8dff464c 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
>@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
> }
>
> void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
>-			      u32 ss_mask)
>+			      u8 *subslice_mask, u32 ss_mask)
> {
> 	int offset = slice * sseu->ss_stride;
>
>-	memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
>+	memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
> }
>
> unsigned int
>@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu)
> 	return total;
> }
>
>-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
>-				    u8 s_en, u32 ss_en, u16 eu_en)
>+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
>+{
>+	u32 ss_mask;
>+
>+	ss_mask = ss_en >> (s * sseu->max_subslices);
>+	ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
>+
>+	return ss_mask;
>+}
>+
>+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
>+				    u32 g_ss_en, u32 c_ss_en, u16 eu_en)
> {
> 	int s, ss;
>
>-	/* ss_en represents entire subslice mask across all slices */
>+	/* g_ss_en/c_ss_en represent entire subslice mask across all slices */
> 	GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
>-		   sizeof(ss_en) * BITS_PER_BYTE);
>+		   sizeof(g_ss_en) * BITS_PER_BYTE);
>
> 	for (s = 0; s < sseu->max_slices; s++) {
> 		if ((s_en & BIT(s)) == 0)
>@@ -115,7 +125,23 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
>
> 		sseu->slice_mask |= BIT(s);
>
>-		intel_sseu_set_subslices(sseu, s, ss_en);
>+		/*
>+		 * XeHP introduces the concept of compute vs
>+		 * geometry DSS. To reduce variation between GENs
>+		 * around subslice usage, store a mask for both the
>+		 * geometry and compute enabled masks, to provide
>+		 * to user space later in QUERY_TOPOLOGY_INFO, and


this is not true anymore... I think when this was written the idea was
to use QUERY_TOPOLOGY_INFO's  flags field to differentiate between
compute/geometry. However looking at next patches it seems this is not
there anymore: we still check for

         if (query_item->flags != 0)
                 return -EINVAL;

Cc'ing Stuart and Matt Atwood


Also, it would be good to have the patches adding the query as the very
next patch.

Lucas De Marchi

>+		 * compute a total enabled subslice count for the
>+		 * purposes of selecting subslices to use in a
>+		 * particular GEM context.
>+		 */
>+		intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
>+					 get_ss_stride_mask(sseu, s, c_ss_en));
>+		intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
>+					 get_ss_stride_mask(sseu, s, g_ss_en));
>+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
>+					 get_ss_stride_mask(sseu, s,
>+							    g_ss_en | c_ss_en));
>
> 		for (ss = 0; ss < sseu->max_subslices; ss++)
> 			if (intel_sseu_has_subslice(sseu, s, ss))
>@@ -129,7 +155,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> {
> 	struct sseu_dev_info *sseu = &gt->info.sseu;
> 	struct intel_uncore *uncore = gt->uncore;
>-	u32 dss_en;
>+	u32 g_dss_en, c_dss_en = 0;
> 	u16 eu_en = 0;
> 	u8 eu_en_fuse;
> 	u8 s_en;
>@@ -145,10 +171,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> 	 * across the entire device. Then calculate out the DSS for each
> 	 * workload type within that software slice.
> 	 */
>-	if (IS_XEHPSDV(gt->i915))
>+	if (IS_XEHPSDV(gt->i915)) {
> 		intel_sseu_set_info(sseu, 1, 32, 16);
>-	else
>+		sseu->has_compute_dss = 1;
>+	} else {
> 		intel_sseu_set_info(sseu, 1, 6, 16);
>+	}
>
> 	/*
> 	 * As mentioned above, Xe_HP does not have the concept of a slice.
>@@ -160,7 +188,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> 		s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
> 		       GEN11_GT_S_ENA_MASK;
>
>-	dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
>+	g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE);
>+	if (sseu->has_compute_dss)
>+		c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE);
>
> 	/* one bit per pair of EUs */
> 	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
>@@ -173,7 +203,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> 		if (eu_en_fuse & BIT(eu))
> 			eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
>
>-	gen11_compute_sseu_info(sseu, s_en, dss_en, eu_en);
>+	gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en);
>
> 	/* TGL only supports slice-level power gating */
> 	sseu->has_slice_pg = 1;
>@@ -199,7 +229,7 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
> 	eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
> 		  GEN11_EU_DIS_MASK);
>
>-	gen11_compute_sseu_info(sseu, s_en, ss_en, eu_en);
>+	gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en);
>
> 	/* ICL has no power gating restrictions. */
> 	sseu->has_slice_pg = 1;
>@@ -260,9 +290,9 @@ static void gen10_sseu_info_init(struct intel_gt *gt)
> 		 * Slice0 can have up to 3 subslices, but there are only 2 in
> 		 * slice1/2.
> 		 */
>-		intel_sseu_set_subslices(sseu, s, s == 0 ?
>-					 subslice_mask_with_eus :
>-					 subslice_mask_with_eus & 0x3);
>+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
>+					 s == 0 ? subslice_mask_with_eus :
>+						  subslice_mask_with_eus & 0x3);
> 	}
>
> 	sseu->eu_total = compute_eu_total(sseu);
>@@ -317,7 +347,7 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
> 		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> 	}
>
>-	intel_sseu_set_subslices(sseu, 0, subslice_mask);
>+	intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask);
>
> 	sseu->eu_total = compute_eu_total(sseu);
>
>@@ -373,7 +403,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
> 			/* skip disabled slice */
> 			continue;
>
>-		intel_sseu_set_subslices(sseu, s, subslice_mask);
>+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
>+					 subslice_mask);
>
> 		eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s));
> 		for (ss = 0; ss < sseu->max_subslices; ss++) {
>@@ -485,7 +516,8 @@ static void bdw_sseu_info_init(struct intel_gt *gt)
> 			/* skip disabled slice */
> 			continue;
>
>-		intel_sseu_set_subslices(sseu, s, subslice_mask);
>+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
>+					 subslice_mask);
>
> 		for (ss = 0; ss < sseu->max_subslices; ss++) {
> 			u8 eu_disabled_mask;
>@@ -583,7 +615,8 @@ static void hsw_sseu_info_init(struct intel_gt *gt)
> 			    sseu->eu_per_subslice);
>
> 	for (s = 0; s < sseu->max_slices; s++) {
>-		intel_sseu_set_subslices(sseu, s, subslice_mask);
>+		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
>+					 subslice_mask);
>
> 		for (ss = 0; ss < sseu->max_subslices; ss++) {
> 			sseu_set_eus(sseu, s, ss,
>diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
>index 204ea6709460..b383e7d97554 100644
>--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
>+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
>@@ -32,6 +32,8 @@ struct drm_printer;
> struct sseu_dev_info {
> 	u8 slice_mask;
> 	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
>+	u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
>+	u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
> 	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE];
> 	u16 eu_total;
> 	u8 eu_per_subslice;
>@@ -41,6 +43,7 @@ struct sseu_dev_info {
> 	u8 has_slice_pg:1;
> 	u8 has_subslice_pg:1;
> 	u8 has_eu_pg:1;
>+	u8 has_compute_dss:1;
>
> 	/* Topology fields */
> 	u8 max_slices;
>@@ -104,7 +107,7 @@ intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
> u32  intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
>
> void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
>-			      u32 ss_mask);
>+			      u8 *subslice_mask, u32 ss_mask);
>
> void intel_sseu_info_init(struct intel_gt *gt);
>
>diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>index 99858bc593f0..d7e4418955f7 100644
>--- a/drivers/gpu/drm/i915/i915_reg.h
>+++ b/drivers/gpu/drm/i915/i915_reg.h
>@@ -3223,7 +3223,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>
> #define GEN11_GT_SUBSLICE_DISABLE _MMIO(0x913C)
>
>-#define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
>+#define GEN12_GT_GEOMETRY_DSS_ENABLE _MMIO(0x913C)
>+#define GEN12_GT_COMPUTE_DSS_ENABLE _MMIO(0x9144)
>
> #define XEHP_EU_ENABLE			_MMIO(0x9134)
> #define XEHP_EU_ENA_MASK		0xFF
>diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>index 7f13d241417f..aef15542a95b 100644
>--- a/include/uapi/drm/i915_drm.h
>+++ b/include/uapi/drm/i915_drm.h
>@@ -2589,9 +2589,6 @@ struct drm_i915_query {
>  *                 Z / 8] >> (Z % 8)) & 1
>  */
> struct drm_i915_query_topology_info {
>-	/*
>-	 * Unused for now. Must be cleared to zero.
>-	 */
> 	__u16 flags;
>
> 	__u16 max_slices;
>-- 
>2.25.4
>
>_______________________________________________
>Intel-gfx mailing list
>Intel-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type
  2021-08-04 20:36   ` Lucas De Marchi
@ 2021-08-04 21:00     ` Matt Roper
  0 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-08-04 21:00 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: intel-gfx, Stuart Summers, matthew.s.atwood

On Wed, Aug 04, 2021 at 01:36:37PM -0700, Lucas De Marchi wrote:
> On Thu, Jul 29, 2021 at 10:00:00AM -0700, Matt Roper wrote:
> > From: Stuart Summers <stuart.summers@intel.com>
> > 
> > Starting in XeHP, the concept of slice has been removed in favor of
> > DSS (Dual-Subslice) masks for various workload types. These workloads have
> > been divided into those enabled for geometry and those enabled for compute.
> > 
> > i915 currently maintains a single set of S/SS/EU masks for the device.
> > The goal of this patch set is to minimize the amount of impact to prior
> > generations while still giving the user maximum flexibility.
> > 
> > Bspec: 33117, 33118, 20376
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > Signed-off-by: Steve Hampson <steven.t.hampson@intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> > drivers/gpu/drm/i915/gt/intel_sseu.c | 73 ++++++++++++++++++++--------
> > drivers/gpu/drm/i915/gt/intel_sseu.h |  5 +-
> > drivers/gpu/drm/i915/i915_reg.h      |  3 +-
> > include/uapi/drm/i915_drm.h          |  3 --
> > 4 files changed, 59 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > index 16c0552fcd1d..5d3b8dff464c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
> > @@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
> > }
> > 
> > void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
> > -			      u32 ss_mask)
> > +			      u8 *subslice_mask, u32 ss_mask)
> > {
> > 	int offset = slice * sseu->ss_stride;
> > 
> > -	memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
> > +	memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
> > }
> > 
> > unsigned int
> > @@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu)
> > 	return total;
> > }
> > 
> > -static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
> > -				    u8 s_en, u32 ss_en, u16 eu_en)
> > +static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
> > +{
> > +	u32 ss_mask;
> > +
> > +	ss_mask = ss_en >> (s * sseu->max_subslices);
> > +	ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
> > +
> > +	return ss_mask;
> > +}
> > +
> > +static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
> > +				    u32 g_ss_en, u32 c_ss_en, u16 eu_en)
> > {
> > 	int s, ss;
> > 
> > -	/* ss_en represents entire subslice mask across all slices */
> > +	/* g_ss_en/c_ss_en represent entire subslice mask across all slices */
> > 	GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
> > -		   sizeof(ss_en) * BITS_PER_BYTE);
> > +		   sizeof(g_ss_en) * BITS_PER_BYTE);
> > 
> > 	for (s = 0; s < sseu->max_slices; s++) {
> > 		if ((s_en & BIT(s)) == 0)
> > @@ -115,7 +125,23 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
> > 
> > 		sseu->slice_mask |= BIT(s);
> > 
> > -		intel_sseu_set_subslices(sseu, s, ss_en);
> > +		/*
> > +		 * XeHP introduces the concept of compute vs
> > +		 * geometry DSS. To reduce variation between GENs
> > +		 * around subslice usage, store a mask for both the
> > +		 * geometry and compute enabled masks, to provide
> > +		 * to user space later in QUERY_TOPOLOGY_INFO, and
> 
> 
> this is not true anymore... I think when this was written the idea was
> to use QUERY_TOPOLOGY_INFO's  flags field to differentiate between
> compute/geometry. However looking at next patches it seems this is not
> there anymore: we still check for
> 
>         if (query_item->flags != 0)
>                 return -EINVAL;
> 
> Cc'ing Stuart and Matt Atwood
> 
> 
> Also, it would be good to have the patches adding the query as the very
> next patch.

I held off on including any of the UAPI changes in this series since
those will require UMD availability too.  I know the OCL driver has
started landing some support for XeHP SDV, but from a quick skim I
didn't see the uapi-related code in what they've made available so far.


Matt

> 
> Lucas De Marchi
> 
> > +		 * compute a total enabled subslice count for the
> > +		 * purposes of selecting subslices to use in a
> > +		 * particular GEM context.
> > +		 */
> > +		intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
> > +					 get_ss_stride_mask(sseu, s, c_ss_en));
> > +		intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
> > +					 get_ss_stride_mask(sseu, s, g_ss_en));
> > +		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
> > +					 get_ss_stride_mask(sseu, s,
> > +							    g_ss_en | c_ss_en));
> > 
> > 		for (ss = 0; ss < sseu->max_subslices; ss++)
> > 			if (intel_sseu_has_subslice(sseu, s, ss))
> > @@ -129,7 +155,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> > {
> > 	struct sseu_dev_info *sseu = &gt->info.sseu;
> > 	struct intel_uncore *uncore = gt->uncore;
> > -	u32 dss_en;
> > +	u32 g_dss_en, c_dss_en = 0;
> > 	u16 eu_en = 0;
> > 	u8 eu_en_fuse;
> > 	u8 s_en;
> > @@ -145,10 +171,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> > 	 * across the entire device. Then calculate out the DSS for each
> > 	 * workload type within that software slice.
> > 	 */
> > -	if (IS_XEHPSDV(gt->i915))
> > +	if (IS_XEHPSDV(gt->i915)) {
> > 		intel_sseu_set_info(sseu, 1, 32, 16);
> > -	else
> > +		sseu->has_compute_dss = 1;
> > +	} else {
> > 		intel_sseu_set_info(sseu, 1, 6, 16);
> > +	}
> > 
> > 	/*
> > 	 * As mentioned above, Xe_HP does not have the concept of a slice.
> > @@ -160,7 +188,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> > 		s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
> > 		       GEN11_GT_S_ENA_MASK;
> > 
> > -	dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
> > +	g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE);
> > +	if (sseu->has_compute_dss)
> > +		c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE);
> > 
> > 	/* one bit per pair of EUs */
> > 	if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
> > @@ -173,7 +203,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
> > 		if (eu_en_fuse & BIT(eu))
> > 			eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
> > 
> > -	gen11_compute_sseu_info(sseu, s_en, dss_en, eu_en);
> > +	gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en);
> > 
> > 	/* TGL only supports slice-level power gating */
> > 	sseu->has_slice_pg = 1;
> > @@ -199,7 +229,7 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
> > 	eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
> > 		  GEN11_EU_DIS_MASK);
> > 
> > -	gen11_compute_sseu_info(sseu, s_en, ss_en, eu_en);
> > +	gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en);
> > 
> > 	/* ICL has no power gating restrictions. */
> > 	sseu->has_slice_pg = 1;
> > @@ -260,9 +290,9 @@ static void gen10_sseu_info_init(struct intel_gt *gt)
> > 		 * Slice0 can have up to 3 subslices, but there are only 2 in
> > 		 * slice1/2.
> > 		 */
> > -		intel_sseu_set_subslices(sseu, s, s == 0 ?
> > -					 subslice_mask_with_eus :
> > -					 subslice_mask_with_eus & 0x3);
> > +		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
> > +					 s == 0 ? subslice_mask_with_eus :
> > +						  subslice_mask_with_eus & 0x3);
> > 	}
> > 
> > 	sseu->eu_total = compute_eu_total(sseu);
> > @@ -317,7 +347,7 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
> > 		sseu_set_eus(sseu, 0, 1, ~disabled_mask);
> > 	}
> > 
> > -	intel_sseu_set_subslices(sseu, 0, subslice_mask);
> > +	intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask);
> > 
> > 	sseu->eu_total = compute_eu_total(sseu);
> > 
> > @@ -373,7 +403,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
> > 			/* skip disabled slice */
> > 			continue;
> > 
> > -		intel_sseu_set_subslices(sseu, s, subslice_mask);
> > +		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
> > +					 subslice_mask);
> > 
> > 		eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s));
> > 		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > @@ -485,7 +516,8 @@ static void bdw_sseu_info_init(struct intel_gt *gt)
> > 			/* skip disabled slice */
> > 			continue;
> > 
> > -		intel_sseu_set_subslices(sseu, s, subslice_mask);
> > +		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
> > +					 subslice_mask);
> > 
> > 		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > 			u8 eu_disabled_mask;
> > @@ -583,7 +615,8 @@ static void hsw_sseu_info_init(struct intel_gt *gt)
> > 			    sseu->eu_per_subslice);
> > 
> > 	for (s = 0; s < sseu->max_slices; s++) {
> > -		intel_sseu_set_subslices(sseu, s, subslice_mask);
> > +		intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
> > +					 subslice_mask);
> > 
> > 		for (ss = 0; ss < sseu->max_subslices; ss++) {
> > 			sseu_set_eus(sseu, s, ss,
> > diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > index 204ea6709460..b383e7d97554 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_sseu.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
> > @@ -32,6 +32,8 @@ struct drm_printer;
> > struct sseu_dev_info {
> > 	u8 slice_mask;
> > 	u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
> > +	u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
> > +	u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
> > 	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE];
> > 	u16 eu_total;
> > 	u8 eu_per_subslice;
> > @@ -41,6 +43,7 @@ struct sseu_dev_info {
> > 	u8 has_slice_pg:1;
> > 	u8 has_subslice_pg:1;
> > 	u8 has_eu_pg:1;
> > +	u8 has_compute_dss:1;
> > 
> > 	/* Topology fields */
> > 	u8 max_slices;
> > @@ -104,7 +107,7 @@ intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
> > u32  intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
> > 
> > void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
> > -			      u32 ss_mask);
> > +			      u8 *subslice_mask, u32 ss_mask);
> > 
> > void intel_sseu_info_init(struct intel_gt *gt);
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 99858bc593f0..d7e4418955f7 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -3223,7 +3223,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> > 
> > #define GEN11_GT_SUBSLICE_DISABLE _MMIO(0x913C)
> > 
> > -#define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
> > +#define GEN12_GT_GEOMETRY_DSS_ENABLE _MMIO(0x913C)
> > +#define GEN12_GT_COMPUTE_DSS_ENABLE _MMIO(0x9144)
> > 
> > #define XEHP_EU_ENABLE			_MMIO(0x9134)
> > #define XEHP_EU_ENA_MASK		0xFF
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 7f13d241417f..aef15542a95b 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -2589,9 +2589,6 @@ struct drm_i915_query {
> >  *                 Z / 8] >> (Z % 8)) & 1
> >  */
> > struct drm_i915_query_topology_info {
> > -	/*
> > -	 * Unused for now. Must be cleared to zero.
> > -	 */
> > 	__u16 flags;
> > 
> > 	__u16 max_slices;
> > -- 
> > 2.25.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering
  2021-08-04 20:22   ` Lucas De Marchi
@ 2021-08-05 15:11     ` Matt Roper
  0 siblings, 0 replies; 33+ messages in thread
From: Matt Roper @ 2021-08-05 15:11 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: intel-gfx

On Wed, Aug 04, 2021 at 01:22:17PM -0700, Lucas De Marchi wrote:
> On Thu, Jul 29, 2021 at 09:59:55AM -0700, Matt Roper wrote:
> > Although DG2_G10 platforms will always have all SQIDI's present and
> > don't need steering for registers in a SQIDI MMIO range, this isn't true
> > for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.
> > 
> > We handle SQIDI ranges a bit differently from other types of explicit
> > steering.  The SQIDI ranges belong to either the MCFG unit or the SF
> > unit, both of which have their own dedicated steering registers and do
> > not use the typical 0xFDC steering control that all other types of
> > ranges use.  Thus we only need to worry about picking a valid initial
> > value for the MCFG and SF steering registers (0xFD0 and 0xFD8
> > resepectively) at driver init; they won't change after we set them up so
> 
> respectively
> 
> > we don't need to worry about re-steering them explicitly at runtime.
> > 
> > Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV,
> > while only values of 2 and 3 are valid for DG2-G11, we'll just
> > initialize the MCFG and SF steering registers to a constant value of "2"
> > for all XeHP-based platforms for simplicity --- that will work in all
> > cases.
> > 
> > Bspec: 66534
> > Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> > ---
> > drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +++++++++++++++++----
> > drivers/gpu/drm/i915/i915_reg.h             |  2 ++
> > 2 files changed, 25 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 8717337a6c81..6895b083523d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -889,17 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
> > 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
> > }
> > 
> > -static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
> > -			 unsigned slice, unsigned subslice)
> > +static void __set_mcr_steering(struct i915_wa_list *wal,
> > +			       i915_reg_t steering_reg,
> > +			       unsigned int slice, unsigned int subslice)
> > {
> > 	u32 mcr, mcr_mask;
> > 
> > 	mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
> > 	mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
> > 
> > -	drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
> > +	wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
> > +}
> > +
> > +static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
> > +			 unsigned int slice, unsigned int subslice)
> > +{
> > +	drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
> 
> maybe we could leave the debug message in __set_mcr_steering() and add
> what steering register we are setting? Up to you.
> 

I've got a separate patch that adds more clear steering debug
information via a drm_printer and then prints it both in the dmesg log
and in a new debugfs node.  The patch depends on some debugfs changes
that haven't shown up yet so I didn't include it here, but I'll rebase
and send it soon if the debugfs changes don't happen first.


Matt

> 
> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> 
> 
> > 
> > -	wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > +	__set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
> > }
> > 
> > static void
> > @@ -953,7 +960,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> > 	 * - L3 Bank (fusable)
> > 	 * - MSLICE (fusable)
> > 	 * - LNCF (sub-unit within mslice; always present if mslice is present)
> > -	 * - SQIDI (always on)
> > 	 *
> > 	 * We'll do our default/implicit steering based on GSLICE (in the
> > 	 * sliceid field) and DSS (in the subsliceid field).  If we can
> > @@ -1003,6 +1009,18 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
> > 	WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> > 
> > 	__add_mcr_wa(i915, wal, slice, subslice);
> > +
> > +	/*
> > +	 * SQIDI ranges are special because they use different steering
> > +	 * registers than everything else we work with.  On XeHP SDV and
> > +	 * DG2-G10, any value in the steering registers will work fine since
> > +	 * all instances are present, but DG2-G11 only has SQIDI instances at
> > +	 * ID's 2 and 3, so we need to steer to one of those.  For simplicity
> > +	 * we'll just steer to a hardcoded "2" since that value will work
> > +	 * everywhere.
> > +	 */
> > +	__set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > +	__set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> > }
> > 
> > static void
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index f4113e7e8271..39ce6befff52 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -2757,6 +2757,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> > #define GEN12_SC_INSTDONE_EXTRA2	_MMIO(0x7108)
> > #define GEN7_SAMPLER_INSTDONE	_MMIO(0xe160)
> > #define GEN7_ROW_INSTDONE	_MMIO(0xe164)
> > +#define MCFG_MCR_SELECTOR		_MMIO(0xfd0)
> > +#define SF_MCR_SELECTOR			_MMIO(0xfd8)
> > #define GEN8_MCR_SELECTOR		_MMIO(0xfdc)
> > #define   GEN8_MCR_SLICE(slice)		(((slice) & 3) << 26)
> > #define   GEN8_MCR_SLICE_MASK		GEN8_MCR_SLICE(3)
> > -- 
> > 2.25.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2021-08-05 15:11 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29 16:59 [Intel-gfx] [PATCH v4 00/18] Begin enabling Xe_HP SDV and DG2 platforms Matt Roper
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 01/18] drm/i915/xehp: handle new steering options Matt Roper
2021-08-04 19:53   ` Lucas De Marchi
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 02/18] drm/i915/xehpsdv: Define steering tables Matt Roper
2021-08-04 20:13   ` Lucas De Marchi
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 03/18] drm/i915/dg2: Add forcewake table Matt Roper
2021-08-04  0:15   ` Souza, Jose
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 04/18] drm/i915/dg2: Update LNCF steering ranges Matt Roper
2021-08-04 20:16   ` Lucas De Marchi
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering Matt Roper
2021-08-04 20:22   ` Lucas De Marchi
2021-08-05 15:11     ` Matt Roper
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 06/18] drm/i915/xehp: Loop over all gslices for INSTDONE processing Matt Roper
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 07/18] drm/i915/dg2: Report INSTDONE_GEOM values in error state Matt Roper
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 08/18] drm/i915/xehp: Changes to ss/eu definitions Matt Roper
2021-08-04  0:17   ` Souza, Jose
2021-07-29 16:59 ` [Intel-gfx] [PATCH v4 09/18] drm/i915/xehpsdv: Add maximum sseu limits Matt Roper
2021-08-04 20:26   ` Lucas De Marchi
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 10/18] drm/i915/xehpsdv: Add compute DSS type Matt Roper
2021-08-04 20:36   ` Lucas De Marchi
2021-08-04 21:00     ` Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 11/18] drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 12/18] drm/i915/xehpsdv: Define MOCS table for " Matt Roper
2021-07-29 17:31   ` Lucas De Marchi
2021-07-30  7:16     ` Siddiqui, Ayaz A
2021-07-30 17:01       ` Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 13/18] drm/i915/dg2: Define MOCS table for DG2 Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 14/18] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 15/18] drm/i915/xehpsdv: Read correct RP_STATE_CAP register Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 16/18] drm/i915/dg2: Add new LRI reg offsets Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 17/18] drm/i915/dg2: Maintain backward-compatible nested batch behavior Matt Roper
2021-07-29 17:00 ` [Intel-gfx] [PATCH v4 18/18] drm/i915/dg2: Configure PCON in DP pre-enable path Matt Roper
2021-07-29 20:59 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Begin enabling Xe_HP SDV and DG2 platforms (rev8) Patchwork

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).