dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/6] Update AUX invalidation sequence
@ 2023-07-17 17:30 Andi Shyti
  2023-07-17 17:30 ` [PATCH v4 1/6] drm/i915/gt: Cleanup aux invalidation registers Andi Shyti
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

Hi,

as there are new hardware directives, we need a little adaptation
for the AUX invalidation sequence.

In this version we support all the engines affected by this
change.

The stable backport has some challenges because the original
patch that this series fixes has had more changes in between.

Thanks a lot Nirmoy for your review and for the fruitful discussions!

Thanks,
Andi

Changelog:
=========
v3 -> v4
 - A trivial patch 3 is added to rename the flags with
   bit_group_{0,1} to align with the datasheet naming.
 - Patch 4 fixes a confusion I made where the CCS flag was
   applied to the wrong bit group.

v2 -> v3
 - added r-b from Nirmoy in patch 1 and 4.
 - added patch 3 which enables the ccs_flush in the control pipe
   for mtl+ compute and render engines.
 - added redundant checks in patch 2 for enabling the EMIT_FLUSH
   flag.

v1 -> v2
 - add a clean up preliminary patch for the existing registers
 - add support for more engines
 - add the Fixes tag

Andi Shyti (4):
  drm/i915/gt: Cleanup aux invalidation registers
  drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
  drm/i915/gt: Support aux invalidation on all engines

Jonathan Cavitt (2):
  drm/i915/gt: Ensure memory quiesced before invalidation
  drm/i915/gt: Poll aux invalidation register bit on invalidation

 drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 127 +++++++++++++------
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h     |   3 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   1 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   2 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h      |  14 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  17 +--
 6 files changed, 98 insertions(+), 66 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v4 1/6] drm/i915/gt: Cleanup aux invalidation registers
  2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
@ 2023-07-17 17:30 ` Andi Shyti
  2023-07-17 17:30 ` [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation Andi Shyti
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

Fix the 'NV' definition postfix that is supposed to be INV.

Take the chance to also order properly the registers based on
their address and call the GEN12_GFX_CCS_AUX_INV address as
GEN12_CCS_AUX_INV like all the other similar registers.

Remove also VD1, VD3 and VE1 registers that don't exist.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c |  8 ++++----
 drivers/gpu/drm/i915/gt/intel_gt_regs.h  | 14 ++++++--------
 drivers/gpu/drm/i915/gt/intel_lrc.c      |  6 +++---
 3 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 23857cc08eca1..563efee055602 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -287,8 +287,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 		if (!HAS_FLAT_CCS(rq->engine->i915)) {
 			/* hsdes: 1809175790 */
-			cs = gen12_emit_aux_table_inv(rq->engine->gt,
-						      cs, GEN12_GFX_CCS_AUX_NV);
+			cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
+						      GEN12_CCS_AUX_INV);
 		}
 
 		*cs++ = preparser_disable(false);
@@ -348,10 +348,10 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 	if (aux_inv) { /* hsdes: 1809175790 */
 		if (rq->engine->class == VIDEO_DECODE_CLASS)
 			cs = gen12_emit_aux_table_inv(rq->engine->gt,
-						      cs, GEN12_VD0_AUX_NV);
+						      cs, GEN12_VD0_AUX_INV);
 		else
 			cs = gen12_emit_aux_table_inv(rq->engine->gt,
-						      cs, GEN12_VE0_AUX_NV);
+						      cs, GEN12_VE0_AUX_INV);
 	}
 
 	if (mode & EMIT_INVALIDATE)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 718cb2c80f79e..78b67a5336afc 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -332,9 +332,10 @@
 #define GEN8_PRIVATE_PAT_HI			_MMIO(0x40e0 + 4)
 #define GEN10_PAT_INDEX(index)			_MMIO(0x40e0 + (index) * 4)
 #define BSD_HWS_PGA_GEN7			_MMIO(0x4180)
-#define GEN12_GFX_CCS_AUX_NV			_MMIO(0x4208)
-#define GEN12_VD0_AUX_NV			_MMIO(0x4218)
-#define GEN12_VD1_AUX_NV			_MMIO(0x4228)
+
+#define GEN12_CCS_AUX_INV			_MMIO(0x4208)
+#define GEN12_VD0_AUX_INV			_MMIO(0x4218)
+#define GEN12_VE0_AUX_INV			_MMIO(0x4238)
 
 #define GEN8_RTCR				_MMIO(0x4260)
 #define GEN8_M1TCR				_MMIO(0x4264)
@@ -342,14 +343,11 @@
 #define GEN8_BTCR				_MMIO(0x426c)
 #define GEN8_VTCR				_MMIO(0x4270)
 
-#define GEN12_VD2_AUX_NV			_MMIO(0x4298)
-#define GEN12_VD3_AUX_NV			_MMIO(0x42a8)
-#define GEN12_VE0_AUX_NV			_MMIO(0x4238)
-
 #define BLT_HWS_PGA_GEN7			_MMIO(0x4280)
 
-#define GEN12_VE1_AUX_NV			_MMIO(0x42b8)
+#define GEN12_VD2_AUX_INV			_MMIO(0x4298)
 #define   AUX_INV				REG_BIT(0)
+
 #define VEBOX_HWS_PGA_GEN7			_MMIO(0x4380)
 
 #define GEN12_AUX_ERR_DBG			_MMIO(0x43f4)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1b710102390bf..235f3fab60a98 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1374,7 +1374,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
 	/* hsdes: 1809175790 */
 	if (!HAS_FLAT_CCS(ce->engine->i915))
 		cs = gen12_emit_aux_table_inv(ce->engine->gt,
-					      cs, GEN12_GFX_CCS_AUX_NV);
+					      cs, GEN12_CCS_AUX_INV);
 
 	/* Wa_16014892111 */
 	if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
@@ -1403,10 +1403,10 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
 	if (!HAS_FLAT_CCS(ce->engine->i915)) {
 		if (ce->engine->class == VIDEO_DECODE_CLASS)
 			cs = gen12_emit_aux_table_inv(ce->engine->gt,
-						      cs, GEN12_VD0_AUX_NV);
+						      cs, GEN12_VD0_AUX_INV);
 		else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
 			cs = gen12_emit_aux_table_inv(ce->engine->gt,
-						      cs, GEN12_VE0_AUX_NV);
+						      cs, GEN12_VE0_AUX_INV);
 	}
 
 	return cs;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
  2023-07-17 17:30 ` [PATCH v4 1/6] drm/i915/gt: Cleanup aux invalidation registers Andi Shyti
@ 2023-07-17 17:30 ` Andi Shyti
  2023-07-17 17:54   ` Matt Roper
  2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

From: Jonathan Cavitt <jonathan.cavitt@intel.com>

All memory traffic must be quiesced before requesting
an aux invalidation on platforms that use Aux CCS.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 563efee055602..bee3b7dc595cf 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 {
 	struct intel_engine_cs *engine = rq->engine;
 
+	/*
+	 * Aux invalidations on Aux CCS platforms require
+	 * memory traffic is quiesced prior.
+	 */
+	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
+		mode |= EMIT_FLUSH;
+
 	if (mode & EMIT_FLUSH) {
 		u32 flags = 0;
 		int err;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
  2023-07-17 17:30 ` [PATCH v4 1/6] drm/i915/gt: Cleanup aux invalidation registers Andi Shyti
  2023-07-17 17:30 ` [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation Andi Shyti
@ 2023-07-17 17:30 ` Andi Shyti
  2023-07-17 17:32   ` Andi Shyti
                     ` (3 more replies)
  2023-07-17 17:30 ` [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control Andi Shyti
                   ` (2 subsequent siblings)
  5 siblings, 4 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

In preparation of the next patch allign with the datasheet (BSPEC
47112) with the naming of the pipe control set of flag values.
The variable "flags" in gen12_emit_flush_rcs() is applied as a
set of flags called Bit Group 1.

Define also the Bit Group 0 as bit_group_0 where currently only
PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index bee3b7dc595cf..3c935d6b68bf0 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -210,7 +210,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 		mode |= EMIT_FLUSH;
 
 	if (mode & EMIT_FLUSH) {
-		u32 flags = 0;
+		u32 bit_group_0 = 0;
+		u32 bit_group_1 = 0;
 		int err;
 		u32 *cs;
 
@@ -218,32 +219,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 		if (err)
 			return err;
 
-		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_FLUSH_L3;
-		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+
+		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
+		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+		bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
 		/* Wa_1409600907:tgl,adl-p */
-		flags |= PIPE_CONTROL_DEPTH_STALL;
-		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
-		flags |= PIPE_CONTROL_FLUSH_ENABLE;
+		bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
+		bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+		bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
 
-		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-		flags |= PIPE_CONTROL_QW_WRITE;
+		bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
+		bit_group_1 |= PIPE_CONTROL_QW_WRITE;
 
-		flags |= PIPE_CONTROL_CS_STALL;
+		bit_group_1 |= PIPE_CONTROL_CS_STALL;
 
 		if (!HAS_3D_PIPELINE(engine->i915))
-			flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+			bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
 		else if (engine->class == COMPUTE_CLASS)
-			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+			bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
 		cs = intel_ring_begin(rq, 6);
 		if (IS_ERR(cs))
 			return PTR_ERR(cs);
 
-		cs = gen12_emit_pipe_control(cs,
-					     PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
-					     flags, LRC_PPHWSP_SCRATCH_ADDR);
+		cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+					     LRC_PPHWSP_SCRATCH_ADDR);
 		intel_ring_advance(rq, cs);
 	}
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
  2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
                   ` (2 preceding siblings ...)
  2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
@ 2023-07-17 17:30 ` Andi Shyti
  2023-07-17 18:23   ` [Intel-gfx] " Andrzej Hajda
  2023-07-17 19:40   ` Matt Roper
  2023-07-17 17:30 ` [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
  2023-07-17 17:30 ` [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
  5 siblings, 2 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

Enable the CCS_FLUSH bit 13 in the control pipe for render and
compute engines in platforms starting from Meteor Lake (BSPEC
43904 and 47112).

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 10 +++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 3c935d6b68bf0..aa2fb9d72745a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -207,7 +207,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 	 * memory traffic is quiesced prior.
 	 */
 	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
-		mode |= EMIT_FLUSH;
+		mode |= EMIT_FLUSH | EMIT_CCS_FLUSH;
 
 	if (mode & EMIT_FLUSH) {
 		u32 bit_group_0 = 0;
@@ -221,6 +221,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
 
+		/*
+		 * When required, in MTL+ platforms we need to
+		 * set the CCS_FLUSH bit in the pipe control
+		 */
+		if (mode & EMIT_CCS_FLUSH &&
+		    GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+			bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+
 		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
 		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
 		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e99a6fa03d453..e2cae9d02bd62 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -514,6 +514,7 @@ struct intel_engine_cs {
 	int		(*emit_flush)(struct i915_request *request, u32 mode);
 #define EMIT_INVALIDATE	BIT(0)
 #define EMIT_FLUSH	BIT(1)
+#define EMIT_CCS_FLUSH	BIT(2) /* MTL+ */
 #define EMIT_BARRIER	(EMIT_INVALIDATE | EMIT_FLUSH)
 	int		(*emit_bb_start)(struct i915_request *rq,
 					 u64 offset, u32 length,
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5d143e2a8db03..5df7cce23197c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -299,6 +299,7 @@
 #define   PIPE_CONTROL_QW_WRITE				(1<<14)
 #define   PIPE_CONTROL_POST_SYNC_OP_MASK                (3<<14)
 #define   PIPE_CONTROL_DEPTH_STALL			(1<<13)
+#define   PIPE_CONTROL_CCS_FLUSH			(1<<13) /* MTL+ */
 #define   PIPE_CONTROL_WRITE_FLUSH			(1<<12)
 #define   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH	(1<<12) /* gen6+ */
 #define   PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE	(1<<11) /* MBZ on ILK */
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation
  2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
                   ` (3 preceding siblings ...)
  2023-07-17 17:30 ` [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control Andi Shyti
@ 2023-07-17 17:30 ` Andi Shyti
  2023-07-17 18:51   ` [Intel-gfx] " Andrzej Hajda
  2023-07-17 20:05   ` Matt Roper
  2023-07-17 17:30 ` [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
  5 siblings, 2 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

From: Jonathan Cavitt <jonathan.cavitt@intel.com>

For platforms that use Aux CCS, wait for aux invalidation to
complete by checking the aux invalidation register bit is
cleared.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 17 +++++++++++++----
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index aa2fb9d72745a..fbc70f3b7f2fd 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -174,6 +174,16 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
 	*cs++ = AUX_INV;
 	*cs++ = MI_NOOP;
 
+	*cs++ = MI_SEMAPHORE_WAIT_TOKEN |
+		MI_SEMAPHORE_REGISTER_POLL |
+		MI_SEMAPHORE_POLL |
+		MI_SEMAPHORE_SAD_EQ_SDD;
+	*cs++ = 0;
+	*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
+	*cs++ = 0;
+	*cs++ = 0;
+	*cs++ = MI_NOOP;
+
 	return cs;
 }
 
@@ -284,10 +294,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 		else if (engine->class == COMPUTE_CLASS)
 			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
+		count = 8;
 		if (!HAS_FLAT_CCS(rq->engine->i915))
-			count = 8 + 4;
-		else
-			count = 8;
+			count += 10;
 
 		cs = intel_ring_begin(rq, count);
 		if (IS_ERR(cs))
@@ -330,7 +339,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 			aux_inv = rq->engine->mask &
 				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
 			if (aux_inv)
-				cmd += 4;
+				cmd += 10;
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5df7cce23197c..2bd8d98d21102 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -121,6 +121,7 @@
 #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
 #define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
 #define MI_SEMAPHORE_WAIT_TOKEN	MI_INSTR(0x1c, 3) /* GEN12+ */
+#define   MI_SEMAPHORE_REGISTER_POLL	(1 << 16)
 #define   MI_SEMAPHORE_POLL		(1 << 15)
 #define   MI_SEMAPHORE_SAD_GT_SDD	(0 << 12)
 #define   MI_SEMAPHORE_SAD_GTE_SDD	(1 << 12)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines
  2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
                   ` (4 preceding siblings ...)
  2023-07-17 17:30 ` [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
@ 2023-07-17 17:30 ` Andi Shyti
  2023-07-17 19:11   ` [Intel-gfx] " Andrzej Hajda
  2023-07-17 20:27   ` Matt Roper
  5 siblings, 2 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:30 UTC (permalink / raw)
  To: Jonathan Cavitt, Matt Roper, Chris Wilson, Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, Andi Shyti, DRI Devel

Perform some refactoring with the purpose of keeping in one
single place all the operations around the aux table
invalidation.

With this refactoring add more engines where the invalidation
should be performed.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 63 +++++++++++++++---------
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c      | 17 +------
 3 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index fbc70f3b7f2fd..6d21a1ac06e73 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -165,7 +165,8 @@ static u32 preparser_disable(bool state)
 	return MI_ARB_CHECK | 1 << 8 | state;
 }
 
-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
+static u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs,
+				     const i915_reg_t inv_reg)
 {
 	u32 gsi_offset = gt->uncore->gsi_offset;
 
@@ -187,6 +188,40 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
 	return cs;
 }
 
+static i915_reg_t intel_get_aux_inv_reg(struct intel_engine_cs *engine)
+{
+	if (HAS_FLAT_CCS(engine->i915))
+		return _MMIO(0);
+
+	switch (engine->id) {
+	case RCS0:
+		return GEN12_CCS_AUX_INV;
+	case VCS0:
+		return GEN12_VD0_AUX_INV;
+	case VCS2:
+		return GEN12_VD2_AUX_INV;
+	case VECS0:
+		return GEN12_VE0_AUX_INV;
+	default:
+		return _MMIO(0);
+	}
+}
+
+static bool intel_engine_has_aux_inv(struct intel_engine_cs *engine)
+{
+	i915_reg_t reg = intel_get_aux_inv_reg(engine);
+
+	return !!reg.reg;
+}
+
+u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
+{
+	i915_reg_t reg = intel_get_aux_inv_reg(engine);
+	struct intel_gt *gt = engine->gt;
+
+	return reg.reg ? gen12_emit_aux_table_inv(gt, cs, reg) : cs;
+}
+
 static int mtl_dummy_pipe_control(struct i915_request *rq)
 {
 	/* Wa_14016712196 */
@@ -311,11 +346,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
 
-		if (!HAS_FLAT_CCS(rq->engine->i915)) {
-			/* hsdes: 1809175790 */
-			cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
-						      GEN12_CCS_AUX_INV);
-		}
+		cs = intel_emit_aux_table_inv(engine, cs);
 
 		*cs++ = preparser_disable(false);
 		intel_ring_advance(rq, cs);
@@ -326,21 +357,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 {
-	intel_engine_mask_t aux_inv = 0;
 	u32 cmd, *cs;
 
 	cmd = 4;
 	if (mode & EMIT_INVALIDATE) {
 		cmd += 2;
 
-		if (!HAS_FLAT_CCS(rq->engine->i915) &&
-		    (rq->engine->class == VIDEO_DECODE_CLASS ||
-		     rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
-			aux_inv = rq->engine->mask &
-				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
-			if (aux_inv)
-				cmd += 10;
-		}
+		if (intel_engine_has_aux_inv(rq->engine))
+			cmd += 10;
 	}
 
 	cs = intel_ring_begin(rq, cmd);
@@ -371,14 +395,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 	*cs++ = 0; /* upper addr */
 	*cs++ = 0; /* value */
 
-	if (aux_inv) { /* hsdes: 1809175790 */
-		if (rq->engine->class == VIDEO_DECODE_CLASS)
-			cs = gen12_emit_aux_table_inv(rq->engine->gt,
-						      cs, GEN12_VD0_AUX_INV);
-		else
-			cs = gen12_emit_aux_table_inv(rq->engine->gt,
-						      cs, GEN12_VE0_AUX_INV);
-	}
+	cs = intel_emit_aux_table_inv(rq->engine, cs);
 
 	if (mode & EMIT_INVALIDATE)
 		*cs++ = preparser_disable(false);
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index 655e5c00ddc27..d938c94524510 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -13,6 +13,7 @@
 #include "intel_gt_regs.h"
 #include "intel_gpu_commands.h"
 
+struct intel_engine_cs;
 struct intel_gt;
 struct i915_request;
 
@@ -46,7 +47,7 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
 u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
 u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
 
-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg);
+u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs);
 
 static inline u32 *
 __gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 235f3fab60a98..70054767c88c3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1371,10 +1371,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
 	    IS_DG2_G11(ce->engine->i915))
 		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
 
-	/* hsdes: 1809175790 */
-	if (!HAS_FLAT_CCS(ce->engine->i915))
-		cs = gen12_emit_aux_table_inv(ce->engine->gt,
-					      cs, GEN12_CCS_AUX_INV);
+	cs = intel_emit_aux_table_inv(ce->engine, cs);
 
 	/* Wa_16014892111 */
 	if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
@@ -1399,17 +1396,7 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
 						    PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE,
 						    0);
 
-	/* hsdes: 1809175790 */
-	if (!HAS_FLAT_CCS(ce->engine->i915)) {
-		if (ce->engine->class == VIDEO_DECODE_CLASS)
-			cs = gen12_emit_aux_table_inv(ce->engine->gt,
-						      cs, GEN12_VD0_AUX_INV);
-		else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
-			cs = gen12_emit_aux_table_inv(ce->engine->gt,
-						      cs, GEN12_VE0_AUX_INV);
-	}
-
-	return cs;
+	return intel_emit_aux_table_inv(ce->engine, cs);
 }
 
 static void
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
@ 2023-07-17 17:32   ` Andi Shyti
  2023-07-17 17:59   ` Matt Roper
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 17:32 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Matt Roper, Nirmoy Das

Hi,

On Mon, Jul 17, 2023 at 07:30:56PM +0200, Andi Shyti wrote:
> In preparation of the next patch allign with the datasheet (BSPEC
> 47112) with the naming of the pipe control set of flag values.
> The variable "flags" in gen12_emit_flush_rcs() is applied as a
> set of flags called Bit Group 1.
> 
> Define also the Bit Group 0 as bit_group_0 where currently only
> PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
> 
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
>  1 file changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index bee3b7dc595cf..3c935d6b68bf0 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -210,7 +210,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  		mode |= EMIT_FLUSH;
>  
>  	if (mode & EMIT_FLUSH) {
> -		u32 flags = 0;
> +		u32 bit_group_0 = 0;
> +		u32 bit_group_1 = 0;

I could eventually add here a comment to explain better the
meaning of these two bit_group_{0,1}.

Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-17 17:30 ` [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation Andi Shyti
@ 2023-07-17 17:54   ` Matt Roper
  2023-07-17 20:31     ` Matt Roper
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Roper @ 2023-07-17 17:54 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 07:30:55PM +0200, Andi Shyti wrote:
> From: Jonathan Cavitt <jonathan.cavitt@intel.com>
> 
> All memory traffic must be quiesced before requesting
> an aux invalidation on platforms that use Aux CCS.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 563efee055602..bee3b7dc595cf 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  {
>  	struct intel_engine_cs *engine = rq->engine;
>  
> +	/*
> +	 * Aux invalidations on Aux CCS platforms require
> +	 * memory traffic is quiesced prior.
> +	 */
> +	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))

It's a pre-existing mistake in drm-tip at the moment, but we shouldn't
assume !flatccs always implies auxccs.  PVC has neither, and there may
be other similar platforms in the future.  We should probably add a
helper function for AuxCCS, similar to what we added to the Xe driver
recently:

https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1


Matt


> +		mode |= EMIT_FLUSH;
> +
>  	if (mode & EMIT_FLUSH) {
>  		u32 flags = 0;
>  		int err;
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
  2023-07-17 17:32   ` Andi Shyti
@ 2023-07-17 17:59   ` Matt Roper
  2023-07-17 18:21   ` [Intel-gfx] " Andrzej Hajda
  2023-07-17 18:45   ` Nirmoy Das
  3 siblings, 0 replies; 26+ messages in thread
From: Matt Roper @ 2023-07-17 17:59 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 07:30:56PM +0200, Andi Shyti wrote:
> In preparation of the next patch allign with the datasheet (BSPEC

s/allign/align/

Otherwise,

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> 47112) with the naming of the pipe control set of flag values.
> The variable "flags" in gen12_emit_flush_rcs() is applied as a
> set of flags called Bit Group 1.
> 
> Define also the Bit Group 0 as bit_group_0 where currently only
> PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
> 
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
>  1 file changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index bee3b7dc595cf..3c935d6b68bf0 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -210,7 +210,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  		mode |= EMIT_FLUSH;
>  
>  	if (mode & EMIT_FLUSH) {
> -		u32 flags = 0;
> +		u32 bit_group_0 = 0;
> +		u32 bit_group_1 = 0;
>  		int err;
>  		u32 *cs;
>  
> @@ -218,32 +219,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  		if (err)
>  			return err;
>  
> -		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> -		flags |= PIPE_CONTROL_FLUSH_L3;
> -		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> -		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> +		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
> +
> +		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> +		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
> +		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> +		bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
>  		/* Wa_1409600907:tgl,adl-p */
> -		flags |= PIPE_CONTROL_DEPTH_STALL;
> -		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> -		flags |= PIPE_CONTROL_FLUSH_ENABLE;
> +		bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
> +		bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> +		bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
>  
> -		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
> -		flags |= PIPE_CONTROL_QW_WRITE;
> +		bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
> +		bit_group_1 |= PIPE_CONTROL_QW_WRITE;
>  
> -		flags |= PIPE_CONTROL_CS_STALL;
> +		bit_group_1 |= PIPE_CONTROL_CS_STALL;
>  
>  		if (!HAS_3D_PIPELINE(engine->i915))
> -			flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
> +			bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
>  		else if (engine->class == COMPUTE_CLASS)
> -			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
> +			bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>  
>  		cs = intel_ring_begin(rq, 6);
>  		if (IS_ERR(cs))
>  			return PTR_ERR(cs);
>  
> -		cs = gen12_emit_pipe_control(cs,
> -					     PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
> -					     flags, LRC_PPHWSP_SCRATCH_ADDR);
> +		cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
> +					     LRC_PPHWSP_SCRATCH_ADDR);
>  		intel_ring_advance(rq, cs);
>  	}
>  
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
  2023-07-17 17:32   ` Andi Shyti
  2023-07-17 17:59   ` Matt Roper
@ 2023-07-17 18:21   ` Andrzej Hajda
  2023-07-17 21:54     ` Andi Shyti
  2023-07-17 18:45   ` Nirmoy Das
  3 siblings, 1 reply; 26+ messages in thread
From: Andrzej Hajda @ 2023-07-17 18:21 UTC (permalink / raw)
  To: Andi Shyti, Jonathan Cavitt, Matt Roper, Chris Wilson,
	Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, DRI Devel

On 17.07.2023 19:30, Andi Shyti wrote:
> In preparation of the next patch allign with the datasheet (BSPEC
> 47112) with the naming of the pipe control set of flag values.
> The variable "flags" in gen12_emit_flush_rcs() is applied as a
> set of flags called Bit Group 1.
> 
> Define also the Bit Group 0 as bit_group_0 where currently only
> PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
> 
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>   drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
>   1 file changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index bee3b7dc595cf..3c935d6b68bf0 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -210,7 +210,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   		mode |= EMIT_FLUSH;
>   
>   	if (mode & EMIT_FLUSH) {
> -		u32 flags = 0;
> +		u32 bit_group_0 = 0;
> +		u32 bit_group_1 = 0;

For me flags0 and flags1 seems more readable as in 
*gen8_emit_pipe_control(batch, flags0, flags1, offset), but no strong 
feelings, but if bit_group is chosen arguments of 
*gen8_emit_pipe_control could be changed as well.

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

>   		int err;
>   		u32 *cs;
>   
> @@ -218,32 +219,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   		if (err)
>   			return err;
>   
> -		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> -		flags |= PIPE_CONTROL_FLUSH_L3;
> -		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> -		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> +		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
> +
> +		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> +		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
> +		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> +		bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
>   		/* Wa_1409600907:tgl,adl-p */
> -		flags |= PIPE_CONTROL_DEPTH_STALL;
> -		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> -		flags |= PIPE_CONTROL_FLUSH_ENABLE;
> +		bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
> +		bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> +		bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
>   
> -		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
> -		flags |= PIPE_CONTROL_QW_WRITE;
> +		bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
> +		bit_group_1 |= PIPE_CONTROL_QW_WRITE;
>   
> -		flags |= PIPE_CONTROL_CS_STALL;
> +		bit_group_1 |= PIPE_CONTROL_CS_STALL;
>   
>   		if (!HAS_3D_PIPELINE(engine->i915))
> -			flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
> +			bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
>   		else if (engine->class == COMPUTE_CLASS)
> -			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
> +			bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>   
>   		cs = intel_ring_begin(rq, 6);
>   		if (IS_ERR(cs))
>   			return PTR_ERR(cs);
>   
> -		cs = gen12_emit_pipe_control(cs,
> -					     PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
> -					     flags, LRC_PPHWSP_SCRATCH_ADDR);
> +		cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
> +					     LRC_PPHWSP_SCRATCH_ADDR);
>   		intel_ring_advance(rq, cs);
>   	}
>   


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Intel-gfx] [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
  2023-07-17 17:30 ` [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control Andi Shyti
@ 2023-07-17 18:23   ` Andrzej Hajda
  2023-07-17 19:40   ` Matt Roper
  1 sibling, 0 replies; 26+ messages in thread
From: Andrzej Hajda @ 2023-07-17 18:23 UTC (permalink / raw)
  To: Andi Shyti, Jonathan Cavitt, Matt Roper, Chris Wilson,
	Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, DRI Devel

On 17.07.2023 19:30, Andi Shyti wrote:
> Enable the CCS_FLUSH bit 13 in the control pipe for render and
> compute engines in platforms starting from Meteor Lake (BSPEC
> 43904 and 47112).
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Cc: Nirmoy Das <nirmoy.das@intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+


Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej
> ---
>   drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 10 +++++++++-
>   drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
>   drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>   3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 3c935d6b68bf0..aa2fb9d72745a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -207,7 +207,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   	 * memory traffic is quiesced prior.
>   	 */
>   	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> -		mode |= EMIT_FLUSH;
> +		mode |= EMIT_FLUSH | EMIT_CCS_FLUSH;
>   
>   	if (mode & EMIT_FLUSH) {
>   		u32 bit_group_0 = 0;
> @@ -221,6 +221,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   
>   		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
>   
> +		/*
> +		 * When required, in MTL+ platforms we need to
> +		 * set the CCS_FLUSH bit in the pipe control
> +		 */
> +		if (mode & EMIT_CCS_FLUSH &&
> +		    GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> +			bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> +
>   		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
>   		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
>   		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index e99a6fa03d453..e2cae9d02bd62 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -514,6 +514,7 @@ struct intel_engine_cs {
>   	int		(*emit_flush)(struct i915_request *request, u32 mode);
>   #define EMIT_INVALIDATE	BIT(0)
>   #define EMIT_FLUSH	BIT(1)
> +#define EMIT_CCS_FLUSH	BIT(2) /* MTL+ */
>   #define EMIT_BARRIER	(EMIT_INVALIDATE | EMIT_FLUSH)
>   	int		(*emit_bb_start)(struct i915_request *rq,
>   					 u64 offset, u32 length,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5d143e2a8db03..5df7cce23197c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -299,6 +299,7 @@
>   #define   PIPE_CONTROL_QW_WRITE				(1<<14)
>   #define   PIPE_CONTROL_POST_SYNC_OP_MASK                (3<<14)
>   #define   PIPE_CONTROL_DEPTH_STALL			(1<<13)
> +#define   PIPE_CONTROL_CCS_FLUSH			(1<<13) /* MTL+ */
>   #define   PIPE_CONTROL_WRITE_FLUSH			(1<<12)
>   #define   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH	(1<<12) /* gen6+ */
>   #define   PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE	(1<<11) /* MBZ on ILK */


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
                     ` (2 preceding siblings ...)
  2023-07-17 18:21   ` [Intel-gfx] " Andrzej Hajda
@ 2023-07-17 18:45   ` Nirmoy Das
  3 siblings, 0 replies; 26+ messages in thread
From: Nirmoy Das @ 2023-07-17 18:45 UTC (permalink / raw)
  To: Andi Shyti, Jonathan Cavitt, Matt Roper, Chris Wilson,
	Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, DRI Devel

Thanks for cleaning this.

With Matt's suggestion, this is

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>

On 7/17/2023 7:30 PM, Andi Shyti wrote:
> In preparation of the next patch allign with the datasheet (BSPEC
> 47112) with the naming of the pipe control set of flag values.
> The variable "flags" in gen12_emit_flush_rcs() is applied as a
> set of flags called Bit Group 1.
>
> Define also the Bit Group 0 as bit_group_0 where currently only
> PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>   drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
>   1 file changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index bee3b7dc595cf..3c935d6b68bf0 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -210,7 +210,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   		mode |= EMIT_FLUSH;
>   
>   	if (mode & EMIT_FLUSH) {
> -		u32 flags = 0;
> +		u32 bit_group_0 = 0;
> +		u32 bit_group_1 = 0;
>   		int err;
>   		u32 *cs;
>   
> @@ -218,32 +219,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   		if (err)
>   			return err;
>   
> -		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> -		flags |= PIPE_CONTROL_FLUSH_L3;
> -		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> -		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> +		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
> +
> +		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
> +		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
> +		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> +		bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
>   		/* Wa_1409600907:tgl,adl-p */
> -		flags |= PIPE_CONTROL_DEPTH_STALL;
> -		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> -		flags |= PIPE_CONTROL_FLUSH_ENABLE;
> +		bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
> +		bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> +		bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
>   
> -		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
> -		flags |= PIPE_CONTROL_QW_WRITE;
> +		bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
> +		bit_group_1 |= PIPE_CONTROL_QW_WRITE;
>   
> -		flags |= PIPE_CONTROL_CS_STALL;
> +		bit_group_1 |= PIPE_CONTROL_CS_STALL;
>   
>   		if (!HAS_3D_PIPELINE(engine->i915))
> -			flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
> +			bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
>   		else if (engine->class == COMPUTE_CLASS)
> -			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
> +			bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>   
>   		cs = intel_ring_begin(rq, 6);
>   		if (IS_ERR(cs))
>   			return PTR_ERR(cs);
>   
> -		cs = gen12_emit_pipe_control(cs,
> -					     PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
> -					     flags, LRC_PPHWSP_SCRATCH_ADDR);
> +		cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
> +					     LRC_PPHWSP_SCRATCH_ADDR);
>   		intel_ring_advance(rq, cs);
>   	}
>   

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Intel-gfx] [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation
  2023-07-17 17:30 ` [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
@ 2023-07-17 18:51   ` Andrzej Hajda
  2023-07-17 20:05   ` Matt Roper
  1 sibling, 0 replies; 26+ messages in thread
From: Andrzej Hajda @ 2023-07-17 18:51 UTC (permalink / raw)
  To: Andi Shyti, Jonathan Cavitt, Matt Roper, Chris Wilson,
	Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, DRI Devel

On 17.07.2023 19:30, Andi Shyti wrote:
> From: Jonathan Cavitt <jonathan.cavitt@intel.com>
> 
> For platforms that use Aux CCS, wait for aux invalidation to
> complete by checking the aux invalidation register bit is
> cleared.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej
> ---
>   drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 17 +++++++++++++----
>   drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>   2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index aa2fb9d72745a..fbc70f3b7f2fd 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -174,6 +174,16 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
>   	*cs++ = AUX_INV;
>   	*cs++ = MI_NOOP;
>   
> +	*cs++ = MI_SEMAPHORE_WAIT_TOKEN |
> +		MI_SEMAPHORE_REGISTER_POLL |
> +		MI_SEMAPHORE_POLL |
> +		MI_SEMAPHORE_SAD_EQ_SDD;
> +	*cs++ = 0;
> +	*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
> +	*cs++ = 0;
> +	*cs++ = 0;
> +	*cs++ = MI_NOOP;
> +
>   	return cs;
>   }
>   
> @@ -284,10 +294,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   		else if (engine->class == COMPUTE_CLASS)
>   			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>   
> +		count = 8;
>   		if (!HAS_FLAT_CCS(rq->engine->i915))
> -			count = 8 + 4;
> -		else
> -			count = 8;
> +			count += 10;
>   
>   		cs = intel_ring_begin(rq, count);
>   		if (IS_ERR(cs))
> @@ -330,7 +339,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>   			aux_inv = rq->engine->mask &
>   				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
>   			if (aux_inv)
> -				cmd += 4;
> +				cmd += 10;
>   		}
>   	}
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5df7cce23197c..2bd8d98d21102 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -121,6 +121,7 @@
>   #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
>   #define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
>   #define MI_SEMAPHORE_WAIT_TOKEN	MI_INSTR(0x1c, 3) /* GEN12+ */
> +#define   MI_SEMAPHORE_REGISTER_POLL	(1 << 16)
>   #define   MI_SEMAPHORE_POLL		(1 << 15)
>   #define   MI_SEMAPHORE_SAD_GT_SDD	(0 << 12)
>   #define   MI_SEMAPHORE_SAD_GTE_SDD	(1 << 12)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Intel-gfx] [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines
  2023-07-17 17:30 ` [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
@ 2023-07-17 19:11   ` Andrzej Hajda
  2023-07-17 22:00     ` Andi Shyti
  2023-07-17 20:27   ` Matt Roper
  1 sibling, 1 reply; 26+ messages in thread
From: Andrzej Hajda @ 2023-07-17 19:11 UTC (permalink / raw)
  To: Andi Shyti, Jonathan Cavitt, Matt Roper, Chris Wilson,
	Mika Kuoppala, Nirmoy Das
  Cc: Intel GFX, DRI Devel

On 17.07.2023 19:30, Andi Shyti wrote:
> Perform some refactoring with the purpose of keeping in one
> single place all the operations around the aux table
> invalidation.
> 
> With this refactoring add more engines where the invalidation
> should be performed.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>   drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 63 +++++++++++++++---------
>   drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
>   drivers/gpu/drm/i915/gt/intel_lrc.c      | 17 +------
>   3 files changed, 44 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index fbc70f3b7f2fd..6d21a1ac06e73 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -165,7 +165,8 @@ static u32 preparser_disable(bool state)
>   	return MI_ARB_CHECK | 1 << 8 | state;
>   }
>   
> -u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
> +static u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs,
> +				     const i915_reg_t inv_reg)
>   {
>   	u32 gsi_offset = gt->uncore->gsi_offset;
>   
> @@ -187,6 +188,40 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
>   	return cs;
>   }
>   
> +static i915_reg_t intel_get_aux_inv_reg(struct intel_engine_cs *engine)
> +{
> +	if (HAS_FLAT_CCS(engine->i915))
> +		return _MMIO(0);

Why not INVALID_MMIO_REG ? Here and below.

> +
> +	switch (engine->id) {
> +	case RCS0:
> +		return GEN12_CCS_AUX_INV;
> +	case VCS0:
> +		return GEN12_VD0_AUX_INV;
> +	case VCS2:
> +		return GEN12_VD2_AUX_INV;
> +	case VECS0:
> +		return GEN12_VE0_AUX_INV;
> +	default:
> +		return _MMIO(0);
> +	}
> +}
> +
> +static bool intel_engine_has_aux_inv(struct intel_engine_cs *engine)
> +{
> +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> +
> +	return !!reg.reg;

  return i915_mmio_reg_valid(intel_get_aux_inv_reg(engine));

> +}
> +
> +u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
> +{
> +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> +	struct intel_gt *gt = engine->gt;
> +
> +	return reg.reg ? gen12_emit_aux_table_inv(gt, cs, reg) : cs;
> +}
> +

I am not sure about prefixes, IMHO gen12_ instead of intel_ is more 
adequate as this is only gen12 feature, works only on gen12, and is 
called from gen12 context, up to you. In any case we can squash 
intel_emit_aux_table_inv and gen12_emit_aux_table_inv into one function, 
am I right?

Regards
Andrzej

>   static int mtl_dummy_pipe_control(struct i915_request *rq)
>   {
>   	/* Wa_14016712196 */
> @@ -311,11 +346,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   
>   		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
>   
> -		if (!HAS_FLAT_CCS(rq->engine->i915)) {
> -			/* hsdes: 1809175790 */
> -			cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
> -						      GEN12_CCS_AUX_INV);
> -		}
> +		cs = intel_emit_aux_table_inv(engine, cs);
>   
>   		*cs++ = preparser_disable(false);
>   		intel_ring_advance(rq, cs);
> @@ -326,21 +357,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>   
>   int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>   {
> -	intel_engine_mask_t aux_inv = 0;
>   	u32 cmd, *cs;
>   
>   	cmd = 4;
>   	if (mode & EMIT_INVALIDATE) {
>   		cmd += 2;
>   
> -		if (!HAS_FLAT_CCS(rq->engine->i915) &&
> -		    (rq->engine->class == VIDEO_DECODE_CLASS ||
> -		     rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
> -			aux_inv = rq->engine->mask &
> -				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
> -			if (aux_inv)
> -				cmd += 10;
> -		}
> +		if (intel_engine_has_aux_inv(rq->engine))
> +			cmd += 10;
>   	}
>   
>   	cs = intel_ring_begin(rq, cmd);
> @@ -371,14 +395,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>   	*cs++ = 0; /* upper addr */
>   	*cs++ = 0; /* value */
>   
> -	if (aux_inv) { /* hsdes: 1809175790 */
> -		if (rq->engine->class == VIDEO_DECODE_CLASS)
> -			cs = gen12_emit_aux_table_inv(rq->engine->gt,
> -						      cs, GEN12_VD0_AUX_INV);
> -		else
> -			cs = gen12_emit_aux_table_inv(rq->engine->gt,
> -						      cs, GEN12_VE0_AUX_INV);
> -	}
> +	cs = intel_emit_aux_table_inv(rq->engine, cs);
>   
>   	if (mode & EMIT_INVALIDATE)
>   		*cs++ = preparser_disable(false);
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> index 655e5c00ddc27..d938c94524510 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> @@ -13,6 +13,7 @@
>   #include "intel_gt_regs.h"
>   #include "intel_gpu_commands.h"
>   
> +struct intel_engine_cs;
>   struct intel_gt;
>   struct i915_request;
>   
> @@ -46,7 +47,7 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
>   u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
>   u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
>   
> -u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg);
> +u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs);
>   
>   static inline u32 *
>   __gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 235f3fab60a98..70054767c88c3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1371,10 +1371,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
>   	    IS_DG2_G11(ce->engine->i915))
>   		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
>   
> -	/* hsdes: 1809175790 */
> -	if (!HAS_FLAT_CCS(ce->engine->i915))
> -		cs = gen12_emit_aux_table_inv(ce->engine->gt,
> -					      cs, GEN12_CCS_AUX_INV);
> +	cs = intel_emit_aux_table_inv(ce->engine, cs);
>   
>   	/* Wa_16014892111 */
>   	if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
> @@ -1399,17 +1396,7 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
>   						    PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE,
>   						    0);
>   
> -	/* hsdes: 1809175790 */
> -	if (!HAS_FLAT_CCS(ce->engine->i915)) {
> -		if (ce->engine->class == VIDEO_DECODE_CLASS)
> -			cs = gen12_emit_aux_table_inv(ce->engine->gt,
> -						      cs, GEN12_VD0_AUX_INV);
> -		else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
> -			cs = gen12_emit_aux_table_inv(ce->engine->gt,
> -						      cs, GEN12_VE0_AUX_INV);
> -	}
> -
> -	return cs;
> +	return intel_emit_aux_table_inv(ce->engine, cs);
>   }
>   
>   static void


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
  2023-07-17 17:30 ` [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control Andi Shyti
  2023-07-17 18:23   ` [Intel-gfx] " Andrzej Hajda
@ 2023-07-17 19:40   ` Matt Roper
  1 sibling, 0 replies; 26+ messages in thread
From: Matt Roper @ 2023-07-17 19:40 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 07:30:57PM +0200, Andi Shyti wrote:
> Enable the CCS_FLUSH bit 13 in the control pipe for render and
> compute engines in platforms starting from Meteor Lake (BSPEC
> 43904 and 47112).
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Cc: Nirmoy Das <nirmoy.das@intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 10 +++++++++-
>  drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 3c935d6b68bf0..aa2fb9d72745a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -207,7 +207,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  	 * memory traffic is quiesced prior.
>  	 */
>  	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> -		mode |= EMIT_FLUSH;
> +		mode |= EMIT_FLUSH | EMIT_CCS_FLUSH;

Do we even really need the extra EMIT_* flag?  It seems like just doing
the CCS flush on graphics 12.70 and beyond would probably be fine since
EMIT_FLUSH is only used in two places on those platforms:  an
EMIT_BARRIER in intel_engine_emit_ctx_wa (which happens during device
init, before we've had an opportunity to use CCS for anything) and the
new flush we're adding here in aux invalidation.  All other uses of
EMIT_FLUSH in the driver are specific to non-GuC execlist submission or
to the old ringbuffer-based submission on pre-gen8 platforms.

Anyway, adding the extra condition shouldn't really hurt anything
either, so up to you whether you want to drop it or not.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

>  
>  	if (mode & EMIT_FLUSH) {
>  		u32 bit_group_0 = 0;
> @@ -221,6 +221,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  
>  		bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
>  
> +		/*
> +		 * When required, in MTL+ platforms we need to
> +		 * set the CCS_FLUSH bit in the pipe control
> +		 */
> +		if (mode & EMIT_CCS_FLUSH &&
> +		    GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> +			bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> +
>  		bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
>  		bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
>  		bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index e99a6fa03d453..e2cae9d02bd62 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -514,6 +514,7 @@ struct intel_engine_cs {
>  	int		(*emit_flush)(struct i915_request *request, u32 mode);
>  #define EMIT_INVALIDATE	BIT(0)
>  #define EMIT_FLUSH	BIT(1)
> +#define EMIT_CCS_FLUSH	BIT(2) /* MTL+ */
>  #define EMIT_BARRIER	(EMIT_INVALIDATE | EMIT_FLUSH)
>  	int		(*emit_bb_start)(struct i915_request *rq,
>  					 u64 offset, u32 length,
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5d143e2a8db03..5df7cce23197c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -299,6 +299,7 @@
>  #define   PIPE_CONTROL_QW_WRITE				(1<<14)
>  #define   PIPE_CONTROL_POST_SYNC_OP_MASK                (3<<14)
>  #define   PIPE_CONTROL_DEPTH_STALL			(1<<13)
> +#define   PIPE_CONTROL_CCS_FLUSH			(1<<13) /* MTL+ */
>  #define   PIPE_CONTROL_WRITE_FLUSH			(1<<12)
>  #define   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH	(1<<12) /* gen6+ */
>  #define   PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE	(1<<11) /* MBZ on ILK */
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation
  2023-07-17 17:30 ` [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
  2023-07-17 18:51   ` [Intel-gfx] " Andrzej Hajda
@ 2023-07-17 20:05   ` Matt Roper
  1 sibling, 0 replies; 26+ messages in thread
From: Matt Roper @ 2023-07-17 20:05 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 07:30:58PM +0200, Andi Shyti wrote:
> From: Jonathan Cavitt <jonathan.cavitt@intel.com>
> 
> For platforms that use Aux CCS, wait for aux invalidation to
> complete by checking the aux invalidation register bit is
> cleared.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 17 +++++++++++++----
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index aa2fb9d72745a..fbc70f3b7f2fd 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -174,6 +174,16 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
>  	*cs++ = AUX_INV;
>  	*cs++ = MI_NOOP;

We only need qword alignment for sequences of commands, not each
individual command, right?  So technically we could drop this noop...

>  
> +	*cs++ = MI_SEMAPHORE_WAIT_TOKEN |
> +		MI_SEMAPHORE_REGISTER_POLL |
> +		MI_SEMAPHORE_POLL |
> +		MI_SEMAPHORE_SAD_EQ_SDD;
> +	*cs++ = 0;
> +	*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
> +	*cs++ = 0;
> +	*cs++ = 0;
> +	*cs++ = MI_NOOP;

...and then we wouldn't need an extra one here.

If we drop the pair of noops, that would also change the # of dwords
farther down too.

> +
>  	return cs;
>  }
>  
> @@ -284,10 +294,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  		else if (engine->class == COMPUTE_CLASS)
>  			flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>  
> +		count = 8;
>  		if (!HAS_FLAT_CCS(rq->engine->i915))

As noted on the earlier patch, we should probably make this check that
the platform actually has AuxCCS.  

Anyway, up to you whether you want to make that change or not.  The
extra noops don't actually hurt anything.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> -			count = 8 + 4;
> -		else
> -			count = 8;
> +			count += 10;
>  
>  		cs = intel_ring_begin(rq, count);
>  		if (IS_ERR(cs))
> @@ -330,7 +339,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>  			aux_inv = rq->engine->mask &
>  				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
>  			if (aux_inv)
> -				cmd += 4;
> +				cmd += 10;
>  		}
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5df7cce23197c..2bd8d98d21102 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -121,6 +121,7 @@
>  #define   MI_SEMAPHORE_TARGET(engine)	((engine)<<15)
>  #define MI_SEMAPHORE_WAIT	MI_INSTR(0x1c, 2) /* GEN8+ */
>  #define MI_SEMAPHORE_WAIT_TOKEN	MI_INSTR(0x1c, 3) /* GEN12+ */
> +#define   MI_SEMAPHORE_REGISTER_POLL	(1 << 16)
>  #define   MI_SEMAPHORE_POLL		(1 << 15)
>  #define   MI_SEMAPHORE_SAD_GT_SDD	(0 << 12)
>  #define   MI_SEMAPHORE_SAD_GTE_SDD	(1 << 12)
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines
  2023-07-17 17:30 ` [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
  2023-07-17 19:11   ` [Intel-gfx] " Andrzej Hajda
@ 2023-07-17 20:27   ` Matt Roper
  2023-07-17 22:02     ` Andi Shyti
  1 sibling, 1 reply; 26+ messages in thread
From: Matt Roper @ 2023-07-17 20:27 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 07:30:59PM +0200, Andi Shyti wrote:
> Perform some refactoring with the purpose of keeping in one
> single place all the operations around the aux table
> invalidation.
> 
> With this refactoring add more engines where the invalidation
> should be performed.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: <stable@vger.kernel.org> # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 63 +++++++++++++++---------
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
>  drivers/gpu/drm/i915/gt/intel_lrc.c      | 17 +------
>  3 files changed, 44 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index fbc70f3b7f2fd..6d21a1ac06e73 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -165,7 +165,8 @@ static u32 preparser_disable(bool state)
>  	return MI_ARB_CHECK | 1 << 8 | state;
>  }
>  
> -u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
> +static u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs,
> +				     const i915_reg_t inv_reg)
>  {
>  	u32 gsi_offset = gt->uncore->gsi_offset;
>  
> @@ -187,6 +188,40 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
>  	return cs;
>  }
>  
> +static i915_reg_t intel_get_aux_inv_reg(struct intel_engine_cs *engine)

Generally we try to avoid putting "intel_" and "i915_" prefixes on
static functions.

> +{
> +	if (HAS_FLAT_CCS(engine->i915))
> +		return _MMIO(0);
> +
> +	switch (engine->id) {
> +	case RCS0:
> +		return GEN12_CCS_AUX_INV;
> +	case VCS0:
> +		return GEN12_VD0_AUX_INV;
> +	case VCS2:
> +		return GEN12_VD2_AUX_INV;
> +	case VECS0:
> +		return GEN12_VE0_AUX_INV;

We need CCS0 here (0x42c0).  And on graphics versions 12.70 and beyond
we also need BCS0 too (0x4248) since the blitter gained the ability to
interpret CCS compression.

> +	default:
> +		return _MMIO(0);

It might be cleaner to use INVALID_MMIO_REG here (and then check for
i915_mmio_reg_valid() below).

> +	}
> +}
> +
> +static bool intel_engine_has_aux_inv(struct intel_engine_cs *engine)
> +{
> +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> +
> +	return !!reg.reg;
> +}
> +
> +u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
> +{
> +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> +	struct intel_gt *gt = engine->gt;
> +
> +	return reg.reg ? gen12_emit_aux_table_inv(gt, cs, reg) : cs;
> +}

Rather than adding this new wrapper function, can we just do the
register lookup at the top of gen12_emit_aux_table_inv() (and bail out
of that function early if there isn't a valid register)?

Keeping the non-static function as the one with "gen12" in the name also
helps reduce confusion about whether this is something that older
platforms should have been calling as well.


Matt

> +
>  static int mtl_dummy_pipe_control(struct i915_request *rq)
>  {
>  	/* Wa_14016712196 */
> @@ -311,11 +346,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  
>  		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
>  
> -		if (!HAS_FLAT_CCS(rq->engine->i915)) {
> -			/* hsdes: 1809175790 */
> -			cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
> -						      GEN12_CCS_AUX_INV);
> -		}
> +		cs = intel_emit_aux_table_inv(engine, cs);
>  
>  		*cs++ = preparser_disable(false);
>  		intel_ring_advance(rq, cs);
> @@ -326,21 +357,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>  
>  int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>  {
> -	intel_engine_mask_t aux_inv = 0;
>  	u32 cmd, *cs;
>  
>  	cmd = 4;
>  	if (mode & EMIT_INVALIDATE) {
>  		cmd += 2;
>  
> -		if (!HAS_FLAT_CCS(rq->engine->i915) &&
> -		    (rq->engine->class == VIDEO_DECODE_CLASS ||
> -		     rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
> -			aux_inv = rq->engine->mask &
> -				~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
> -			if (aux_inv)
> -				cmd += 10;
> -		}
> +		if (intel_engine_has_aux_inv(rq->engine))
> +			cmd += 10;
>  	}
>  
>  	cs = intel_ring_begin(rq, cmd);
> @@ -371,14 +395,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>  	*cs++ = 0; /* upper addr */
>  	*cs++ = 0; /* value */
>  
> -	if (aux_inv) { /* hsdes: 1809175790 */
> -		if (rq->engine->class == VIDEO_DECODE_CLASS)
> -			cs = gen12_emit_aux_table_inv(rq->engine->gt,
> -						      cs, GEN12_VD0_AUX_INV);
> -		else
> -			cs = gen12_emit_aux_table_inv(rq->engine->gt,
> -						      cs, GEN12_VE0_AUX_INV);
> -	}
> +	cs = intel_emit_aux_table_inv(rq->engine, cs);
>  
>  	if (mode & EMIT_INVALIDATE)
>  		*cs++ = preparser_disable(false);
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> index 655e5c00ddc27..d938c94524510 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> @@ -13,6 +13,7 @@
>  #include "intel_gt_regs.h"
>  #include "intel_gpu_commands.h"
>  
> +struct intel_engine_cs;
>  struct intel_gt;
>  struct i915_request;
>  
> @@ -46,7 +47,7 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
>  u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
>  u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
>  
> -u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg);
> +u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs);
>  
>  static inline u32 *
>  __gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 235f3fab60a98..70054767c88c3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1371,10 +1371,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
>  	    IS_DG2_G11(ce->engine->i915))
>  		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
>  
> -	/* hsdes: 1809175790 */
> -	if (!HAS_FLAT_CCS(ce->engine->i915))
> -		cs = gen12_emit_aux_table_inv(ce->engine->gt,
> -					      cs, GEN12_CCS_AUX_INV);
> +	cs = intel_emit_aux_table_inv(ce->engine, cs);
>  
>  	/* Wa_16014892111 */
>  	if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
> @@ -1399,17 +1396,7 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
>  						    PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE,
>  						    0);
>  
> -	/* hsdes: 1809175790 */
> -	if (!HAS_FLAT_CCS(ce->engine->i915)) {
> -		if (ce->engine->class == VIDEO_DECODE_CLASS)
> -			cs = gen12_emit_aux_table_inv(ce->engine->gt,
> -						      cs, GEN12_VD0_AUX_INV);
> -		else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
> -			cs = gen12_emit_aux_table_inv(ce->engine->gt,
> -						      cs, GEN12_VE0_AUX_INV);
> -	}
> -
> -	return cs;
> +	return intel_emit_aux_table_inv(ce->engine, cs);
>  }
>  
>  static void
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-17 17:54   ` Matt Roper
@ 2023-07-17 20:31     ` Matt Roper
  2023-07-17 21:52       ` Andi Shyti
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Roper @ 2023-07-17 20:31 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 10:54:37AM -0700, Matt Roper wrote:
> On Mon, Jul 17, 2023 at 07:30:55PM +0200, Andi Shyti wrote:
> > From: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > 
> > All memory traffic must be quiesced before requesting
> > an aux invalidation on platforms that use Aux CCS.
> > 
> > Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: <stable@vger.kernel.org> # v5.8+
> > ---
> >  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > index 563efee055602..bee3b7dc595cf 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > @@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
> >  {
> >  	struct intel_engine_cs *engine = rq->engine;
> >  
> > +	/*
> > +	 * Aux invalidations on Aux CCS platforms require
> > +	 * memory traffic is quiesced prior.
> > +	 */
> > +	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> 
> It's a pre-existing mistake in drm-tip at the moment, but we shouldn't
> assume !flatccs always implies auxccs.  PVC has neither, and there may
> be other similar platforms in the future.  We should probably add a
> helper function for AuxCCS, similar to what we added to the Xe driver
> recently:
> 
> https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1
> 

BTW, since this patch didn't handle it I was expecting to see another
patch in the series that quiesces memory for the non-RCS/CCS engines,
but it looks like there isn't one yet.  So we should probably add the
necessary MI_FLUSH_DW logic for the other engines to this patch as well.


Matt

> 
> Matt
> 
> 
> > +		mode |= EMIT_FLUSH;
> > +
> >  	if (mode & EMIT_FLUSH) {
> >  		u32 flags = 0;
> >  		int err;
> > -- 
> > 2.40.1
> > 
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-17 20:31     ` Matt Roper
@ 2023-07-17 21:52       ` Andi Shyti
  2023-07-17 22:00         ` Matt Roper
  0 siblings, 1 reply; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 21:52 UTC (permalink / raw)
  To: Matt Roper
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Andi Shyti, Nirmoy Das

Hi Matt,

On Mon, Jul 17, 2023 at 01:31:03PM -0700, Matt Roper wrote:
> On Mon, Jul 17, 2023 at 10:54:37AM -0700, Matt Roper wrote:
> > On Mon, Jul 17, 2023 at 07:30:55PM +0200, Andi Shyti wrote:
> > > From: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > 
> > > All memory traffic must be quiesced before requesting
> > > an aux invalidation on platforms that use Aux CCS.
> > > 
> > > Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > > Cc: <stable@vger.kernel.org> # v5.8+
> > > ---
> > >  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > > index 563efee055602..bee3b7dc595cf 100644
> > > --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > > +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > > @@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
> > >  {
> > >  	struct intel_engine_cs *engine = rq->engine;
> > >  
> > > +	/*
> > > +	 * Aux invalidations on Aux CCS platforms require
> > > +	 * memory traffic is quiesced prior.
> > > +	 */
> > > +	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> > 
> > It's a pre-existing mistake in drm-tip at the moment, but we shouldn't
> > assume !flatccs always implies auxccs.  PVC has neither, and there may
> > be other similar platforms in the future.  We should probably add a
> > helper function for AuxCCS, similar to what we added to the Xe driver
> > recently:
> > 
> > https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1

Currently that is done in patch 6...

> BTW, since this patch didn't handle it I was expecting to see another
> patch in the series that quiesces memory for the non-RCS/CCS engines,
> but it looks like there isn't one yet.  So we should probably add the
> necessary MI_FLUSH_DW logic for the other engines to this patch as well.

... where also other engines are handles as well. I left this
patch as it is in order to preserve the authorship and it's
original form.

Maybe in patch 6 I can add the extra check for PVC as you did for
XE.

Thanks,
Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  2023-07-17 18:21   ` [Intel-gfx] " Andrzej Hajda
@ 2023-07-17 21:54     ` Andi Shyti
  0 siblings, 0 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 21:54 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Andi Shyti, Matt Roper, Nirmoy Das

Hi Andrzej,

On Mon, Jul 17, 2023 at 08:21:40PM +0200, Andrzej Hajda wrote:
> On 17.07.2023 19:30, Andi Shyti wrote:
> > In preparation of the next patch allign with the datasheet (BSPEC
> > 47112) with the naming of the pipe control set of flag values.
> > The variable "flags" in gen12_emit_flush_rcs() is applied as a
> > set of flags called Bit Group 1.
> > 
> > Define also the Bit Group 0 as bit_group_0 where currently only
> > PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
> > 
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: <stable@vger.kernel.org> # v5.8+
> > ---
> >   drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +++++++++++++-----------
> >   1 file changed, 18 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > index bee3b7dc595cf..3c935d6b68bf0 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > @@ -210,7 +210,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
> >   		mode |= EMIT_FLUSH;
> >   	if (mode & EMIT_FLUSH) {
> > -		u32 flags = 0;
> > +		u32 bit_group_0 = 0;
> > +		u32 bit_group_1 = 0;
> 
> For me flags0 and flags1 seems more readable as in
> *gen8_emit_pipe_control(batch, flags0, flags1, offset), but no strong
> feelings, but if bit_group is chosen arguments of *gen8_emit_pipe_control
> could be changed as well.

yes, I thought to update all of them, as well for consistency. I
like the name bit_group_0/1 because it's similar to the
datasheet.

I had some confusion about it earlier and I think this can help
to improve readability.

Will update all the other functions, as well.

> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Thanks!

Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Intel-gfx] [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines
  2023-07-17 19:11   ` [Intel-gfx] " Andrzej Hajda
@ 2023-07-17 22:00     ` Andi Shyti
  0 siblings, 0 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 22:00 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Andi Shyti, Matt Roper, Nirmoy Das

Hi Andrzej,

On Mon, Jul 17, 2023 at 09:11:26PM +0200, Andrzej Hajda wrote:
> On 17.07.2023 19:30, Andi Shyti wrote:
> > Perform some refactoring with the purpose of keeping in one
> > single place all the operations around the aux table
> > invalidation.
> > 
> > With this refactoring add more engines where the invalidation
> > should be performed.
> > 
> > Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: <stable@vger.kernel.org> # v5.8+
> > ---
> >   drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 63 +++++++++++++++---------
> >   drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
> >   drivers/gpu/drm/i915/gt/intel_lrc.c      | 17 +------
> >   3 files changed, 44 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > index fbc70f3b7f2fd..6d21a1ac06e73 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > @@ -165,7 +165,8 @@ static u32 preparser_disable(bool state)
> >   	return MI_ARB_CHECK | 1 << 8 | state;
> >   }
> > -u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
> > +static u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs,
> > +				     const i915_reg_t inv_reg)
> >   {
> >   	u32 gsi_offset = gt->uncore->gsi_offset;
> > @@ -187,6 +188,40 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
> >   	return cs;
> >   }
> > +static i915_reg_t intel_get_aux_inv_reg(struct intel_engine_cs *engine)
> > +{
> > +	if (HAS_FLAT_CCS(engine->i915))
> > +		return _MMIO(0);
> 
> Why not INVALID_MMIO_REG ? Here and below.
> 
> > +
> > +	switch (engine->id) {
> > +	case RCS0:
> > +		return GEN12_CCS_AUX_INV;
> > +	case VCS0:
> > +		return GEN12_VD0_AUX_INV;
> > +	case VCS2:
> > +		return GEN12_VD2_AUX_INV;
> > +	case VECS0:
> > +		return GEN12_VE0_AUX_INV;
> > +	default:
> > +		return _MMIO(0);
> > +	}
> > +}
> > +
> > +static bool intel_engine_has_aux_inv(struct intel_engine_cs *engine)
> > +{
> > +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> > +
> > +	return !!reg.reg;
> 
>  return i915_mmio_reg_valid(intel_get_aux_inv_reg(engine));
> 
> > +}
> > +
> > +u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
> > +{
> > +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> > +	struct intel_gt *gt = engine->gt;
> > +
> > +	return reg.reg ? gen12_emit_aux_table_inv(gt, cs, reg) : cs;
> > +}
> > +
> 
> I am not sure about prefixes, IMHO gen12_ instead of intel_ is more adequate
> as this is only gen12 feature, works only on gen12, and is called from gen12
> context, up to you. In any case we can squash intel_emit_aux_table_inv and
> gen12_emit_aux_table_inv into one function, am I right?

you and Matt have made exactly the same comments... will fix
them, Thank you!

Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-17 21:52       ` Andi Shyti
@ 2023-07-17 22:00         ` Matt Roper
  2023-07-18  0:28           ` Andi Shyti
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Roper @ 2023-07-17 22:00 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Mon, Jul 17, 2023 at 11:52:25PM +0200, Andi Shyti wrote:
> Hi Matt,
> 
> On Mon, Jul 17, 2023 at 01:31:03PM -0700, Matt Roper wrote:
> > On Mon, Jul 17, 2023 at 10:54:37AM -0700, Matt Roper wrote:
> > > On Mon, Jul 17, 2023 at 07:30:55PM +0200, Andi Shyti wrote:
> > > > From: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > > 
> > > > All memory traffic must be quiesced before requesting
> > > > an aux invalidation on platforms that use Aux CCS.
> > > > 
> > > > Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> > > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > > > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > > > Cc: <stable@vger.kernel.org> # v5.8+
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++++++
> > > >  1 file changed, 7 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > > > index 563efee055602..bee3b7dc595cf 100644
> > > > --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > > > +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > > > @@ -202,6 +202,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
> > > >  {
> > > >  	struct intel_engine_cs *engine = rq->engine;
> > > >  
> > > > +	/*
> > > > +	 * Aux invalidations on Aux CCS platforms require
> > > > +	 * memory traffic is quiesced prior.
> > > > +	 */
> > > > +	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> > > 
> > > It's a pre-existing mistake in drm-tip at the moment, but we shouldn't
> > > assume !flatccs always implies auxccs.  PVC has neither, and there may
> > > be other similar platforms in the future.  We should probably add a
> > > helper function for AuxCCS, similar to what we added to the Xe driver
> > > recently:
> > > 
> > > https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1
> 
> Currently that is done in patch 6...

Are you sure?  Patch #6 consolidates things a bit, but is still incorrectly
assuming flatccs = !auxccs:

       if (HAS_FLAT_CCS(engine->i915))                                                                                                            
               return _MMIO(0);                                                                                                                   

> 
> > BTW, since this patch didn't handle it I was expecting to see another
> > patch in the series that quiesces memory for the non-RCS/CCS engines,
> > but it looks like there isn't one yet.  So we should probably add the
> > necessary MI_FLUSH_DW logic for the other engines to this patch as well.
> 
> ... where also other engines are handles as well. I left this
> patch as it is in order to preserve the authorship and it's
> original form.

I don't see it being handled in patch #6.  That performs invalidation on
more engines than we were before, but it doesn't add the missing quiesce
logic as far as I can see.


Matt

> 
> Maybe in patch 6 I can add the extra check for PVC as you did for
> XE.
> 
> Thanks,
> Andi

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines
  2023-07-17 20:27   ` Matt Roper
@ 2023-07-17 22:02     ` Andi Shyti
  0 siblings, 0 replies; 26+ messages in thread
From: Andi Shyti @ 2023-07-17 22:02 UTC (permalink / raw)
  To: Matt Roper
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Andi Shyti, Nirmoy Das

Hi Matt,

On Mon, Jul 17, 2023 at 01:27:09PM -0700, Matt Roper wrote:
> On Mon, Jul 17, 2023 at 07:30:59PM +0200, Andi Shyti wrote:
> > Perform some refactoring with the purpose of keeping in one
> > single place all the operations around the aux table
> > invalidation.
> > 
> > With this refactoring add more engines where the invalidation
> > should be performed.
> > 
> > Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: <stable@vger.kernel.org> # v5.8+
> > ---
> >  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 63 +++++++++++++++---------
> >  drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
> >  drivers/gpu/drm/i915/gt/intel_lrc.c      | 17 +------
> >  3 files changed, 44 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > index fbc70f3b7f2fd..6d21a1ac06e73 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> > @@ -165,7 +165,8 @@ static u32 preparser_disable(bool state)
> >  	return MI_ARB_CHECK | 1 << 8 | state;
> >  }
> >  
> > -u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
> > +static u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs,
> > +				     const i915_reg_t inv_reg)
> >  {
> >  	u32 gsi_offset = gt->uncore->gsi_offset;
> >  
> > @@ -187,6 +188,40 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
> >  	return cs;
> >  }
> >  
> > +static i915_reg_t intel_get_aux_inv_reg(struct intel_engine_cs *engine)
> 
> Generally we try to avoid putting "intel_" and "i915_" prefixes on
> static functions.
> 
> > +{
> > +	if (HAS_FLAT_CCS(engine->i915))
> > +		return _MMIO(0);
> > +
> > +	switch (engine->id) {
> > +	case RCS0:
> > +		return GEN12_CCS_AUX_INV;
> > +	case VCS0:
> > +		return GEN12_VD0_AUX_INV;
> > +	case VCS2:
> > +		return GEN12_VD2_AUX_INV;
> > +	case VECS0:
> > +		return GEN12_VE0_AUX_INV;
> 
> We need CCS0 here (0x42c0).  And on graphics versions 12.70 and beyond
> we also need BCS0 too (0x4248) since the blitter gained the ability to
> interpret CCS compression.

Thanks, will add.

> > +	default:
> > +		return _MMIO(0);
> 
> It might be cleaner to use INVALID_MMIO_REG here (and then check for
> i915_mmio_reg_valid() below).
> 
> > +	}
> > +}
> > +
> > +static bool intel_engine_has_aux_inv(struct intel_engine_cs *engine)
> > +{
> > +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> > +
> > +	return !!reg.reg;
> > +}
> > +
> > +u32 *intel_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
> > +{
> > +	i915_reg_t reg = intel_get_aux_inv_reg(engine);
> > +	struct intel_gt *gt = engine->gt;
> > +
> > +	return reg.reg ? gen12_emit_aux_table_inv(gt, cs, reg) : cs;
> > +}
> 
> Rather than adding this new wrapper function, can we just do the
> register lookup at the top of gen12_emit_aux_table_inv() (and bail out
> of that function early if there isn't a valid register)?
> 
> Keeping the non-static function as the one with "gen12" in the name also
> helps reduce confusion about whether this is something that older
> platforms should have been calling as well.

You and Andrzej have made the same comments, will fix them.

BTW, this set of functions are doing more or less a similar thing
to what you have done here[*]. I will add the PVC flag.

Thank you!
Andi

[*] https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-17 22:00         ` Matt Roper
@ 2023-07-18  0:28           ` Andi Shyti
  2023-07-18 15:53             ` Matt Roper
  0 siblings, 1 reply; 26+ messages in thread
From: Andi Shyti @ 2023-07-18  0:28 UTC (permalink / raw)
  To: Matt Roper
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Andi Shyti, Nirmoy Das

Hi Matt,

> > > > > +	/*
> > > > > +	 * Aux invalidations on Aux CCS platforms require
> > > > > +	 * memory traffic is quiesced prior.
> > > > > +	 */
> > > > > +	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> > > > 
> > > > It's a pre-existing mistake in drm-tip at the moment, but we shouldn't
> > > > assume !flatccs always implies auxccs.  PVC has neither, and there may
> > > > be other similar platforms in the future.  We should probably add a
> > > > helper function for AuxCCS, similar to what we added to the Xe driver
> > > > recently:
> > > > 
> > > > https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1
> > 
> > Currently that is done in patch 6...
> 
> Are you sure?  Patch #6 consolidates things a bit, but is still incorrectly
> assuming flatccs = !auxccs:
> 
>        if (HAS_FLAT_CCS(engine->i915))                                                                                                            
>                return _MMIO(0);                                                                                                                   

But isn't it the same the patch you linked is doing?

	return !xe->info.has_flat_ccs;

And

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation
  2023-07-18  0:28           ` Andi Shyti
@ 2023-07-18 15:53             ` Matt Roper
  0 siblings, 0 replies; 26+ messages in thread
From: Matt Roper @ 2023-07-18 15:53 UTC (permalink / raw)
  To: Andi Shyti
  Cc: Mika Kuoppala, Intel GFX, Jonathan Cavitt, DRI Devel,
	Chris Wilson, Nirmoy Das

On Tue, Jul 18, 2023 at 02:28:26AM +0200, Andi Shyti wrote:
> Hi Matt,
> 
> > > > > > +	/*
> > > > > > +	 * Aux invalidations on Aux CCS platforms require
> > > > > > +	 * memory traffic is quiesced prior.
> > > > > > +	 */
> > > > > > +	if ((mode & EMIT_INVALIDATE) && !HAS_FLAT_CCS(engine->i915))
> > > > > 
> > > > > It's a pre-existing mistake in drm-tip at the moment, but we shouldn't
> > > > > assume !flatccs always implies auxccs.  PVC has neither, and there may
> > > > > be other similar platforms in the future.  We should probably add a
> > > > > helper function for AuxCCS, similar to what we added to the Xe driver
> > > > > recently:
> > > > > 
> > > > > https://patchwork.freedesktop.org/patch/539304/?series=118334&rev=1
> > > 
> > > Currently that is done in patch 6...
> > 
> > Are you sure?  Patch #6 consolidates things a bit, but is still incorrectly
> > assuming flatccs = !auxccs:
> > 
> >        if (HAS_FLAT_CCS(engine->i915))                                                                                                            
> >                return _MMIO(0);                                                                                                                   
> 
> But isn't it the same the patch you linked is doing?
> 
> 	return !xe->info.has_flat_ccs;

No, that's just the end of the function.  The important
platform-specific checks come before that point (at the moment we only
have PVC, but we expect more platforms to be added there very soon too).


Matt

> 
> And

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-07-18 15:53 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-17 17:30 [PATCH v4 0/6] Update AUX invalidation sequence Andi Shyti
2023-07-17 17:30 ` [PATCH v4 1/6] drm/i915/gt: Cleanup aux invalidation registers Andi Shyti
2023-07-17 17:30 ` [PATCH v4 2/6] drm/i915/gt: Ensure memory quiesced before invalidation Andi Shyti
2023-07-17 17:54   ` Matt Roper
2023-07-17 20:31     ` Matt Roper
2023-07-17 21:52       ` Andi Shyti
2023-07-17 22:00         ` Matt Roper
2023-07-18  0:28           ` Andi Shyti
2023-07-18 15:53             ` Matt Roper
2023-07-17 17:30 ` [PATCH v4 3/6] drm/i915/gt: Rename flags with bit_group_X according to the datasheet Andi Shyti
2023-07-17 17:32   ` Andi Shyti
2023-07-17 17:59   ` Matt Roper
2023-07-17 18:21   ` [Intel-gfx] " Andrzej Hajda
2023-07-17 21:54     ` Andi Shyti
2023-07-17 18:45   ` Nirmoy Das
2023-07-17 17:30 ` [PATCH v4 4/6] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control Andi Shyti
2023-07-17 18:23   ` [Intel-gfx] " Andrzej Hajda
2023-07-17 19:40   ` Matt Roper
2023-07-17 17:30 ` [PATCH v4 5/6] drm/i915/gt: Poll aux invalidation register bit on invalidation Andi Shyti
2023-07-17 18:51   ` [Intel-gfx] " Andrzej Hajda
2023-07-17 20:05   ` Matt Roper
2023-07-17 17:30 ` [PATCH v4 6/6] drm/i915/gt: Support aux invalidation on all engines Andi Shyti
2023-07-17 19:11   ` [Intel-gfx] " Andrzej Hajda
2023-07-17 22:00     ` Andi Shyti
2023-07-17 20:27   ` Matt Roper
2023-07-17 22:02     ` Andi Shyti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).