* [PATCH 00/12] Random assortment of (mostly) GuC related patches
@ 2022-07-12 23:31 ` John.C.Harrison
  0 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Pushing a bunch of patches which had been forgotten about.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


Alan Previn (1):
  drm/i915/guc: Add a helper for log buffer size

Chris Wilson (1):
  drm/i915/guc: Use streaming loads to speed up dumping the guc log

John Harrison (4):
  drm/i915/guc: Add GuC <-> kernel time stamp translation information
  drm/i915/guc: Record CTB info in error logs
  drm/i915/selftest: Cope with not having an RCS engine
  drm/i915/guc: Don't abort on CTB_UNUSED status

Matthew Brost (4):
  drm/i915: Remove bogus GEM_BUG_ON in unpark
  drm/i915/guc: Don't call ring_is_idle in GuC submission
  drm/i915/guc: Fix issues with live_preempt_cancel
  drm/i915/guc: Support larger contexts on newer hardware

Michał Winiarski (1):
  drm/i915/guc: Route semaphores to GuC for Gen12+

Rahul Kumar Singh (1):
  drm/i915/guc: Add selftest for a hung GuC

 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  13 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   2 -
 drivers/gpu/drm/i915/gt/intel_gt_regs.h       |   2 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  16 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  12 +-
 .../gt/uc/abi/guc_communication_ctb_abi.h     |   8 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |  19 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   2 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  10 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |  18 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    |  75 ++++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h    |   4 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  15 ++
 .../drm/i915/gt/uc/selftest_guc_hangcheck.c   | 159 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_debugfs.c           |   6 +-
 drivers/gpu/drm/i915/i915_drv.h               |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  67 +++++++-
 drivers/gpu/drm/i915/i915_gpu_error.h         |  21 ++-
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 19 files changed, 393 insertions(+), 59 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c

-- 
2.36.0


* [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Matthew Brost, DRI-Devel

From: Matthew Brost <matthew.brost@intel.com>

Remove the bogus GEM_BUG_ON which compared the kernel context timeline
seqno to the seqno in memory (HWSP) on engine PM unpark. If a GT reset
occurred, these values might not match because a kernel context could be
skipped. This bug was hidden by always switching to a kernel context on
park (an execlists requirement).

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index b0a4a2dbe3ee9..fb3e1599d04ec 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
 			 ce->timeline->seqno,
 			 READ_ONCE(*ce->timeline->hwsp_seqno),
 			 ce->ring->emit);
-		GEM_BUG_ON(ce->timeline->seqno !=
-			   READ_ONCE(*ce->timeline->hwsp_seqno));
 	}
 
 	if (engine->unpark)
-- 
2.36.0


* [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Matthew Brost, DRI-Devel

From: Matthew Brost <matthew.brost@intel.com>

The engine registers really shouldn't be touched during GuC submission
as the GuC owns the registers. Don't call ring_is_idle and tie
intel_engine_is_idle strictly to the engine pm.

Because intel_engine_is_idle is now tied to the engine pm, retire
requests before checking intel_engines_are_idle in gt_drop_caches.
Lastly, increase the timeout used for that check in gt_drop_caches
(I915_IDLE_ENGINES_TIMEOUT) from 200 ms to 500 ms.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++
 drivers/gpu/drm/i915/i915_debugfs.c       |  6 +++---
 drivers/gpu/drm/i915/i915_drv.h           |  2 +-
 3 files changed, 17 insertions(+), 4 deletions(-)
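
As a rough illustration of the resulting decision order (not driver code),
the standalone model below may help. The names struct engine_model,
model_ring_is_idle and model_engine_is_idle are made up for this sketch, and
it assumes that a parked engine is reported idle before any of the checks
touched by this patch are reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the engine state; field names are illustrative. */
struct engine_model {
	bool pm_awake;		/* engine PM wakeref is held */
	bool queue_empty;	/* scheduler has no ready requests */
	bool uses_guc;		/* GuC submission owns the engine registers */
	bool ring_idle;		/* what an MMIO HEAD/TAIL check would report */
};

static bool model_ring_is_idle(const struct engine_model *e)
{
	/* Mirrors the new GEM_BUG_ON: never reached with GuC submission. */
	assert(!e->uses_guc);
	return e->ring_idle;
}

static bool model_engine_is_idle(const struct engine_model *e)
{
	if (!e->pm_awake)	/* assumed early exit: parked means idle */
		return true;
	if (!e->queue_empty)	/* ready requests are still queued */
		return false;
	if (e->uses_guc)	/* awake with GuC: report busy, no MMIO */
		return false;
	return model_ring_is_idle(e);
}
```

In this model an awake GuC engine is never idle, so a caller polling for
idleness only succeeds once requests have been retired and the engine has
parked.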

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 283870c659911..959a7c92e8f4d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
 {
 	bool idle = true;
 
+	/* GuC submission shouldn't access HEAD & TAIL via MMIO */
+	GEM_BUG_ON(intel_engine_uses_guc(engine));
+
 	if (I915_SELFTEST_ONLY(!engine->mmio_base))
 		return true;
 
@@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (!i915_sched_engine_is_empty(engine->sched_engine))
 		return false;
 
+	/*
+	 * We shouldn't touch engine registers with GuC submission as the GuC
+	 * owns the registers. Let's tie the idle to engine pm, at worst this
+	 * function sometimes will falsely report non-idle when idle during the
+	 * delay to retire requests or with virtual engines and a request
+	 * running on another instance within the same class / submit mask.
+	 */
+	if (intel_engine_uses_guc(engine))
+		return false;
+
 	/* Ring stopped? */
 	return ring_is_idle(engine);
 }
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 94e5c29d2ee3a..ee5334840e9cb 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
 {
 	int ret;
 
+	if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE)
+		intel_gt_retire_requests(gt);
+
 	if (val & DROP_RESET_ACTIVE &&
 	    wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT))
 		intel_gt_set_wedged(gt);
 
-	if (val & DROP_RETIRE)
-		intel_gt_retire_requests(gt);
-
 	if (val & (DROP_IDLE | DROP_ACTIVE)) {
 		ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c22f29c3faa0e..53c7474dde495 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -278,7 +278,7 @@ struct i915_gem_mm {
 	u32 shrink_count;
 };
 
-#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
+#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */
 
 unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915,
 					 u64 context);
-- 
2.36.0


* [PATCH 03/12] drm/i915/guc: Fix issues with live_preempt_cancel
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Matthew Brost, DRI-Devel

From: Matthew Brost <matthew.brost@intel.com>

Having semaphores results in different behavior when a dependent request
is cancelled. With semaphores, the request could already be on the HW and
complete successfully. Without semaphores, the request is held in the
driver and the error from the dependent request is propagated. Fix
live_preempt_cancel to take this behavior into account.

Also update live_preempt_cancel to use the new function intel_context_ban
rather than intel_context_set_banned.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_execlists.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)
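
The updated expectation in __cancel_queued can be read as a small truth
table. The sketch below is a standalone model of that check only; the helper
name inflight1_check_fails is hypothetical and does not exist in the
selftest.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the updated check on the request queued behind the
 * banned one. Returns true when the selftest would report a failure.
 */
static bool inflight1_check_fails(bool has_semaphores, int fence_error)
{
	/* With semaphores the request is on the hardware and not cancelled. */
	if (has_semaphores)
		return fence_error != 0;

	/* Without, it is held in the driver and a propagated error is fine. */
	return false;
}
```

Only the combination of semaphores plus a non-zero fence error is treated as
a failure, which matches the condition added by the diff.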

diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 02fc97a0ab502..015f8cd3463e2 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -2087,7 +2087,7 @@ static int __cancel_active0(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	intel_context_set_banned(rq->context);
+	intel_context_ban(rq->context, rq);
 	err = intel_engine_pulse(arg->engine);
 	if (err)
 		goto out;
@@ -2146,7 +2146,7 @@ static int __cancel_active1(struct live_preempt_cancel *arg)
 	if (err)
 		goto out;
 
-	intel_context_set_banned(rq[1]->context);
+	intel_context_ban(rq[1]->context, rq[1]);
 	err = intel_engine_pulse(arg->engine);
 	if (err)
 		goto out;
@@ -2229,7 +2229,7 @@ static int __cancel_queued(struct live_preempt_cancel *arg)
 	if (err)
 		goto out;
 
-	intel_context_set_banned(rq[2]->context);
+	intel_context_ban(rq[2]->context, rq[2]);
 	err = intel_engine_pulse(arg->engine);
 	if (err)
 		goto out;
@@ -2244,7 +2244,13 @@ static int __cancel_queued(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	if (rq[1]->fence.error != 0) {
+	/*
+	 * The behavior between having semaphores and not is different. With
+	 * semaphores the subsequent request is on the hardware and not cancelled
+	 * while without the request is held in the driver and cancelled.
+	 */
+	if (intel_engine_has_semaphores(rq[1]->engine) &&
+	    rq[1]->fence.error != 0) {
 		pr_err("Normal inflight1 request did not complete\n");
 		err = -EINVAL;
 		goto out;
@@ -2292,7 +2298,7 @@ static int __cancel_hostile(struct live_preempt_cancel *arg)
 		goto out;
 	}
 
-	intel_context_set_banned(rq->context);
+	intel_context_ban(rq->context, rq);
 	err = intel_engine_pulse(arg->engine); /* force reset */
 	if (err)
 		goto out;
-- 
2.36.0


* [PATCH 04/12] drm/i915/guc: Add GuC <-> kernel time stamp translation information
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

It is useful to be able to match GuC events to kernel events when
looking at the GuC log. That requires being able to convert GuC
timestamps to kernel time. So, when dumping error captures and/or GuC
logs, include a timestamp in both time domains plus the clock frequency.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h    |  2 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c     | 19 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc.h     |  2 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c |  2 ++
 drivers/gpu/drm/i915/i915_gpu_error.c      | 12 ++++++++++++
 drivers/gpu/drm/i915/i915_gpu_error.h      |  3 +++
 6 files changed, 40 insertions(+)
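
For anyone post-processing a dump, the three values might be combined along
the following lines. This is only a sketch under assumptions: the helper
name guc_stamp_to_boottime_ns is made up, the GuC timestamp is assumed to be
a free-running 32-bit counter ticking at the dumped "CS timestamp frequency",
and the event is assumed to lie within half a counter wrap of the reference
pair.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: translate a 32-bit GuC log
 * timestamp into boot-time nanoseconds using the reference pair
 * (ref_stamp, ref_ktime_ns) and the clock frequency taken from the dump.
 */
static int64_t guc_stamp_to_boottime_ns(uint32_t event_stamp,
					uint32_t ref_stamp,
					uint64_t ref_ktime_ns,
					uint32_t freq_hz)
{
	int32_t delta_ticks;
	int64_t delta_ns;

	assert(freq_hz != 0);

	/*
	 * Unsigned subtraction followed by a signed reinterpretation copes
	 * with counter wrap and with events older than the reference.
	 */
	delta_ticks = (int32_t)(event_stamp - ref_stamp);
	delta_ns = (int64_t)delta_ticks * 1000000000LL / (int64_t)freq_hz;

	return (int64_t)ref_ktime_ns + delta_ns;
}
```

A single reference pair is enough here because only the offset from the
reference is scaled. Events further than half a wrap away would need extra
information, such as a second reference or a wrap count.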

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index e6bb24dc7b998..b1258f7239b06 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1004,6 +1004,8 @@
 #define   GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC	(1 << 9)
 #define   GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC	(1 << 7)
 
+#define GUCPMTIMESTAMP				_MMIO(0xc3e8)
+
 #define __GEN9_RCS0_MOCS0			0xc800
 #define GEN9_GFX_MOCS(i)			_MMIO(__GEN9_RCS0_MOCS0 + (i) * 4)
 #define __GEN9_VCS0_MOCS0			0xc900
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 2706a8c650900..ab4aacc516aa4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc)
 	intel_uncore_forcewake_put(uncore, FORCEWAKE_GT);
 }
 
+void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p)
+{
+	struct intel_gt *gt = guc_to_gt(guc);
+	intel_wakeref_t wakeref;
+	u32 stamp = 0;
+	u64 ktime;
+
+	intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p);
+
+	with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref)
+		stamp = intel_uncore_read(gt->uncore, GUCPMTIMESTAMP);
+	ktime = ktime_get_boottime_ns();
+
+	drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", ktime, ktime);
+	drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", stamp, stamp);
+	drm_printf(p, "CS timestamp frequency: %u Hz, %u ns\n",
+		   gt->clock_frequency, gt->clock_period_ns);
+}
+
 int intel_guc_init(struct intel_guc *guc)
 {
 	struct intel_gt *gt = guc_to_gt(guc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index d0d99f178f2d4..111484372e6f8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -459,4 +459,6 @@ void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
 
 void intel_guc_write_barrier(struct intel_guc *guc);
 
+void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p);
+
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index 02311ad902641..45f62cdabe356 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -763,6 +763,8 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
 	if (!obj)
 		return 0;
 
+	intel_guc_dump_time_info(guc, p);
+
 	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(map)) {
 		DRM_DEBUG("Failed to pin object\n");
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 52ea13fee015e..5bcf36c292ebd 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -690,6 +690,7 @@ static void err_print_uc(struct drm_i915_error_state_buf *m,
 
 	intel_uc_fw_dump(&error_uc->guc_fw, &p);
 	intel_uc_fw_dump(&error_uc->huc_fw, &p);
+	err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp);
 	intel_gpu_error_print_vma(m, NULL, error_uc->guc_log);
 }
 
@@ -732,6 +733,8 @@ static void err_print_gt_global_nonguc(struct drm_i915_error_state_buf *m,
 	int i;
 
 	err_printf(m, "GT awake: %s\n", str_yes_no(gt->awake));
+	err_printf(m, "CS timestamp frequency: %u Hz, %u ns\n",
+		   gt->clock_frequency, gt->clock_period_ns);
 	err_printf(m, "EIR: 0x%08x\n", gt->eir);
 	err_printf(m, "PGTBL_ER: 0x%08x\n", gt->pgtbl_er);
 
@@ -1687,6 +1690,13 @@ gt_record_uc(struct intel_gt_coredump *gt,
 	 */
 	error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
 	error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
+
+	/*
+	 * Save the GuC log and include a timestamp reference for converting the
+	 * log times to system times (in conjunction with the error->boottime and
+	 * gt->clock_frequency fields saved elsewhere).
+	 */
+	error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP);
 	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
 						"GuC log buffer", compress);
 
@@ -1845,6 +1855,8 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt)
 static void gt_record_info(struct intel_gt_coredump *gt)
 {
 	memcpy(&gt->info, &gt->_gt->info, sizeof(struct intel_gt_info));
+	gt->clock_frequency = gt->_gt->clock_frequency;
+	gt->clock_period_ns = gt->_gt->clock_period_ns;
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 55a143b92d10e..d8a8b3d529e09 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -150,6 +150,8 @@ struct intel_gt_coredump {
 	u32 gtt_cache;
 	u32 aux_err; /* gen12 */
 	u32 gam_done; /* gen12 */
+	u32 clock_frequency;
+	u32 clock_period_ns;
 
 	/* Display related */
 	u32 derrmr;
@@ -164,6 +166,7 @@ struct intel_gt_coredump {
 		struct intel_uc_fw guc_fw;
 		struct intel_uc_fw huc_fw;
 		struct i915_vma_coredump *guc_log;
+		u32 timestamp;
 		bool is_guc_capture;
 	} *uc;
 
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [Intel-gfx] [PATCH 04/12] drm/i915/guc: Add GuC <-> kernel time stamp translation information
@ 2022-07-12 23:31   ` John.C.Harrison
  0 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

It is useful to be able to match GuC events to kernel events when
looking at the GuC log. That requires being able to convert GuC
timestamps to kernel time. So, when dumping error captures and/or GuC
logs, include a stamp in both time zones plus the clock frequency.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h    |  2 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c     | 19 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc.h     |  2 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c |  2 ++
 drivers/gpu/drm/i915/i915_gpu_error.c      | 12 ++++++++++++
 drivers/gpu/drm/i915/i915_gpu_error.h      |  3 +++
 6 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index e6bb24dc7b998..b1258f7239b06 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1004,6 +1004,8 @@
 #define   GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC	(1 << 9)
 #define   GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC	(1 << 7)
 
+#define GUCPMTIMESTAMP				_MMIO(0xc3e8)
+
 #define __GEN9_RCS0_MOCS0			0xc800
 #define GEN9_GFX_MOCS(i)			_MMIO(__GEN9_RCS0_MOCS0 + (i) * 4)
 #define __GEN9_VCS0_MOCS0			0xc900
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 2706a8c650900..ab4aacc516aa4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc)
 	intel_uncore_forcewake_put(uncore, FORCEWAKE_GT);
 }
 
+void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p)
+{
+	struct intel_gt *gt = guc_to_gt(guc);
+	intel_wakeref_t wakeref;
+	u32 stamp = 0;
+	u64 ktime;
+
+	intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p);
+
+	with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref)
+		stamp = intel_uncore_read(gt->uncore, GUCPMTIMESTAMP);
+	ktime = ktime_get_boottime_ns();
+
+	drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", ktime, ktime);
+	drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", stamp, stamp);
+	drm_printf(p, "CS timestamp frequency: %u Hz, %u ns\n",
+		   gt->clock_frequency, gt->clock_period_ns);
+}
+
 int intel_guc_init(struct intel_guc *guc)
 {
 	struct intel_gt *gt = guc_to_gt(guc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index d0d99f178f2d4..111484372e6f8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -459,4 +459,6 @@ void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
 
 void intel_guc_write_barrier(struct intel_guc *guc);
 
+void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p);
+
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index 02311ad902641..45f62cdabe356 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -763,6 +763,8 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
 	if (!obj)
 		return 0;
 
+	intel_guc_dump_time_info(guc, p);
+
 	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(map)) {
 		DRM_DEBUG("Failed to pin object\n");
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 52ea13fee015e..5bcf36c292ebd 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -690,6 +690,7 @@ static void err_print_uc(struct drm_i915_error_state_buf *m,
 
 	intel_uc_fw_dump(&error_uc->guc_fw, &p);
 	intel_uc_fw_dump(&error_uc->huc_fw, &p);
+	err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp);
 	intel_gpu_error_print_vma(m, NULL, error_uc->guc_log);
 }
 
@@ -732,6 +733,8 @@ static void err_print_gt_global_nonguc(struct drm_i915_error_state_buf *m,
 	int i;
 
 	err_printf(m, "GT awake: %s\n", str_yes_no(gt->awake));
+	err_printf(m, "CS timestamp frequency: %u Hz, %d ns\n",
+		   gt->clock_frequency, gt->clock_period_ns);
 	err_printf(m, "EIR: 0x%08x\n", gt->eir);
 	err_printf(m, "PGTBL_ER: 0x%08x\n", gt->pgtbl_er);
 
@@ -1687,6 +1690,13 @@ gt_record_uc(struct intel_gt_coredump *gt,
 	 */
 	error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL);
 	error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL);
+
+	/*
+	 * Save the GuC log and include a timestamp reference for converting the
+	 * log times to system times (in conjunction with the error->boottime and
+	 * gt->clock_frequency fields saved elsewhere).
+	 */
+	error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP);
 	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
 						"GuC log buffer", compress);
 
@@ -1845,6 +1855,8 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt)
 static void gt_record_info(struct intel_gt_coredump *gt)
 {
 	memcpy(&gt->info, &gt->_gt->info, sizeof(struct intel_gt_info));
+	gt->clock_frequency = gt->_gt->clock_frequency;
+	gt->clock_period_ns = gt->_gt->clock_period_ns;
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 55a143b92d10e..d8a8b3d529e09 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -150,6 +150,8 @@ struct intel_gt_coredump {
 	u32 gtt_cache;
 	u32 aux_err; /* gen12 */
 	u32 gam_done; /* gen12 */
+	u32 clock_frequency;
+	u32 clock_period_ns;
 
 	/* Display related */
 	u32 derrmr;
@@ -164,6 +166,7 @@ struct intel_gt_coredump {
 		struct intel_uc_fw guc_fw;
 		struct intel_uc_fw huc_fw;
 		struct i915_vma_coredump *guc_log;
+		u32 timestamp;
 		bool is_guc_capture;
 	} *uc;
 
-- 
2.36.0
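
For readers decoding a captured error state, the fields this patch adds can be combined roughly as shown here. This is a minimal userspace sketch, not driver code: ticks_before() and ticks_to_ns() are hypothetical helper names, and the sketch assumes that GuC log stamps and the captured "GuC timestamp" both tick at the printed "CS timestamp frequency", which may not hold on every platform.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of ticks by which 'stamp' precedes the
 * reference value 'ref'. Unsigned subtraction handles a single 32-bit
 * counter wrap between the two samples.
 */
static uint32_t ticks_before(uint32_t ref, uint32_t stamp)
{
	return ref - stamp;
}

/*
 * Hypothetical helper: convert a tick count to nanoseconds for a clock
 * running at freq_hz. The 64-bit intermediate cannot overflow because
 * 2^32 * 10^9 is below 2^64.
 */
static uint64_t ticks_to_ns(uint32_t ticks, uint32_t freq_hz)
{
	assert(freq_hz != 0);
	return (uint64_t)ticks * 1000000000ull / freq_hz;
}
```

Under those assumptions, a log entry stamped 192 ticks before the reference on a 19.2 MHz clock lies 10000 ns before the capture. Subtracting that offset from the capture time gives an approximate system time for the entry.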



* [PATCH 05/12] drm/i915/guc: Record CTB info in error logs
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

When debugging GuC communication issues, it is useful to have the CTB
info available. So add the state and buffer contents to the error
capture log.

Also, add a sub-structure for the GuC-specific error capture info, as
there are now quite a few such fields.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 59 +++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gpu_error.h | 20 +++++++--
 2 files changed, 67 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 5bcf36c292ebd..0e43b8dd22cf7 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -683,6 +683,18 @@ static void err_print_pciid(struct drm_i915_error_state_buf *m,
 		   pdev->subsystem_device);
 }
 
+static void err_print_guc_ctb(struct drm_i915_error_state_buf *m,
+			      const char *name,
+			      const struct intel_ctb_coredump *ctb)
+{
+	if (!ctb->size)
+		return;
+
+	err_printf(m, "GuC %s CTB: raw: 0x%08X, 0x%08X/%08X, cached: 0x%08X/%08X, desc = 0x%08X, buf = 0x%08X x 0x%08X\n",
+		   name, ctb->raw_status, ctb->raw_head, ctb->raw_tail,
+		   ctb->head, ctb->tail, ctb->desc_offset, ctb->cmds_offset, ctb->size);
+}
+
 static void err_print_uc(struct drm_i915_error_state_buf *m,
 			 const struct intel_uc_coredump *error_uc)
 {
@@ -690,8 +702,12 @@ static void err_print_uc(struct drm_i915_error_state_buf *m,
 
 	intel_uc_fw_dump(&error_uc->guc_fw, &p);
 	intel_uc_fw_dump(&error_uc->huc_fw, &p);
-	err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp);
-	intel_gpu_error_print_vma(m, NULL, error_uc->guc_log);
+	err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->guc.timestamp);
+	intel_gpu_error_print_vma(m, NULL, error_uc->guc.vma_log);
+	err_printf(m, "GuC CTB fence: %d\n", error_uc->guc.last_fence);
+	err_print_guc_ctb(m, "Send", error_uc->guc.ctb + 0);
+	err_print_guc_ctb(m, "Recv", error_uc->guc.ctb + 1);
+	intel_gpu_error_print_vma(m, NULL, error_uc->guc.vma_ctb);
 }
 
 static void err_free_sgl(struct scatterlist *sgl)
@@ -866,7 +882,7 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 	if (error->gt) {
 		bool print_guc_capture = false;
 
-		if (error->gt->uc && error->gt->uc->is_guc_capture)
+		if (error->gt->uc && error->gt->uc->guc.is_guc_capture)
 			print_guc_capture = true;
 
 		err_print_gt_display(m, error->gt);
@@ -1021,7 +1037,8 @@ static void cleanup_uc(struct intel_uc_coredump *uc)
 {
 	kfree(uc->guc_fw.path);
 	kfree(uc->huc_fw.path);
-	i915_vma_coredump_free(uc->guc_log);
+	i915_vma_coredump_free(uc->guc.vma_log);
+	i915_vma_coredump_free(uc->guc.vma_ctb);
 
 	kfree(uc);
 }
@@ -1670,6 +1687,23 @@ gt_record_engines(struct intel_gt_coredump *gt,
 	}
 }
 
+static void gt_record_guc_ctb(struct intel_ctb_coredump *saved,
+			      const struct intel_guc_ct_buffer *ctb,
+			      const void *blob_ptr, struct intel_guc *guc)
+{
+	if (!ctb || !ctb->desc)
+		return;
+
+	saved->raw_status = ctb->desc->status;
+	saved->raw_head = ctb->desc->head;
+	saved->raw_tail = ctb->desc->tail;
+	saved->head = ctb->head;
+	saved->tail = ctb->tail;
+	saved->size = ctb->size;
+	saved->desc_offset = ((void *)ctb->desc) - blob_ptr;
+	saved->cmds_offset = ((void *)ctb->cmds) - blob_ptr;
+}
+
 static struct intel_uc_coredump *
 gt_record_uc(struct intel_gt_coredump *gt,
 	     struct i915_vma_compress *compress)
@@ -1696,9 +1730,16 @@ gt_record_uc(struct intel_gt_coredump *gt,
 	 * log times to system times (in conjunction with the error->boottime and
 	 * gt->clock_frequency fields saved elsewhere).
 	 */
-	error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP);
-	error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
-						"GuC log buffer", compress);
+	error_uc->guc.timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP);
+	error_uc->guc.vma_log = create_vma_coredump(gt->_gt, uc->guc.log.vma,
+						    "GuC log buffer", compress);
+	error_uc->guc.vma_ctb = create_vma_coredump(gt->_gt, uc->guc.ct.vma,
+						    "GuC CT buffer", compress);
+	error_uc->guc.last_fence = uc->guc.ct.requests.last_fence;
+	gt_record_guc_ctb(error_uc->guc.ctb + 0, &uc->guc.ct.ctbs.send,
+			  uc->guc.ct.ctbs.send.desc, (struct intel_guc *)&uc->guc);
+	gt_record_guc_ctb(error_uc->guc.ctb + 1, &uc->guc.ct.ctbs.recv,
+			  uc->guc.ct.ctbs.send.desc, (struct intel_guc *)&uc->guc);
 
 	return error_uc;
 }
@@ -2051,9 +2092,9 @@ __i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask, u32 du
 			error->gt->uc = gt_record_uc(error->gt, compress);
 			if (error->gt->uc) {
 				if (dump_flags & CORE_DUMP_FLAG_IS_GUC_CAPTURE)
-					error->gt->uc->is_guc_capture = true;
+					error->gt->uc->guc.is_guc_capture = true;
 				else
-					GEM_BUG_ON(error->gt->uc->is_guc_capture);
+					GEM_BUG_ON(error->gt->uc->guc.is_guc_capture);
 			}
 		}
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index d8a8b3d529e09..efc75cc2ffdb9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -125,6 +125,15 @@ struct intel_engine_coredump {
 	struct intel_engine_coredump *next;
 };
 
+struct intel_ctb_coredump {
+	u32 raw_head, head;
+	u32 raw_tail, tail;
+	u32 raw_status;
+	u32 desc_offset;
+	u32 cmds_offset;
+	u32 size;
+};
+
 struct intel_gt_coredump {
 	const struct intel_gt *_gt;
 	bool awake;
@@ -165,9 +174,14 @@ struct intel_gt_coredump {
 	struct intel_uc_coredump {
 		struct intel_uc_fw guc_fw;
 		struct intel_uc_fw huc_fw;
-		struct i915_vma_coredump *guc_log;
-		u32 timestamp;
-		bool is_guc_capture;
+		struct guc_info {
+			struct intel_ctb_coredump ctb[2];
+			struct i915_vma_coredump *vma_ctb;
+			struct i915_vma_coredump *vma_log;
+			u32 timestamp;
+			u16 last_fence;
+			bool is_guc_capture;
+		} guc;
 	} *uc;
 
 	struct intel_gt_coredump *next;
-- 
2.36.0
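
The per-CTB line that err_print_guc_ctb() emits has a fixed layout, so a post-processing tool can pick it apart with sscanf(). The following is a minimal userspace sketch under that assumption. struct ctb_line, parse_ctb_line() and ctb_pending() are hypothetical names that are not part of i915, and the occupancy calculation assumes head, tail and size are all expressed in the same unit.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one "GuC <name> CTB: ..." line. */
struct ctb_line {
	char name[16];
	unsigned int raw_status, raw_head, raw_tail;
	unsigned int head, tail;
	unsigned int desc_offset, cmds_offset, size;
};

/* Returns 0 if all nine fields were parsed, -1 otherwise. */
static int parse_ctb_line(const char *line, struct ctb_line *out)
{
	int n = sscanf(line,
		       "GuC %15s CTB: raw: 0x%x, 0x%x/%x, cached: 0x%x/%x, "
		       "desc = 0x%x, buf = 0x%x x 0x%x",
		       out->name, &out->raw_status,
		       &out->raw_head, &out->raw_tail,
		       &out->head, &out->tail,
		       &out->desc_offset, &out->cmds_offset, &out->size);

	return n == 9 ? 0 : -1;
}

/* Entries between the cached head and tail of the ring, with wrap. */
static unsigned int ctb_pending(const struct ctb_line *c)
{
	assert(c->size != 0);
	return (c->tail + c->size - c->head) % c->size;
}
```

Comparing the raw head/tail (read from the shared descriptor) with the cached pair is one way to spot a side that has stopped consuming messages.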



* [PATCH 06/12] drm/i915/guc: Use streaming loads to speed up dumping the guc log
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Chris Wilson, DRI-Devel

From: Chris Wilson <chris.p.wilson@intel.com>

Use a temporary page and i915_memcpy_from_wc() to reduce the time it takes to
dump the GuC log to debugfs.

Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 24 ++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index 45f62cdabe356..ff091adb56096 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -749,8 +749,9 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
 	struct intel_guc *guc = log_to_guc(log);
 	struct intel_uc *uc = container_of(guc, struct intel_uc, guc);
 	struct drm_i915_gem_object *obj = NULL;
-	u32 *map;
-	int i = 0;
+	void *map;
+	u32 *page;
+	int i, j;
 
 	if (!intel_guc_is_supported(guc))
 		return -ENODEV;
@@ -763,23 +764,34 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
 	if (!obj)
 		return 0;
 
+	page = (u32 *)__get_free_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
+
 	intel_guc_dump_time_info(guc, p);
 
 	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(map)) {
 		DRM_DEBUG("Failed to pin object\n");
 		drm_puts(p, "(log data unaccessible)\n");
+		free_page((unsigned long)page);
 		return PTR_ERR(map);
 	}
 
-	for (i = 0; i < obj->base.size / sizeof(u32); i += 4)
-		drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n",
-			   *(map + i), *(map + i + 1),
-			   *(map + i + 2), *(map + i + 3));
+	for (i = 0; i < obj->base.size; i += PAGE_SIZE) {
+		if (!i915_memcpy_from_wc(page, map + i, PAGE_SIZE))
+			memcpy(page, map + i, PAGE_SIZE);
+
+		for (j = 0; j < PAGE_SIZE / sizeof(u32); j += 4)
+			drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n",
+				   *(page + j + 0), *(page + j + 1),
+				   *(page + j + 2), *(page + j + 3));
+	}
 
 	drm_puts(p, "\n");
 
 	i915_gem_object_unpin_map(obj);
+	free_page((unsigned long)page);
 
 	return 0;
 }
-- 
2.36.0
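
The shape of the new loop (copy one page into a bounce buffer, fall back to plain memcpy() when the fast path declines, then format from the copy) can be sketched in userspace as follows. This is an illustration only: fast_copy() is a hypothetical stand-in for i915_memcpy_from_wc() that always reports failure, CHUNK stands in for PAGE_SIZE, and dump_words() is not a driver function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 4096 /* stand-in for PAGE_SIZE */

/* Hypothetical fast path; returns false when it cannot be used. */
static bool fast_copy(void *dst, const void *src, size_t len)
{
	(void)dst;
	(void)src;
	(void)len;
	return false; /* pretend streaming loads are unavailable */
}

/* Dump 'size' bytes as lines of four words; returns the line count. */
static long dump_words(FILE *f, const void *src, size_t size)
{
	static uint32_t bounce[CHUNK / sizeof(uint32_t)];
	const unsigned char *p = src;
	long lines = 0;
	size_t i, j;

	assert(size % CHUNK == 0);

	for (i = 0; i < size; i += CHUNK) {
		/* Stage one chunk in ordinary memory before formatting. */
		if (!fast_copy(bounce, p + i, CHUNK))
			memcpy(bounce, p + i, CHUNK);

		for (j = 0; j < CHUNK / sizeof(uint32_t); j += 4) {
			fprintf(f, "0x%08" PRIx32 " 0x%08" PRIx32
				" 0x%08" PRIx32 " 0x%08" PRIx32 "\n",
				bounce[j + 0], bounce[j + 1],
				bounce[j + 2], bounce[j + 3]);
			lines++;
		}
	}

	return lines;
}
```

The point of the staging copy is that every word is read from the slow mapping once, in bulk, rather than word by word from inside the formatting loop.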



* [PATCH 07/12] drm/i915/guc: Route semaphores to GuC for Gen12+
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Michał Winiarski, DRI-Devel

From: Michał Winiarski <michal.winiarski@intel.com>

Since we're going to use semaphores in selftests (and eventually in
regular GuC submission), let's route semaphores to GuC.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h        |  4 ++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 14 ++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
index 8dc063f087eb1..a7092f711e9cd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
@@ -102,6 +102,10 @@
 #define   GUC_SEND_TRIGGER		  (1<<0)
 #define GEN11_GUC_HOST_INTERRUPT	_MMIO(0x1901f0)
 
+#define GEN12_GUC_SEM_INTR_ENABLES	_MMIO(0xc71c)
+#define   GUC_SEM_INTR_ROUTE_TO_GUC	BIT(31)
+#define   GUC_SEM_INTR_ENABLE_ALL	(0xff)
+
 #define GUC_NUM_DOORBELLS		256
 
 /* format of the HW-monitored doorbell cacheline */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 40f726c61e951..7537459080278 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3953,13 +3953,27 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine)
 
 void intel_guc_submission_enable(struct intel_guc *guc)
 {
+	struct intel_gt *gt = guc_to_gt(guc);
+
+	/* Enable and route to GuC */
+	if (GRAPHICS_VER(gt->i915) >= 12)
+		intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES,
+				   GUC_SEM_INTR_ROUTE_TO_GUC |
+				   GUC_SEM_INTR_ENABLE_ALL);
+
 	guc_init_lrc_mapping(guc);
 	guc_init_engine_stats(guc);
 }
 
 void intel_guc_submission_disable(struct intel_guc *guc)
 {
+	struct intel_gt *gt = guc_to_gt(guc);
+
 	/* Note: By the time we're here, GuC may have already been reset */
+
+	/* Disable and route to host */
+	if (GRAPHICS_VER(gt->i915) >= 12)
+		intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES, 0x0);
 }
 
 static bool __guc_submission_supported(struct intel_guc *guc)
-- 
2.36.0



* [PATCH 08/12] drm/i915/guc: Add selftest for a hung GuC
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel, Rahul Kumar Singh

From: Rahul Kumar Singh <rahul.kumar.singh@intel.com>

Add a test to check that the hangcheck will recover from a submission
hang in the GuC.

Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com>
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   1 +
 .../drm/i915/gt/uc/selftest_guc_hangcheck.c   | 159 ++++++++++++++++++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 3 files changed, 161 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7537459080278..72832a4f4bac7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4937,4 +4937,5 @@ bool intel_guc_virtual_engine_has_heartbeat(const struct intel_engine_cs *ve)
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_guc.c"
 #include "selftest_guc_multi_lrc.c"
+#include "selftest_guc_hangcheck.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c b/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c
new file mode 100644
index 0000000000000..af913c4b09d37
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "selftests/igt_spinner.h"
+#include "selftests/igt_reset.h"
+#include "selftests/intel_scheduler_helpers.h"
+#include "gt/intel_engine_heartbeat.h"
+#include "gem/selftests/mock_context.h"
+
+#define BEAT_INTERVAL	100
+
+static struct i915_request *nop_request(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = intel_engine_create_kernel_request(engine);
+	if (IS_ERR(rq))
+		return rq;
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+
+	return rq;
+}
+
+static int intel_hang_guc(void *arg)
+{
+	struct intel_gt *gt = arg;
+	int ret = 0;
+	struct i915_gem_context *ctx;
+	struct intel_context *ce;
+	struct igt_spinner spin;
+	struct i915_request *rq;
+	intel_wakeref_t wakeref;
+	struct i915_gpu_error *global = &gt->i915->gpu_error;
+	struct intel_engine_cs *engine;
+	unsigned int reset_count;
+	u32 guc_status;
+	u32 old_beat;
+
+	ctx = kernel_context(gt->i915, NULL);
+	if (IS_ERR(ctx)) {
+		pr_err("Failed to get kernel context: %ld\n", PTR_ERR(ctx));
+		return PTR_ERR(ctx);
+	}
+
+	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
+
+	ce = intel_context_create(gt->engine[BCS0]);
+	if (IS_ERR(ce)) {
+		ret = PTR_ERR(ce);
+		pr_err("Failed to create context: %d\n", ret);
+		goto err;
+	}
+
+	engine = ce->engine;
+	reset_count = i915_reset_count(global);
+
+	old_beat = engine->props.heartbeat_interval_ms;
+	ret = intel_engine_set_heartbeat(engine, BEAT_INTERVAL);
+	if (ret) {
+		pr_err("Failed to boost heartbeat interval: %d\n", ret);
+		goto err;
+	}
+
+	ret = igt_spinner_init(&spin, engine->gt);
+	if (ret) {
+		pr_err("Failed to create spinner: %d\n", ret);
+		goto err;
+	}
+
+	rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+	intel_context_put(ce);
+	if (IS_ERR(rq)) {
+		ret = PTR_ERR(rq);
+		pr_err("Failed to create spinner request: %d\n", ret);
+		goto err_spin;
+	}
+
+	ret = request_add_spin(rq, &spin);
+	if (ret) {
+		i915_request_put(rq);
+		pr_err("Failed to add spinner request: %d\n", ret);
+		goto err_spin;
+	}
+
+	ret = intel_reset_guc(gt);
+	if (ret) {
+		i915_request_put(rq);
+		pr_err("Failed to reset GuC, ret = %d\n", ret);
+		goto err_spin;
+	}
+
+	guc_status = intel_uncore_read(gt->uncore, GUC_STATUS);
+	if (!(guc_status & GS_MIA_IN_RESET)) {
+		i915_request_put(rq);
+		pr_err("GuC failed to reset: status = 0x%08X\n", guc_status);
+		ret = -EIO;
+		goto err_spin;
+	}
+
+	/* Wait for the heartbeat to cause a reset */
+	ret = intel_selftest_wait_for_rq(rq);
+	i915_request_put(rq);
+	if (ret) {
+		pr_err("Request failed to complete: %d\n", ret);
+		goto err_spin;
+	}
+
+	if (i915_reset_count(global) == reset_count) {
+		pr_err("Failed to record a GPU reset\n");
+		ret = -EINVAL;
+		goto err_spin;
+	}
+
+err_spin:
+	igt_spinner_end(&spin);
+	igt_spinner_fini(&spin);
+	intel_engine_set_heartbeat(engine, old_beat);
+
+	if (ret == 0) {
+		rq = nop_request(engine);
+		if (IS_ERR(rq)) {
+			ret = PTR_ERR(rq);
+			goto err;
+		}
+
+		ret = intel_selftest_wait_for_rq(rq);
+		i915_request_put(rq);
+		if (ret) {
+			pr_err("No-op failed to complete: %d\n", ret);
+			goto err;
+		}
+	}
+
+err:
+	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
+	kernel_context_close(ctx);
+
+	return ret;
+}
+
+int intel_guc_hang_check(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(intel_hang_guc),
+	};
+	struct intel_gt *gt = to_gt(i915);
+
+	if (intel_gt_is_wedged(gt))
+		return 0;
+
+	if (!intel_uc_uses_guc_submission(&gt->uc))
+		return 0;
+
+	return intel_gt_live_subtests(tests, gt);
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index bdd290f2bf3cd..aaf8a380e5c78 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -49,5 +49,6 @@ selftest(perf, i915_perf_live_selftests)
 selftest(slpc, intel_slpc_live_selftests)
 selftest(guc, intel_guc_live_selftests)
 selftest(guc_multi_lrc, intel_guc_multi_lrc_live_selftests)
+selftest(guc_hang, intel_guc_hang_check)
 /* Here be dragons: keep last to run last! */
 selftest(late_gt_pm, intel_gt_pm_late_selftests)
-- 
2.36.0



* [Intel-gfx] [PATCH 08/12] drm/i915/guc: Add selftest for a hung GuC
@ 2022-07-12 23:31   ` John.C.Harrison
  0 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel, Rahul Kumar Singh

From: Rahul Kumar Singh <rahul.kumar.singh@intel.com>

Add a test to check that the hangcheck will recover from a submission
hang in the GuC.

Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com>
---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   1 +
 .../drm/i915/gt/uc/selftest_guc_hangcheck.c   | 159 ++++++++++++++++++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 3 files changed, 161 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7537459080278..72832a4f4bac7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4937,4 +4937,5 @@ bool intel_guc_virtual_engine_has_heartbeat(const struct intel_engine_cs *ve)
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_guc.c"
 #include "selftest_guc_multi_lrc.c"
+#include "selftest_guc_hangcheck.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c b/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c
new file mode 100644
index 0000000000000..af913c4b09d37
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/selftest_guc_hangcheck.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "selftests/igt_spinner.h"
+#include "selftests/igt_reset.h"
+#include "selftests/intel_scheduler_helpers.h"
+#include "gt/intel_engine_heartbeat.h"
+#include "gem/selftests/mock_context.h"
+
+#define BEAT_INTERVAL	100
+
+static struct i915_request *nop_request(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	rq = intel_engine_create_kernel_request(engine);
+	if (IS_ERR(rq))
+		return rq;
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+
+	return rq;
+}
+
+static int intel_hang_guc(void *arg)
+{
+	struct intel_gt *gt = arg;
+	int ret = 0;
+	struct i915_gem_context *ctx;
+	struct intel_context *ce;
+	struct igt_spinner spin;
+	struct i915_request *rq;
+	intel_wakeref_t wakeref;
+	struct i915_gpu_error *global = &gt->i915->gpu_error;
+	struct intel_engine_cs *engine;
+	unsigned int reset_count;
+	u32 guc_status;
+	u32 old_beat;
+
+	ctx = kernel_context(gt->i915, NULL);
+	if (IS_ERR(ctx)) {
+		pr_err("Failed to get kernel context: %ld\n", PTR_ERR(ctx));
+		return PTR_ERR(ctx);
+	}
+
+	wakeref = intel_runtime_pm_get(gt->uncore->rpm);
+
+	ce = intel_context_create(gt->engine[BCS0]);
+	if (IS_ERR(ce)) {
+		ret = PTR_ERR(ce);
+		pr_err("Failed to create context: %d\n", ret);
+		goto err;
+	}
+
+	engine = ce->engine;
+	reset_count = i915_reset_count(global);
+
+	old_beat = engine->props.heartbeat_interval_ms;
+	ret = intel_engine_set_heartbeat(engine, BEAT_INTERVAL);
+	if (ret) {
+		pr_err("Failed to boost heartbeat interval: %d\n", ret);
+		goto err;
+	}
+
+	ret = igt_spinner_init(&spin, engine->gt);
+	if (ret) {
+		pr_err("Failed to create spinner: %d\n", ret);
+		goto err;
+	}
+
+	rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+	intel_context_put(ce);
+	if (IS_ERR(rq)) {
+		ret = PTR_ERR(rq);
+		pr_err("Failed to create spinner request: %d\n", ret);
+		goto err_spin;
+	}
+
+	ret = request_add_spin(rq, &spin);
+	if (ret) {
+		i915_request_put(rq);
+		pr_err("Failed to add Spinner request: %d\n", ret);
+		goto err_spin;
+	}
+
+	ret = intel_reset_guc(gt);
+	if (ret) {
+		i915_request_put(rq);
+		pr_err("Failed to reset GuC, ret = %d\n", ret);
+		goto err_spin;
+	}
+
+	guc_status = intel_uncore_read(gt->uncore, GUC_STATUS);
+	if (!(guc_status & GS_MIA_IN_RESET)) {
+		i915_request_put(rq);
+		pr_err("GuC failed to reset: status = 0x%08X\n", guc_status);
+		ret = -EIO;
+		goto err_spin;
+	}
+
+	/* Wait for the heartbeat to cause a reset */
+	ret = intel_selftest_wait_for_rq(rq);
+	i915_request_put(rq);
+	if (ret) {
+		pr_err("Request failed to complete: %d\n", ret);
+		goto err_spin;
+	}
+
+	if (i915_reset_count(global) == reset_count) {
+		pr_err("Failed to record a GPU reset\n");
+		ret = -EINVAL;
+		goto err_spin;
+	}
+
+err_spin:
+	igt_spinner_end(&spin);
+	igt_spinner_fini(&spin);
+	intel_engine_set_heartbeat(engine, old_beat);
+
+	if (ret == 0) {
+		rq = nop_request(engine);
+		if (IS_ERR(rq)) {
+			ret = PTR_ERR(rq);
+			goto err;
+		}
+
+		ret = intel_selftest_wait_for_rq(rq);
+		i915_request_put(rq);
+		if (ret) {
+			pr_err("No-op failed to complete: %d\n", ret);
+			goto err;
+		}
+	}
+
+err:
+	intel_runtime_pm_put(gt->uncore->rpm, wakeref);
+	kernel_context_close(ctx);
+
+	return ret;
+}
+
+int intel_guc_hang_check(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(intel_hang_guc),
+	};
+	struct intel_gt *gt = to_gt(i915);
+
+	if (intel_gt_is_wedged(gt))
+		return 0;
+
+	if (!intel_uc_uses_guc_submission(&gt->uc))
+		return 0;
+
+	return intel_gt_live_subtests(tests, gt);
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index bdd290f2bf3cd..aaf8a380e5c78 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -49,5 +49,6 @@ selftest(perf, i915_perf_live_selftests)
 selftest(slpc, intel_slpc_live_selftests)
 selftest(guc, intel_guc_live_selftests)
 selftest(guc_multi_lrc, intel_guc_multi_lrc_live_selftests)
+selftest(guc_hang, intel_guc_hang_check)
 /* Here be dragons: keep last to run last! */
 selftest(late_gt_pm, intel_gt_pm_late_selftests)
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 09/12] drm/i915/selftest: Cope with not having an RCS engine
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

It is no longer guaranteed that there will always be an RCS engine.
So, use the helper function for finding the first available engine that
can be used for general purpose selftests.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 6493265d5f642..7f3bb1d34dfbf 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -1302,13 +1302,15 @@ static int igt_reset_wait(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct i915_gpu_error *global = &gt->i915->gpu_error;
-	struct intel_engine_cs *engine = gt->engine[RCS0];
+	struct intel_engine_cs *engine;
 	struct i915_request *rq;
 	unsigned int reset_count;
 	struct hang h;
 	long timeout;
 	int err;
 
+	engine = intel_selftest_find_any_engine(gt);
+
 	if (!engine || !intel_engine_can_store_dword(engine))
 		return 0;
 
@@ -1432,7 +1434,7 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
 				 int (*fn)(void *),
 				 unsigned int flags)
 {
-	struct intel_engine_cs *engine = gt->engine[RCS0];
+	struct intel_engine_cs *engine;
 	struct drm_i915_gem_object *obj;
 	struct task_struct *tsk = NULL;
 	struct i915_request *rq;
@@ -1444,6 +1446,8 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
 	if (!gt->ggtt->num_fences && flags & EXEC_OBJECT_NEEDS_FENCE)
 		return 0;
 
+	engine = intel_selftest_find_any_engine(gt);
+
 	if (!engine || !intel_engine_can_store_dword(engine))
 		return 0;
 
@@ -1819,12 +1823,14 @@ static int igt_handle_error(void *arg)
 {
 	struct intel_gt *gt = arg;
 	struct i915_gpu_error *global = &gt->i915->gpu_error;
-	struct intel_engine_cs *engine = gt->engine[RCS0];
+	struct intel_engine_cs *engine;
 	struct hang h;
 	struct i915_request *rq;
 	struct i915_gpu_coredump *error;
 	int err;
 
+	engine = intel_selftest_find_any_engine(gt);
+
 	/* Check that we can issue a global GPU and engine reset */
 
 	if (!intel_has_reset_engine(gt))
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread
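
The engine lookup that patch 09 above switches to can be sketched as a
standalone model. This is only an illustration: the types and the
find_any_engine() name below are hypothetical stand-ins, not the i915
definitions, and it assumes the real helper simply returns the first
engine that is present, as the commit message describes.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's engine types. */
enum engine_id { RCS0, BCS0, VCS0, VECS0, CCS0, NUM_ENGINES };

struct engine {
	enum engine_id id;
};

struct gt {
	/* A slot is NULL when that engine does not exist on this GT. */
	struct engine *engine[NUM_ENGINES];
};

/* Return the first engine that is present, or NULL if there is none. */
static struct engine *find_any_engine(struct gt *gt)
{
	for (size_t i = 0; i < NUM_ENGINES; i++) {
		if (gt->engine[i])
			return gt->engine[i];
	}

	return NULL;
}
```

With this shape, a caller that used to index engine[RCS0] directly keeps
its existing "!engine" early return and still runs on parts that have no
render engine.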

* [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Matthew Brost, DRI-Devel

From: Matthew Brost <matthew.brost@intel.com>

The GuC needs a copy of a golden context for implementing watchdog
resets (aka media resets). This context is larger on newer platforms.
So adjust the size being allocated/copied accordingly.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index ba7541f3ca610..74cbe8eaf5318 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -464,7 +464,11 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
 }
 
 #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
-#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
+#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32))
+#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50) ? \
+				    XEHP_LR_HW_CONTEXT_SIZE : \
+				    LR_HW_CONTEXT_SIZE)
+#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SZ(i915))
 static int guc_prep_golden_context(struct intel_guc *guc)
 {
 	struct intel_gt *gt = guc_to_gt(guc);
@@ -525,7 +529,7 @@ static int guc_prep_golden_context(struct intel_guc *guc)
 		 * on all engines).
 		 */
 		ads_blob_write(guc, ads.eng_state_size[guc_class],
-			       real_size - LRC_SKIP_SIZE);
+			       real_size - LRC_SKIP_SIZE(gt->i915));
 		ads_blob_write(guc, ads.golden_context_lrca[guc_class],
 			       addr_ggtt);
 
@@ -599,7 +603,7 @@ static void guc_init_golden_context(struct intel_guc *guc)
 		}
 
 		GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
-			   real_size - LRC_SKIP_SIZE);
+			   real_size - LRC_SKIP_SIZE(gt->i915));
 		GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt);
 
 		addr_ggtt += alloc_size;
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread
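
As a worked example of the size arithmetic in patch 10 above, the
following standalone sketch recomputes the skip size for both cases.
The page size of 4096 bytes and an LRC_PPHWSP_SZ of one page are
assumptions made for illustration (the driver takes both from its own
headers), and the function names are hypothetical; only the 80 and 96
dword counts and the IP_VER(12, 50) threshold come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_LRC_PPHWSP_SZ	1u	/* in pages */

/* Version encoding: major in the high byte, release in the low byte. */
#define IP_VER(ver, rel)	((ver) << 8 | (rel))

/* Dword counts taken from the patch. */
#define LR_HW_CONTEXT_SIZE	(80 * sizeof(uint32_t))
#define XEHP_LR_HW_CONTEXT_SIZE	(96 * sizeof(uint32_t))

/* Pick the HW context size for a given full graphics version. */
static size_t lr_hw_context_size(unsigned int graphics_ver_full)
{
	return graphics_ver_full >= IP_VER(12, 50) ?
	       XEHP_LR_HW_CONTEXT_SIZE : LR_HW_CONTEXT_SIZE;
}

/* Bytes skipped at the start of the context image: PPHWSP plus HW context. */
static size_t lrc_skip_size(unsigned int graphics_ver_full)
{
	return SKETCH_LRC_PPHWSP_SZ * SKETCH_PAGE_SIZE +
	       lr_hw_context_size(graphics_ver_full);
}
```

Under these assumptions the skip size is 4096 + 320 = 4416 bytes before
12.50 and 4096 + 384 = 4480 bytes from 12.50 onwards, so the engine state
size reported to the GuC shrinks by 64 bytes on the newer platforms.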

* [PATCH 11/12] drm/i915/guc: Don't abort on CTB_UNUSED status
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

When the KMD sends a CLIENT_RESET request to GuC (as part of the
suspend sequence), GuC will mark the CTB buffer as 'UNUSED'. If the
KMD then checked the CTB queue, it would see a non-zero status value
and report the buffer as corrupted.

Technically, no G2H messages should be received once the CLIENT_RESET
has been sent. However, if a context was outstanding on an engine then
it would get reset and a reset notification would be sent. So, don't
actually treat UNUSED as a catastrophic error. Just flag it up as
unexpected and keep going.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 .../i915/gt/uc/abi/guc_communication_ctb_abi.h |  8 +++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c      | 18 ++++++++++++++++--
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index df83c1cc7c7a6..28b8387f97b77 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -37,6 +37,7 @@
  *  |   |       |   - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too large)     |
  *  |   |       |   - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated message)      |
  *  |   |       |   - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail modified)      |
+ *  |   |       |   - _`GUC_CTB_STATUS_UNUSED` = 8 (CTB is not in use)         |
  *  +---+-------+--------------------------------------------------------------+
  *  |...|       | RESERVED = MBZ                                               |
  *  +---+-------+--------------------------------------------------------------+
@@ -49,9 +50,10 @@ struct guc_ct_buffer_desc {
 	u32 tail;
 	u32 status;
 #define GUC_CTB_STATUS_NO_ERROR				0
-#define GUC_CTB_STATUS_OVERFLOW				(1 << 0)
-#define GUC_CTB_STATUS_UNDERFLOW			(1 << 1)
-#define GUC_CTB_STATUS_MISMATCH				(1 << 2)
+#define GUC_CTB_STATUS_OVERFLOW				BIT(0)
+#define GUC_CTB_STATUS_UNDERFLOW			BIT(1)
+#define GUC_CTB_STATUS_MISMATCH				BIT(2)
+#define GUC_CTB_STATUS_UNUSED				BIT(3)
 	u32 reserved[13];
 } __packed;
 static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index f01325cd1b625..11b5d4ddb19ce 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -816,8 +816,22 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	if (unlikely(ctb->broken))
 		return -EPIPE;
 
-	if (unlikely(desc->status))
-		goto corrupted;
+	if (unlikely(desc->status)) {
+		u32 status = desc->status;
+
+		if (status & GUC_CTB_STATUS_UNUSED) {
+			/*
+			 * Potentially valid if a CLIENT_RESET request resulted in
+			 * contexts/engines being reset. But should never happen as
+			 * no contexts should be active when CLIENT_RESET is sent.
+			 */
+			CT_ERROR(ct, "Unexpected G2H after GuC has stopped!\n");
+			status &= ~GUC_CTB_STATUS_UNUSED;
+		}
+
+		if (status)
+			goto corrupted;
+	}
 
 	GEM_BUG_ON(head > size);
 
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread
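
The status handling introduced by patch 11 above boils down to masking
off the UNUSED bit before deciding whether the descriptor is corrupted.
A minimal standalone sketch of that decision follows; the helper name
ctb_status_is_fatal() and its "unexpected" out-parameter are
hypothetical, while the bit assignments are the ones from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))

#define GUC_CTB_STATUS_NO_ERROR		0
#define GUC_CTB_STATUS_OVERFLOW		BIT(0)
#define GUC_CTB_STATUS_UNDERFLOW	BIT(1)
#define GUC_CTB_STATUS_MISMATCH		BIT(2)
#define GUC_CTB_STATUS_UNUSED		BIT(3)

/*
 * Hypothetical helper: returns true when the status should be treated as
 * corruption. *unexpected is set when UNUSED was seen, which the patch
 * reports as an error message but does not treat as fatal on its own.
 */
static bool ctb_status_is_fatal(uint32_t status, bool *unexpected)
{
	*unexpected = (status & GUC_CTB_STATUS_UNUSED) != 0;

	/* Ignore UNUSED; any other remaining bit still means corruption. */
	status &= ~GUC_CTB_STATUS_UNUSED;

	return status != 0;
}
```

So a status of UNUSED alone is logged and then ignored, whereas UNUSED
combined with OVERFLOW, UNDERFLOW or MISMATCH still takes the corrupted
path.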

* [PATCH 12/12] drm/i915/guc: Add a helper for log buffer size
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
@ 2022-07-12 23:31   ` John.C.Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John.C.Harrison @ 2022-07-12 23:31 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Alan Previn, DRI-Devel

From: Alan Previn <alan.previn.teres.alexis@intel.com>

Add a helper to get GuC log buffer size.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 49 ++++++++++++----------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index ff091adb56096..4bb81d2cf3828 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -15,6 +15,32 @@
 
 static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log);
 
+static u32 intel_guc_log_size(struct intel_guc_log *log)
+{
+	/*
+	 *  GuC Log buffer Layout:
+	 *
+	 *  NB: Ordering must follow "enum guc_log_buffer_type".
+	 *
+	 *  +===============================+ 00B
+	 *  |      Debug state header       |
+	 *  +-------------------------------+ 32B
+	 *  |    Crash dump state header    |
+	 *  +-------------------------------+ 64B
+	 *  |     Capture state header      |
+	 *  +-------------------------------+ 96B
+	 *  |                               |
+	 *  +===============================+ PAGE_SIZE (4KB)
+	 *  |          Debug logs           |
+	 *  +===============================+ + DEBUG_SIZE
+	 *  |        Crash Dump logs        |
+	 *  +===============================+ + CRASH_SIZE
+	 *  |         Capture logs          |
+	 *  +===============================+ + CAPTURE_SIZE
+	 */
+	return PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + CAPTURE_BUFFER_SIZE;
+}
+
 /**
  * DOC: GuC firmware log
  *
@@ -461,32 +487,11 @@ int intel_guc_log_create(struct intel_guc_log *log)
 
 	GEM_BUG_ON(log->vma);
 
-	/*
-	 *  GuC Log buffer Layout
-	 * (this ordering must follow "enum guc_log_buffer_type" definition)
-	 *
-	 *  +===============================+ 00B
-	 *  |      Debug state header       |
-	 *  +-------------------------------+ 32B
-	 *  |    Crash dump state header    |
-	 *  +-------------------------------+ 64B
-	 *  |     Capture state header      |
-	 *  +-------------------------------+ 96B
-	 *  |                               |
-	 *  +===============================+ PAGE_SIZE (4KB)
-	 *  |          Debug logs           |
-	 *  +===============================+ + DEBUG_SIZE
-	 *  |        Crash Dump logs        |
-	 *  +===============================+ + CRASH_SIZE
-	 *  |         Capture logs          |
-	 *  +===============================+ + CAPTURE_SIZE
-	 */
 	if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE)
 		DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n",
 			 CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc));
 
-	guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE +
-		       CAPTURE_BUFFER_SIZE;
+	guc_log_size = intel_guc_log_size(log);
 
 	vma = intel_guc_allocate_vma(guc, guc_log_size);
 	if (IS_ERR(vma)) {
-- 
2.36.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread
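
The layout diagram that patch 12 above moves into the helper can be
read as a running sum of section sizes after a one-page header area.
The sketch below works that through with made-up sizes: every
SKETCH_* value and both function names are hypothetical, since the
driver's real DEBUG/CRASH/CAPTURE buffer sizes depend on its build
configuration. Only the ordering (debug, crash dump, capture) is taken
from the diagram.

```c
#include <stdint.h>

/* Hypothetical sizes, for illustration only. */
#define SKETCH_PAGE_SIZE		(4u * 1024)
#define SKETCH_DEBUG_BUFFER_SIZE	(64u * 1024)
#define SKETCH_CRASH_BUFFER_SIZE	(8u * 1024)
#define SKETCH_CAPTURE_BUFFER_SIZE	(16u * 1024)

/* Section order as drawn in the layout diagram. */
enum log_section { LOG_DEBUG, LOG_CRASH, LOG_CAPTURE, LOG_SECTIONS };

static const uint32_t section_size[LOG_SECTIONS] = {
	[LOG_DEBUG]   = SKETCH_DEBUG_BUFFER_SIZE,
	[LOG_CRASH]   = SKETCH_CRASH_BUFFER_SIZE,
	[LOG_CAPTURE] = SKETCH_CAPTURE_BUFFER_SIZE,
};

/* Byte offset of a section; the state headers occupy the first page. */
static uint32_t log_section_offset(enum log_section s)
{
	uint32_t offset = SKETCH_PAGE_SIZE;

	for (int i = 0; i < (int)s; i++)
		offset += section_size[i];

	return offset;
}

/* Total allocation: header page plus all three sections. */
static uint32_t log_total_size(void)
{
	return log_section_offset(LOG_SECTIONS);
}
```

With these example sizes the debug logs start at 4096, the crash dump
logs at 69632, the capture logs at 77824, and the whole buffer is 94208
bytes.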

* [Intel-gfx] ✗ Fi.CI.BUILD: warning for Random assortment of (mostly) GuC related patches
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
                   ` (12 preceding siblings ...)
  (?)
@ 2022-07-13  0:31 ` Patchwork
  -1 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2022-07-13  0:31 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

== Series Details ==

Series: Random assortment of (mostly) GuC related patches
URL   : https://patchwork.freedesktop.org/series/106272/
State : warning

== Summary ==

Error: patch https://patchwork.freedesktop.org/api/1.0/series/106272/revisions/1/mbox/ not found



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 12/12] drm/i915/guc: Add a helper for log buffer size
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
@ 2022-07-13  0:46     ` Matthew Brost
  -1 siblings, 0 replies; 56+ messages in thread
From: Matthew Brost @ 2022-07-13  0:46 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX, DRI-Devel, Alan Previn

On Tue, Jul 12, 2022 at 04:31:36PM -0700, John.C.Harrison@Intel.com wrote:
> From: Alan Previn <alan.previn.teres.alexis@intel.com>
> 
> Add a helper to get GuC log buffer size.
> 
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>

John - you need to add a Signed-off-by since you posted.

Patch LGTM:

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 49 ++++++++++++----------
>  1 file changed, 27 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> index ff091adb56096..4bb81d2cf3828 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> @@ -15,6 +15,32 @@
>  
>  static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log);
>  
> +static u32 intel_guc_log_size(struct intel_guc_log *log)
> +{
> +	/*
> +	 *  GuC Log buffer Layout:
> +	 *
> +	 *  NB: Ordering must follow "enum guc_log_buffer_type".
> +	 *
> +	 *  +===============================+ 00B
> +	 *  |      Debug state header       |
> +	 *  +-------------------------------+ 32B
> +	 *  |    Crash dump state header    |
> +	 *  +-------------------------------+ 64B
> +	 *  |     Capture state header      |
> +	 *  +-------------------------------+ 96B
> +	 *  |                               |
> +	 *  +===============================+ PAGE_SIZE (4KB)
> +	 *  |          Debug logs           |
> +	 *  +===============================+ + DEBUG_SIZE
> +	 *  |        Crash Dump logs        |
> +	 *  +===============================+ + CRASH_SIZE
> +	 *  |         Capture logs          |
> +	 *  +===============================+ + CAPTURE_SIZE
> +	 */
> +	return PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + CAPTURE_BUFFER_SIZE;
> +}
> +
>  /**
>   * DOC: GuC firmware log
>   *
> @@ -461,32 +487,11 @@ int intel_guc_log_create(struct intel_guc_log *log)
>  
>  	GEM_BUG_ON(log->vma);
>  
> -	/*
> -	 *  GuC Log buffer Layout
> -	 * (this ordering must follow "enum guc_log_buffer_type" definition)
> -	 *
> -	 *  +===============================+ 00B
> -	 *  |      Debug state header       |
> -	 *  +-------------------------------+ 32B
> -	 *  |    Crash dump state header    |
> -	 *  +-------------------------------+ 64B
> -	 *  |     Capture state header      |
> -	 *  +-------------------------------+ 96B
> -	 *  |                               |
> -	 *  +===============================+ PAGE_SIZE (4KB)
> -	 *  |          Debug logs           |
> -	 *  +===============================+ + DEBUG_SIZE
> -	 *  |        Crash Dump logs        |
> -	 *  +===============================+ + CRASH_SIZE
> -	 *  |         Capture logs          |
> -	 *  +===============================+ + CAPTURE_SIZE
> -	 */
>  	if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE)
>  		DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n",
>  			 CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc));
>  
> -	guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE +
> -		       CAPTURE_BUFFER_SIZE;
> +	guc_log_size = intel_guc_log_size(log);
>  
>  	vma = intel_guc_allocate_vma(guc, guc_log_size);
>  	if (IS_ERR(vma)) {
> -- 
> 2.36.0
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Intel-gfx] [PATCH 12/12] drm/i915/guc: Add a helper for log buffer size
@ 2022-07-13  0:46     ` Matthew Brost
  0 siblings, 0 replies; 56+ messages in thread
From: Matthew Brost @ 2022-07-13  0:46 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX, DRI-Devel, Alan Previn

On Tue, Jul 12, 2022 at 04:31:36PM -0700, John.C.Harrison@Intel.com wrote:
> From: Alan Previn <alan.previn.teres.alexis@intel.com>
> 
> Add a helper to get GuC log buffer size.
> 
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>

John - you need to add a Signed-off-by since you posted.

Patch LGTM:

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 49 ++++++++++++----------
>  1 file changed, 27 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> index ff091adb56096..4bb81d2cf3828 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> @@ -15,6 +15,32 @@
>  
>  static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log);
>  
> +static u32 intel_guc_log_size(struct intel_guc_log *log)
> +{
> +	/*
> +	 *  GuC Log buffer Layout:
> +	 *
> +	 *  NB: Ordering must follow "enum guc_log_buffer_type".
> +	 *
> +	 *  +===============================+ 00B
> +	 *  |      Debug state header       |
> +	 *  +-------------------------------+ 32B
> +	 *  |    Crash dump state header    |
> +	 *  +-------------------------------+ 64B
> +	 *  |     Capture state header      |
> +	 *  +-------------------------------+ 96B
> +	 *  |                               |
> +	 *  +===============================+ PAGE_SIZE (4KB)
> +	 *  |          Debug logs           |
> +	 *  +===============================+ + DEBUG_SIZE
> +	 *  |        Crash Dump logs        |
> +	 *  +===============================+ + CRASH_SIZE
> +	 *  |         Capture logs          |
> +	 *  +===============================+ + CAPTURE_SIZE
> +	 */
> +	return PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + CAPTURE_BUFFER_SIZE;
> +}
> +
>  /**
>   * DOC: GuC firmware log
>   *
> @@ -461,32 +487,11 @@ int intel_guc_log_create(struct intel_guc_log *log)
>  
>  	GEM_BUG_ON(log->vma);
>  
> -	/*
> -	 *  GuC Log buffer Layout
> -	 * (this ordering must follow "enum guc_log_buffer_type" definition)
> -	 *
> -	 *  +===============================+ 00B
> -	 *  |      Debug state header       |
> -	 *  +-------------------------------+ 32B
> -	 *  |    Crash dump state header    |
> -	 *  +-------------------------------+ 64B
> -	 *  |     Capture state header      |
> -	 *  +-------------------------------+ 96B
> -	 *  |                               |
> -	 *  +===============================+ PAGE_SIZE (4KB)
> -	 *  |          Debug logs           |
> -	 *  +===============================+ + DEBUG_SIZE
> -	 *  |        Crash Dump logs        |
> -	 *  +===============================+ + CRASH_SIZE
> -	 *  |         Capture logs          |
> -	 *  +===============================+ + CAPTURE_SIZE
> -	 */
>  	if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE)
>  		DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n",
>  			 CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc));
>  
> -	guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE +
> -		       CAPTURE_BUFFER_SIZE;
> +	guc_log_size = intel_guc_log_size(log);
>  
>  	vma = intel_guc_allocate_vma(guc, guc_log_size);
>  	if (IS_ERR(vma)) {
> -- 
> 2.36.0
> 
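
For reference, the layout described in the quoted comment can be modelled in standalone C as below. This is only a sketch: the `EX_*` sizes are made-up placeholders (the real `DEBUG_BUFFER_SIZE`, `CRASH_BUFFER_SIZE` and `CAPTURE_BUFFER_SIZE` come from the driver configuration), and `ex_section_offset()` / `ex_log_size()` are hypothetical names, not functions from the patch.

```c
#include <stdint.h>

/* Placeholder sizes for illustration only, not the driver's real values. */
#define EX_PAGE_SIZE           4096u
#define EX_DEBUG_BUFFER_SIZE   (64u * 1024u)
#define EX_CRASH_BUFFER_SIZE   (8u * 1024u)
#define EX_CAPTURE_BUFFER_SIZE (16u * 1024u)

/* Order follows the layout diagram: debug, crash dump, capture. */
enum ex_log_section {
	EX_DEBUG,
	EX_CRASH,
	EX_CAPTURE,
	EX_NUM_SECTIONS
};

static const uint32_t ex_section_size[EX_NUM_SECTIONS] = {
	[EX_DEBUG]   = EX_DEBUG_BUFFER_SIZE,
	[EX_CRASH]   = EX_CRASH_BUFFER_SIZE,
	[EX_CAPTURE] = EX_CAPTURE_BUFFER_SIZE,
};

/*
 * Byte offset of a section's log data. The first page holds the state
 * headers, so the data starts at EX_PAGE_SIZE and each section follows
 * the previous one. Passing EX_NUM_SECTIONS yields the end of the buffer.
 */
static uint32_t ex_section_offset(enum ex_log_section s)
{
	uint32_t off = EX_PAGE_SIZE;

	for (int i = 0; i < (int)s; i++)
		off += ex_section_size[i];

	return off;
}

/* Total size: header page plus all three log sections. */
static uint32_t ex_log_size(void)
{
	return ex_section_offset(EX_NUM_SECTIONS);
}
```

With these placeholder sizes, `ex_log_size()` is the same sum the new helper returns: one page plus the debug, crash and capture sizes.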

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 09/12] drm/i915/selftest: Cope with not having an RCS engine
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
@ 2022-07-13  0:48     ` Matthew Brost
  -1 siblings, 0 replies; 56+ messages in thread
From: Matthew Brost @ 2022-07-13  0:48 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX, DRI-Devel

On Tue, Jul 12, 2022 at 04:31:33PM -0700, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> It is no longer guaranteed that there will always be an RCS engine.
> So, use the helper function for finding the first available engine that
> can be used for general purpose selftests.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 6493265d5f642..7f3bb1d34dfbf 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -1302,13 +1302,15 @@ static int igt_reset_wait(void *arg)
>  {
>  	struct intel_gt *gt = arg;
>  	struct i915_gpu_error *global = &gt->i915->gpu_error;
> -	struct intel_engine_cs *engine = gt->engine[RCS0];
> +	struct intel_engine_cs *engine;
>  	struct i915_request *rq;
>  	unsigned int reset_count;
>  	struct hang h;
>  	long timeout;
>  	int err;
>  
> +	engine = intel_selftest_find_any_engine(gt);
> +
>  	if (!engine || !intel_engine_can_store_dword(engine))
>  		return 0;
>  
> @@ -1432,7 +1434,7 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
>  				 int (*fn)(void *),
>  				 unsigned int flags)
>  {
> -	struct intel_engine_cs *engine = gt->engine[RCS0];
> +	struct intel_engine_cs *engine;
>  	struct drm_i915_gem_object *obj;
>  	struct task_struct *tsk = NULL;
>  	struct i915_request *rq;
> @@ -1444,6 +1446,8 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
>  	if (!gt->ggtt->num_fences && flags & EXEC_OBJECT_NEEDS_FENCE)
>  		return 0;
>  
> +	engine = intel_selftest_find_any_engine(gt);
> +
>  	if (!engine || !intel_engine_can_store_dword(engine))
>  		return 0;
>  
> @@ -1819,12 +1823,14 @@ static int igt_handle_error(void *arg)
>  {
>  	struct intel_gt *gt = arg;
>  	struct i915_gpu_error *global = &gt->i915->gpu_error;
> -	struct intel_engine_cs *engine = gt->engine[RCS0];
> +	struct intel_engine_cs *engine;
>  	struct hang h;
>  	struct i915_request *rq;
>  	struct i915_gpu_coredump *error;
>  	int err;
>  
> +	engine = intel_selftest_find_any_engine(gt);
> +
>  	/* Check that we can issue a global GPU and engine reset */
>  
>  	if (!intel_has_reset_engine(gt))
> -- 
> 2.36.0
> 
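
As an illustration of the idea in the commit message (pick the first engine that exists rather than assuming RCS0), here is a small standalone model. It is not the implementation of `intel_selftest_find_any_engine()`; the `ex_*` types, the `usable` flag and the fixed-size array are assumptions made purely for the sketch.

```c
#include <stddef.h>

#define EX_NUM_ENGINES 4

/* Hypothetical stand-in for an engine descriptor. */
struct ex_engine {
	const char *name;
	int usable;	/* non-zero if general-purpose tests can run on it */
};

/* Hypothetical stand-in for the GT; NULL slots model absent engines. */
struct ex_gt {
	struct ex_engine *engine[EX_NUM_ENGINES];
};

/*
 * Return the first engine that is both present and usable, or NULL if
 * there is none. Callers must handle NULL, much as the selftests above
 * return early when no engine is found.
 */
static struct ex_engine *ex_find_any_engine(const struct ex_gt *gt)
{
	for (size_t i = 0; i < EX_NUM_ENGINES; i++) {
		struct ex_engine *e = gt->engine[i];

		if (e && e->usable)
			return e;
	}

	return NULL;
}
```

In this model slot 0 plays the role of RCS0: when it is NULL, the lookup simply falls through to the next populated slot instead of dereferencing a missing engine.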

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/12] drm/i915/guc: Route semaphores to GuC for Gen12+
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
@ 2022-07-13  0:51     ` Matthew Brost
  -1 siblings, 0 replies; 56+ messages in thread
From: Matthew Brost @ 2022-07-13  0:51 UTC (permalink / raw)
  To: John.C.Harrison; +Cc: Intel-GFX, Michał Winiarski, DRI-Devel

On Tue, Jul 12, 2022 at 04:31:31PM -0700, John.C.Harrison@Intel.com wrote:
> From: Michał Winiarski <michal.winiarski@intel.com>
> 
> Since we're going to use semaphores in selftests (and eventually in
> regular GuC submission), let's route semaphores to GuC.

I don't think this comment is correct; we have no plans to use
semaphores in GuC submission. Still, if we want semaphores to work with
GuC submission, they should be routed to the GuC.

> 
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

Again John, add a SoB for any patch you post.

With a better commit message and SoB update:
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h        |  4 ++++
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 14 ++++++++++++++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> index 8dc063f087eb1..a7092f711e9cd 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> @@ -102,6 +102,10 @@
>  #define   GUC_SEND_TRIGGER		  (1<<0)
>  #define GEN11_GUC_HOST_INTERRUPT	_MMIO(0x1901f0)
>  
> +#define GEN12_GUC_SEM_INTR_ENABLES	_MMIO(0xc71c)
> +#define   GUC_SEM_INTR_ROUTE_TO_GUC	BIT(31)
> +#define   GUC_SEM_INTR_ENABLE_ALL	(0xff)
> +
>  #define GUC_NUM_DOORBELLS		256
>  
>  /* format of the HW-monitored doorbell cacheline */
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 40f726c61e951..7537459080278 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3953,13 +3953,27 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine)
>  
>  void intel_guc_submission_enable(struct intel_guc *guc)
>  {
> +	struct intel_gt *gt = guc_to_gt(guc);
> +
> +	/* Enable and route to GuC */
> +	if (GRAPHICS_VER(gt->i915) >= 12)
> +		intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES,
> +				   GUC_SEM_INTR_ROUTE_TO_GUC |
> +				   GUC_SEM_INTR_ENABLE_ALL);
> +
>  	guc_init_lrc_mapping(guc);
>  	guc_init_engine_stats(guc);
>  }
>  
>  void intel_guc_submission_disable(struct intel_guc *guc)
>  {
> +	struct intel_gt *gt = guc_to_gt(guc);
> +
>  	/* Note: By the time we're here, GuC may have already been reset */
> +
> +	/* Disable and route to host */
> +	if (GRAPHICS_VER(gt->i915) >= 12)
> +		intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES, 0x0);
>  }
>  
>  static bool __guc_submission_supported(struct intel_guc *guc)
> -- 
> 2.36.0
> 
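
To make the register values concrete, the enable/disable pair in the quoted patch can be modelled in standalone C as follows. This is a sketch under stated assumptions: `struct ex_regs` is a fake register file standing in for the MMIO write, and the `EX_*` and `ex_*` names are hypothetical; only the bit values (bit 31 for routing, 0xff for the enables) are taken from the patch.

```c
#include <stdint.h>

/* Bit values copied from the patch; names are local to this sketch. */
#define EX_BIT(n)			(UINT32_C(1) << (n))
#define EX_SEM_INTR_ROUTE_TO_GUC	EX_BIT(31)
#define EX_SEM_INTR_ENABLE_ALL		UINT32_C(0xff)

/* Fake register file standing in for the real MMIO register. */
struct ex_regs {
	uint32_t sem_intr_enables;
};

/* Enable the semaphore interrupts and route them to the GuC (ver >= 12 only). */
static void ex_submission_enable(struct ex_regs *regs, int graphics_ver)
{
	if (graphics_ver >= 12)
		regs->sem_intr_enables = EX_SEM_INTR_ROUTE_TO_GUC |
					 EX_SEM_INTR_ENABLE_ALL;
}

/* Clear the register: interrupts disabled and routed back to the host. */
static void ex_submission_disable(struct ex_regs *regs, int graphics_ver)
{
	if (graphics_ver >= 12)
		regs->sem_intr_enables = 0;
}
```

On a version 12 or later part the enable path writes 0x800000ff in this model; on older parts the register is left untouched in both directions.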

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Random assortment of (mostly) GuC related patches (rev2)
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
                   ` (13 preceding siblings ...)
  (?)
@ 2022-07-13 20:09 ` Patchwork
  -1 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2022-07-13 20:09 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

== Series Details ==

Series: Random assortment of (mostly) GuC related patches (rev2)
URL   : https://patchwork.freedesktop.org/series/106272/
State : warning

== Summary ==

Error: dim checkpatch failed
dea3dbd81ade drm/i915: Remove bogus GEM_BUG_ON in unpark
909eeb72fb3d drm/i915/guc: Don't call ring_is_idle in GuC submission
53760fb38f4a drm/i915/guc: Fix issues with live_preempt_cancel
f56a0551c6ad drm/i915/guc: Add GuC <-> kernel time stamp translation information
9ec48f9590e4 drm/i915/guc: Record CTB info in error logs
9de35e4d864c drm/i915/guc: Use streaming loads to speed up dumping the guc log
263f1e938c10 drm/i915/guc: Route semaphores to GuC for Gen12+
cf9a04633964 drm/i915/guc: Add selftest for a hung GuC
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 6, in <module>
    from ply import lex, yacc
ModuleNotFoundError: No module named 'ply'
-:22: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#22: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 170 lines checked
e9a1fdbc7625 drm/i915/selftest: Cope with not having an RCS engine
6bac7cf93b56 drm/i915/guc: Support larger contexts on newer hardware
4d462f42d5f1 drm/i915/guc: Don't abort on CTB_UNUSED status
85fefd30d816 drm/i915/guc: Add a helper for log buffer size



^ permalink raw reply	[flat|nested] 56+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for Random assortment of (mostly) GuC related patches (rev2)
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
                   ` (14 preceding siblings ...)
  (?)
@ 2022-07-13 20:27 ` Patchwork
  -1 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2022-07-13 20:27 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 10186 bytes --]

== Series Details ==

Series: Random assortment of (mostly) GuC related patches (rev2)
URL   : https://patchwork.freedesktop.org/series/106272/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11877 -> Patchwork_106272v2
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/index.html

Participating hosts (39 -> 39)
------------------------------

  Additional (5): bat-adlm-1 bat-dg2-9 bat-adlp-4 bat-jsl-3 bat-rpls-1 
  Missing    (5): fi-kbl-soraka fi-bxt-dsi fi-rkl-11600 bat-dg1-5 bat-adln-1 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_106272v2:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - {bat-adlm-1}:       NOTRUN -> [SKIP][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlm-1/igt@kms_pipe_crc_basic@suspend-read-crc.html

  
New tests
---------

  New tests have been introduced between CI_DRM_11877 and Patchwork_106272v2:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc_hang:
    - Statuses : 1 dmesg-warn(s) 31 pass(s)
    - Exec time: [0.42, 4.13] s

  

Known issues
------------

  Here are the changes found in Patchwork_106272v2 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_lmem_swapping@verify-random:
    - bat-adlp-4:         NOTRUN -> [SKIP][2] ([i915#4613]) +3 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@gem_lmem_swapping@verify-random.html

  * igt@gem_tiled_pread_basic:
    - bat-adlp-4:         NOTRUN -> [SKIP][3] ([i915#3282])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@gem_tiled_pread_basic.html

  * igt@i915_selftest@live@gem:
    - fi-blb-e6850:       NOTRUN -> [DMESG-FAIL][4] ([i915#4528])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/fi-blb-e6850/igt@i915_selftest@live@gem.html

  * {igt@i915_selftest@live@guc_hang} (NEW):
    - {bat-dg2-8}:        NOTRUN -> [DMESG-WARN][5] ([i915#5763])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-dg2-8/igt@i915_selftest@live@guc_hang.html

  * igt@i915_selftest@live@hangcheck:
    - fi-hsw-g3258:       [PASS][6] -> [INCOMPLETE][7] ([i915#3303] / [i915#4785])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/fi-hsw-g3258/igt@i915_selftest@live@hangcheck.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/fi-hsw-g3258/igt@i915_selftest@live@hangcheck.html
    - fi-snb-2600:        [PASS][8] -> [INCOMPLETE][9] ([i915#3921])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/fi-snb-2600/igt@i915_selftest@live@hangcheck.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/fi-snb-2600/igt@i915_selftest@live@hangcheck.html

  * igt@i915_suspend@basic-s3-without-i915:
    - bat-adlp-4:         NOTRUN -> [SKIP][10] ([i915#5903])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@i915_suspend@basic-s3-without-i915.html

  * igt@kms_chamelium@dp-crc-fast:
    - bat-adlp-4:         NOTRUN -> [SKIP][11] ([fdo#111827]) +8 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor:
    - bat-adlp-4:         NOTRUN -> [SKIP][12] ([i915#4103])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@kms_cursor_legacy@basic-busy-flip-before-cursor.html

  * igt@kms_force_connector_basic@prune-stale-modes:
    - bat-adlp-4:         NOTRUN -> [SKIP][13] ([i915#4093]) +3 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@kms_force_connector_basic@prune-stale-modes.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - bat-adlp-4:         NOTRUN -> [SKIP][14] ([i915#3555] / [i915#4579])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@kms_setmode@basic-clone-single-crtc.html

  * igt@prime_vgem@basic-userptr:
    - bat-adlp-4:         NOTRUN -> [SKIP][15] ([fdo#109295] / [i915#3301] / [i915#3708])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@prime_vgem@basic-userptr.html

  * igt@prime_vgem@basic-write:
    - bat-adlp-4:         NOTRUN -> [SKIP][16] ([fdo#109295] / [i915#3291] / [i915#3708]) +2 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/bat-adlp-4/igt@prime_vgem@basic-write.html

  * igt@runner@aborted:
    - fi-hsw-g3258:       NOTRUN -> [FAIL][17] ([fdo#109271] / [i915#4312] / [i915#6246])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/fi-hsw-g3258/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@i915_module_load@load:
    - {fi-tgl-dsi}:       [DMESG-WARN][18] ([i915#1982]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/fi-tgl-dsi/igt@i915_module_load@load.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/fi-tgl-dsi/igt@i915_module_load@load.html

  * igt@i915_selftest@live@requests:
    - fi-blb-e6850:       [DMESG-FAIL][20] ([i915#4528]) -> [PASS][21]
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/fi-blb-e6850/igt@i915_selftest@live@requests.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/fi-blb-e6850/igt@i915_selftest@live@requests.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1155]: https://gitlab.freedesktop.org/drm/intel/issues/1155
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#1849]: https://gitlab.freedesktop.org/drm/intel/issues/1849
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#3003]: https://gitlab.freedesktop.org/drm/intel/issues/3003
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3595]: https://gitlab.freedesktop.org/drm/intel/issues/3595
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4093]: https://gitlab.freedesktop.org/drm/intel/issues/4093
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4215]: https://gitlab.freedesktop.org/drm/intel/issues/4215
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4528]: https://gitlab.freedesktop.org/drm/intel/issues/4528
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4785]: https://gitlab.freedesktop.org/drm/intel/issues/4785
  [i915#4873]: https://gitlab.freedesktop.org/drm/intel/issues/4873
  [i915#5087]: https://gitlab.freedesktop.org/drm/intel/issues/5087
  [i915#5174]: https://gitlab.freedesktop.org/drm/intel/issues/5174
  [i915#5190]: https://gitlab.freedesktop.org/drm/intel/issues/5190
  [i915#5274]: https://gitlab.freedesktop.org/drm/intel/issues/5274
  [i915#5763]: https://gitlab.freedesktop.org/drm/intel/issues/5763
  [i915#5903]: https://gitlab.freedesktop.org/drm/intel/issues/5903
  [i915#6246]: https://gitlab.freedesktop.org/drm/intel/issues/6246
  [i915#6257]: https://gitlab.freedesktop.org/drm/intel/issues/6257


Build changes
-------------

  * Linux: CI_DRM_11877 -> Patchwork_106272v2

  CI-20190529: 20190529
  CI_DRM_11877: e55cefc370de5a38ee848aa96082d9d9f4cacdb9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6578: 7d289d89131ec37c1145bcdb86149914587d7406 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_106272v2: e55cefc370de5a38ee848aa96082d9d9f4cacdb9 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

9c3e17340f17 drm/i915/guc: Add a helper for log buffer size
9bb603f61427 drm/i915/guc: Don't abort on CTB_UNUSED status
e14a4c8dfb25 drm/i915/guc: Support larger contexts on newer hardware
4fb50ad3371f drm/i915/selftest: Cope with not having an RCS engine
9c6033853797 drm/i915/guc: Add selftest for a hung GuC
3e480d3a6e92 drm/i915/guc: Route semaphores to GuC for Gen12+
2e8931889eae drm/i915/guc: Use streaming loads to speed up dumping the guc log
ebe673745559 drm/i915/guc: Record CTB info in error logs
61bb2a4d4080 drm/i915/guc: Add GuC <-> kernel time stamp translation information
bef91fb07ac7 drm/i915/guc: Fix issues with live_preempt_cancel
95d709e73336 drm/i915/guc: Don't call ring_is_idle in GuC submission
73532cc967d9 drm/i915: Remove bogus GEM_BUG_ON in unpark

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/index.html

[-- Attachment #2: Type: text/html, Size: 9949 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for Random assortment of (mostly) GuC related patches (rev2)
  2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
                   ` (15 preceding siblings ...)
  (?)
@ 2022-07-14  1:41 ` Patchwork
  -1 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2022-07-14  1:41 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 49791 bytes --]

== Series Details ==

Series: Random assortment of (mostly) GuC related patches (rev2)
URL   : https://patchwork.freedesktop.org/series/106272/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11877_full -> Patchwork_106272v2_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_106272v2_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_106272v2_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 12)
------------------------------

  Missing    (1): shard-dg1 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_106272v2_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_fair@basic-deadline:
    - shard-kbl:          [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-kbl7/igt@gem_exec_fair@basic-deadline.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl6/igt@gem_exec_fair@basic-deadline.html

  * igt@i915_selftest@live@objects:
    - shard-skl:          [PASS][3] -> [INCOMPLETE][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl6/igt@i915_selftest@live@objects.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@i915_selftest@live@objects.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_whisper@basic-fds-priority-all:
    - {shard-rkl}:        [PASS][5] -> [DMESG-WARN][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@gem_exec_whisper@basic-fds-priority-all.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@gem_exec_whisper@basic-fds-priority-all.html

  
New tests
---------

  New tests have been introduced between CI_DRM_11877_full and Patchwork_106272v2_full:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc_hang:
    - Statuses : 8 pass(s)
    - Exec time: [0.48, 1.81] s

  

Known issues
------------

  Here are the changes found in Patchwork_106272v2_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
    - shard-kbl:          [PASS][7] -> [DMESG-WARN][8] ([i915#180]) +7 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-kbl1/igt@gem_ctx_isolation@preservation-s3@vcs0.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl4/igt@gem_ctx_isolation@preservation-s3@vcs0.html

  * igt@gem_eio@kms:
    - shard-tglb:         [PASS][9] -> [FAIL][10] ([i915#5784])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-tglb7/igt@gem_eio@kms.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb5/igt@gem_eio@kms.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-iclb:         NOTRUN -> [FAIL][11] ([i915#2842])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb2/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-glk7/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-glk7/igt@gem_exec_fair@basic-pace-share@rcs0.html
    - shard-tglb:         [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-tglb1/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb2/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-iclb:         [PASS][16] -> [FAIL][17] ([i915#2842])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb1/igt@gem_exec_fair@basic-pace@vecs0.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb4/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-iclb:         [PASS][18] -> [FAIL][19] ([i915#2849])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb6/igt@gem_exec_fair@basic-throttle@rcs0.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb5/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_whisper@basic-queues-forked-all:
    - shard-glk:          [PASS][20] -> [DMESG-WARN][21] ([i915#118])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-glk9/igt@gem_exec_whisper@basic-queues-forked-all.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-glk8/igt@gem_exec_whisper@basic-queues-forked-all.html

  * igt@gem_huc_copy@huc-copy:
    - shard-skl:          NOTRUN -> [SKIP][22] ([fdo#109271] / [i915#2190])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@heavy-verify-multi:
    - shard-skl:          NOTRUN -> [SKIP][23] ([fdo#109271] / [i915#4613]) +3 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl5/igt@gem_lmem_swapping@heavy-verify-multi.html

  * igt@gem_lmem_swapping@heavy-verify-random:
    - shard-tglb:         NOTRUN -> [SKIP][24] ([i915#4613])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@gem_lmem_swapping@heavy-verify-random.html

  * igt@gem_lmem_swapping@parallel-random:
    - shard-kbl:          NOTRUN -> [SKIP][25] ([fdo#109271] / [i915#4613]) +1 similar issue
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl7/igt@gem_lmem_swapping@parallel-random.html

  * igt@gem_pwrite@basic-exhaustion:
    - shard-kbl:          NOTRUN -> [WARN][26] ([i915#2658])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl6/igt@gem_pwrite@basic-exhaustion.html

  * igt@gem_pxp@create-regular-buffer:
    - shard-tglb:         NOTRUN -> [SKIP][27] ([i915#4270])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@gem_pxp@create-regular-buffer.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-skl:          NOTRUN -> [SKIP][28] ([fdo#109271] / [i915#3323])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-skl:          NOTRUN -> [FAIL][29] ([i915#3318])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@gem_userptr_blits@vma-merge.html

  * igt@gen7_exec_parse@bitmasks:
    - shard-tglb:         NOTRUN -> [SKIP][30] ([fdo#109289]) +1 similar issue
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@gen7_exec_parse@bitmasks.html

  * igt@i915_module_load@load:
    - shard-skl:          NOTRUN -> [SKIP][31] ([fdo#109271] / [i915#6227])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@i915_module_load@load.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-iclb:         [PASS][32] -> [FAIL][33] ([i915#454]) +1 similar issue
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb8/igt@i915_pm_dc@dc6-dpms.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb3/igt@i915_pm_dc@dc6-dpms.html
    - shard-skl:          NOTRUN -> [FAIL][34] ([i915#454])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_pm_dc@dc9-dpms:
    - shard-iclb:         [PASS][35] -> [SKIP][36] ([i915#4281])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb5/igt@i915_pm_dc@dc9-dpms.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb3/igt@i915_pm_dc@dc9-dpms.html

  * igt@i915_pm_rpm@modeset-non-lpsp-stress:
    - shard-tglb:         NOTRUN -> [SKIP][37] ([fdo#111644] / [i915#1397] / [i915#2411])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@i915_pm_rpm@modeset-non-lpsp-stress.html

  * igt@kms_big_fb@4-tiled-16bpp-rotate-270:
    - shard-tglb:         NOTRUN -> [SKIP][38] ([i915#5286]) +1 similar issue
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_big_fb@4-tiled-16bpp-rotate-270.html

  * igt@kms_big_fb@x-tiled-64bpp-rotate-90:
    - shard-tglb:         NOTRUN -> [SKIP][39] ([fdo#111614]) +1 similar issue
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_big_fb@x-tiled-64bpp-rotate-90.html

  * igt@kms_big_fb@y-tiled-8bpp-rotate-180:
    - shard-glk:          [PASS][40] -> [FAIL][41] ([i915#1888])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-glk7/igt@kms_big_fb@y-tiled-8bpp-rotate-180.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-glk7/igt@kms_big_fb@y-tiled-8bpp-rotate-180.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
    - shard-skl:          NOTRUN -> [FAIL][42] ([i915#3743]) +2 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl7/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html

  * igt@kms_big_fb@yf-tiled-8bpp-rotate-270:
    - shard-tglb:         NOTRUN -> [SKIP][43] ([fdo#111615]) +1 similar issue
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_big_fb@yf-tiled-8bpp-rotate-270.html

  * igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_gen12_mc_ccs:
    - shard-kbl:          NOTRUN -> [SKIP][44] ([fdo#109271] / [i915#3886]) +4 similar issues
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl6/igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic-4_tiled_dg2_rc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][45] ([i915#3689] / [i915#6095])
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_ccs@pipe-a-crc-sprite-planes-basic-4_tiled_dg2_rc_ccs.html

  * igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_mc_ccs:
    - shard-skl:          NOTRUN -> [SKIP][46] ([fdo#109271] / [i915#3886]) +8 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-b-crc-primary-basic-4_tiled_dg2_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][47] ([i915#6095]) +1 similar issue
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_ccs@pipe-b-crc-primary-basic-4_tiled_dg2_mc_ccs.html

  * igt@kms_ccs@pipe-c-crc-sprite-planes-basic-y_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][48] ([i915#3689]) +2 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_ccs@pipe-c-crc-sprite-planes-basic-y_tiled_ccs.html

  * igt@kms_color_chamelium@pipe-c-ctm-0-75:
    - shard-kbl:          NOTRUN -> [SKIP][49] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl7/igt@kms_color_chamelium@pipe-c-ctm-0-75.html

  * igt@kms_color_chamelium@pipe-d-ctm-blue-to-red:
    - shard-tglb:         NOTRUN -> [SKIP][50] ([fdo#109284] / [fdo#111827]) +1 similar issue
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_color_chamelium@pipe-d-ctm-blue-to-red.html

  * igt@kms_color_chamelium@pipe-d-ctm-green-to-red:
    - shard-skl:          NOTRUN -> [SKIP][51] ([fdo#109271] / [fdo#111827]) +19 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@kms_color_chamelium@pipe-d-ctm-green-to-red.html

  * igt@kms_color_chamelium@pipe-d-ctm-red-to-blue:
    - shard-apl:          NOTRUN -> [SKIP][52] ([fdo#109271] / [fdo#111827]) +1 similar issue
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl4/igt@kms_color_chamelium@pipe-d-ctm-red-to-blue.html

  * igt@kms_content_protection@dp-mst-type-1:
    - shard-tglb:         NOTRUN -> [SKIP][53] ([i915#3116] / [i915#3299])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_content_protection@dp-mst-type-1.html

  * igt@kms_content_protection@legacy:
    - shard-apl:          NOTRUN -> [TIMEOUT][54] ([i915#1319])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl3/igt@kms_content_protection@legacy.html

  * igt@kms_cursor_crc@cursor-suspend@pipe-a-dp-1:
    - shard-apl:          [PASS][55] -> [DMESG-WARN][56] ([i915#180])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl2/igt@kms_cursor_crc@cursor-suspend@pipe-a-dp-1.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl4/igt@kms_cursor_crc@cursor-suspend@pipe-a-dp-1.html

  * igt@kms_flip@2x-flip-vs-panning-interruptible:
    - shard-tglb:         NOTRUN -> [SKIP][57] ([fdo#109274] / [fdo#111825] / [i915#3637]) +1 similar issue
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_flip@2x-flip-vs-panning-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank@a-edp1:
    - shard-skl:          NOTRUN -> [FAIL][58] ([i915#79]) +1 similar issue
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl5/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html

  * igt@kms_flip@flip-vs-suspend-interruptible@c-edp1:
    - shard-skl:          NOTRUN -> [INCOMPLETE][59] ([i915#4939])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl4/igt@kms_flip@flip-vs-suspend-interruptible@c-edp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-4tile-to-64bpp-4tile-downscaling@pipe-a-valid-mode:
    - shard-tglb:         NOTRUN -> [SKIP][60] ([i915#2672])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_flip_scaled_crc@flip-32bpp-4tile-to-64bpp-4tile-downscaling@pipe-a-valid-mode.html

  * igt@kms_flip_scaled_crc@flip-32bpp-xtile-to-64bpp-xtile-downscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][61] ([i915#3555])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb2/igt@kms_flip_scaled_crc@flip-32bpp-xtile-to-64bpp-xtile-downscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-upscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][62] ([i915#2672]) +7 similar issues
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb3/igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-upscaling@pipe-a-default-mode.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-spr-indfb-fullscreen:
    - shard-apl:          NOTRUN -> [SKIP][63] ([fdo#109271]) +24 similar issues
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl3/igt@kms_frontbuffer_tracking@psr-2p-primscrn-spr-indfb-fullscreen.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-spr-indfb-move:
    - shard-tglb:         NOTRUN -> [SKIP][64] ([fdo#109280] / [fdo#111825]) +9 similar issues
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_frontbuffer_tracking@psr-2p-primscrn-spr-indfb-move.html

  * igt@kms_hdr@bpc-switch-dpms@pipe-a-dp-1:
    - shard-kbl:          NOTRUN -> [FAIL][65] ([i915#1188])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl7/igt@kms_hdr@bpc-switch-dpms@pipe-a-dp-1.html

  * igt@kms_plane_alpha_blend@pipe-c-alpha-transparent-fb:
    - shard-kbl:          NOTRUN -> [FAIL][66] ([i915#265])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl6/igt@kms_plane_alpha_blend@pipe-c-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max:
    - shard-skl:          NOTRUN -> [FAIL][67] ([fdo#108145] / [i915#265]) +1 similar issue
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max.html

  * igt@kms_plane_scaling@planes-downscale-factor-0-5@pipe-b-edp-1:
    - shard-skl:          NOTRUN -> [SKIP][68] ([fdo#109271]) +240 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@kms_plane_scaling@planes-downscale-factor-0-5@pipe-b-edp-1.html

  * igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25@pipe-c-edp-1:
    - shard-tglb:         NOTRUN -> [SKIP][69] ([i915#5235]) +3 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25@pipe-c-edp-1.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf:
    - shard-skl:          NOTRUN -> [SKIP][70] ([fdo#109271] / [i915#658]) +4 similar issues
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl7/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-sf:
    - shard-kbl:          NOTRUN -> [SKIP][71] ([fdo#109271] / [i915#658]) +2 similar issues
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl6/igt@kms_psr2_sf@cursor-plane-move-continuous-sf.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-tglb:         NOTRUN -> [FAIL][72] ([i915#132] / [i915#3467])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_psr@psr2_sprite_blt:
    - shard-iclb:         [PASS][73] -> [SKIP][74] ([fdo#109441]) +2 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb2/igt@kms_psr@psr2_sprite_blt.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb1/igt@kms_psr@psr2_sprite_blt.html

  * igt@kms_setmode@clone-exclusive-crtc:
    - shard-tglb:         NOTRUN -> [SKIP][75] ([i915#3555]) +1 similar issue
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@kms_setmode@clone-exclusive-crtc.html

  * igt@kms_vblank@pipe-d-wait-idle:
    - shard-skl:          NOTRUN -> [SKIP][76] ([fdo#109271] / [i915#533])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@kms_vblank@pipe-d-wait-idle.html

  * igt@kms_writeback@writeback-check-output:
    - shard-apl:          NOTRUN -> [SKIP][77] ([fdo#109271] / [i915#2437])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl4/igt@kms_writeback@writeback-check-output.html

  * igt@kms_writeback@writeback-pixel-formats:
    - shard-skl:          NOTRUN -> [SKIP][78] ([fdo#109271] / [i915#2437])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@kms_writeback@writeback-pixel-formats.html

  * igt@nouveau_crc@pipe-c-source-outp-inactive:
    - shard-tglb:         NOTRUN -> [SKIP][79] ([i915#2530])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@nouveau_crc@pipe-c-source-outp-inactive.html

  * igt@prime_nv_api@i915_nv_double_export:
    - shard-tglb:         NOTRUN -> [SKIP][80] ([fdo#109291]) +1 similar issue
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@prime_nv_api@i915_nv_double_export.html

  * igt@prime_nv_api@nv_i915_import_twice_check_flink_name:
    - shard-kbl:          NOTRUN -> [SKIP][81] ([fdo#109271]) +72 similar issues
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl7/igt@prime_nv_api@nv_i915_import_twice_check_flink_name.html

  * igt@sysfs_clients@create:
    - shard-skl:          NOTRUN -> [SKIP][82] ([fdo#109271] / [i915#2994]) +1 similar issue
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@sysfs_clients@create.html

  * igt@sysfs_clients@recycle:
    - shard-tglb:         NOTRUN -> [SKIP][83] ([i915#2994])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb7/igt@sysfs_clients@recycle.html

  * igt@sysfs_clients@sema-25:
    - shard-kbl:          NOTRUN -> [SKIP][84] ([fdo#109271] / [i915#2994])
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl7/igt@sysfs_clients@sema-25.html

  
#### Possible fixes ####

  * igt@feature_discovery@psr1:
    - {shard-rkl}:        [SKIP][85] ([i915#658]) -> [PASS][86] +1 similar issue
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-1/igt@feature_discovery@psr1.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@feature_discovery@psr1.html

  * igt@gem_ctx_persistence@legacy-engines-hang@blt:
    - {shard-rkl}:        [SKIP][87] ([i915#6252]) -> [PASS][88]
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@gem_ctx_persistence@legacy-engines-hang@blt.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-1/igt@gem_ctx_persistence@legacy-engines-hang@blt.html

  * igt@gem_ctx_persistence@many-contexts:
    - {shard-rkl}:        [FAIL][89] ([i915#2410]) -> [PASS][90]
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-1/igt@gem_ctx_persistence@many-contexts.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@gem_ctx_persistence@many-contexts.html

  * igt@gem_exec_balancer@parallel-keep-in-fence:
    - shard-iclb:         [SKIP][91] ([i915#4525]) -> [PASS][92] +1 similar issue
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb3/igt@gem_exec_balancer@parallel-keep-in-fence.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb2/igt@gem_exec_balancer@parallel-keep-in-fence.html

  * igt@gem_exec_fair@basic-flow@rcs0:
    - shard-tglb:         [FAIL][93] ([i915#2842]) -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-tglb3/igt@gem_exec_fair@basic-flow@rcs0.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb8/igt@gem_exec_fair@basic-flow@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-kbl:          [FAIL][95] ([i915#2842]) -> [PASS][96] +2 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-kbl7/igt@gem_exec_fair@basic-none@vcs1.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl1/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-none@vecs0:
    - shard-apl:          [FAIL][97] ([i915#2842]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl6/igt@gem_exec_fair@basic-none@vecs0.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl6/igt@gem_exec_fair@basic-none@vecs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
    - shard-iclb:         [FAIL][99] ([i915#2842]) -> [PASS][100] +1 similar issue
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb1/igt@gem_exec_fair@basic-pace@vcs1.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb4/igt@gem_exec_fair@basic-pace@vcs1.html

  * igt@gem_exec_reloc@basic-gtt-wc-noreloc:
    - {shard-rkl}:        [SKIP][101] ([i915#3281]) -> [PASS][102] +4 similar issues
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@gem_exec_reloc@basic-gtt-wc-noreloc.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@gem_exec_reloc@basic-gtt-wc-noreloc.html

  * igt@gem_exec_schedule@semaphore-power:
    - {shard-rkl}:        [SKIP][103] ([fdo#110254]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@gem_exec_schedule@semaphore-power.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@gem_exec_schedule@semaphore-power.html

  * igt@gem_huc_copy@huc-copy:
    - shard-tglb:         [SKIP][105] ([i915#2190]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-tglb7/igt@gem_huc_copy@huc-copy.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb5/igt@gem_huc_copy@huc-copy.html

  * igt@gem_pread@snoop:
    - {shard-rkl}:        [SKIP][107] ([i915#3282]) -> [PASS][108] +2 similar issues
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@gem_pread@snoop.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@gem_pread@snoop.html

  * igt@gem_spin_batch@user-each:
    - shard-skl:          [FAIL][109] ([i915#2898]) -> [PASS][110]
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl5/igt@gem_spin_batch@user-each.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl7/igt@gem_spin_batch@user-each.html

  * igt@gen9_exec_parse@batch-without-end:
    - {shard-rkl}:        [SKIP][111] ([i915#2527]) -> [PASS][112] +2 similar issues
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@gen9_exec_parse@batch-without-end.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@gen9_exec_parse@batch-without-end.html

  * igt@i915_pm_backlight@fade:
    - {shard-rkl}:        [SKIP][113] ([i915#3012]) -> [PASS][114]
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@i915_pm_backlight@fade.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@i915_pm_backlight@fade.html

  * igt@i915_pm_rpm@system-suspend-modeset:
    - {shard-rkl}:        [SKIP][115] ([fdo#109308]) -> [PASS][116]
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@i915_pm_rpm@system-suspend-modeset.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@i915_pm_rpm@system-suspend-modeset.html

  * igt@i915_selftest@live@gt_pm:
    - {shard-tglu}:       [DMESG-FAIL][117] ([i915#3987]) -> [PASS][118]
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-tglu-4/igt@i915_selftest@live@gt_pm.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglu-5/igt@i915_selftest@live@gt_pm.html

  * igt@i915_suspend@fence-restore-untiled:
    - shard-kbl:          [DMESG-WARN][119] ([i915#180]) -> [PASS][120] +2 similar issues
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-kbl6/igt@i915_suspend@fence-restore-untiled.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-kbl7/igt@i915_suspend@fence-restore-untiled.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180:
    - shard-skl:          [DMESG-WARN][121] ([i915#1982]) -> [PASS][122]
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl3/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl10/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-async-flip:
    - {shard-rkl}:        [SKIP][123] ([i915#1845] / [i915#4098]) -> [PASS][124] +24 similar issues
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-async-flip.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-async-flip.html

  * igt@kms_cursor_crc@cursor-suspend@pipe-b-edp-1:
    - shard-skl:          [INCOMPLETE][125] ([i915#4939]) -> [PASS][126] +2 similar issues
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl5/igt@kms_cursor_crc@cursor-suspend@pipe-b-edp-1.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl1/igt@kms_cursor_crc@cursor-suspend@pipe-b-edp-1.html

  * igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions:
    - shard-glk:          [FAIL][127] ([i915#2346]) -> [PASS][128]
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-glk7/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-glk7/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions.html

  * igt@kms_draw_crc@draw-method-rgb565-pwrite-ytiled:
    - {shard-rkl}:        [SKIP][129] ([fdo#111314] / [i915#4098] / [i915#4369]) -> [PASS][130] +5 similar issues
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@kms_draw_crc@draw-method-rgb565-pwrite-ytiled.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_draw_crc@draw-method-rgb565-pwrite-ytiled.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1:
    - shard-skl:          [FAIL][131] ([i915#2122]) -> [PASS][132]
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl5/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl7/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-edp1.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-cpu:
    - shard-iclb:         [FAIL][133] ([i915#1888] / [i915#2546]) -> [PASS][134]
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb8/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-cpu.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-cpu.html

  * igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-mmap-wc:
    - {shard-rkl}:        [SKIP][135] ([i915#1849] / [i915#4098]) -> [PASS][136] +30 similar issues
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-mmap-wc.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-mmap-wc.html

  * igt@kms_plane@plane-panning-bottom-right-suspend@pipe-b-planes:
    - shard-apl:          [DMESG-WARN][137] ([i915#180]) -> [PASS][138] +2 similar issues
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl2/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-b-planes.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl8/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-b-planes.html

  * igt@kms_plane@plane-panning-bottom-right@pipe-b-planes:
    - {shard-rkl}:        [SKIP][139] ([i915#3558]) -> [PASS][140] +1 similar issue
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-1/igt@kms_plane@plane-panning-bottom-right@pipe-b-planes.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_plane@plane-panning-bottom-right@pipe-b-planes.html

  * igt@kms_plane@plane-position-covered@pipe-b-planes:
    - {shard-rkl}:        [SKIP][141] ([i915#1849] / [i915#3558]) -> [PASS][142] +2 similar issues
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@kms_plane@plane-position-covered@pipe-b-planes.html
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_plane@plane-position-covered@pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min:
    - shard-skl:          [FAIL][143] ([fdo#108145] / [i915#265]) -> [PASS][144]
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl1/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl5/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html

  * igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-5@pipe-a-edp-1:
    - shard-iclb:         [SKIP][145] ([i915#5235]) -> [PASS][146] +5 similar issues
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb2/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-5@pipe-a-edp-1.html
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb8/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-5@pipe-a-edp-1.html

  * igt@kms_prime@basic-crc@second-to-first:
    - {shard-rkl}:        [SKIP][147] ([i915#1849]) -> [PASS][148] +1 similar issue
   [147]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@kms_prime@basic-crc@second-to-first.html
   [148]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_prime@basic-crc@second-to-first.html

  * igt@kms_psr@cursor_render:
    - {shard-rkl}:        [SKIP][149] ([i915#1072]) -> [PASS][150] +2 similar issues
   [149]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-1/igt@kms_psr@cursor_render.html
   [150]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_psr@cursor_render.html

  * igt@kms_psr@psr2_cursor_mmap_cpu:
    - shard-iclb:         [SKIP][151] ([fdo#109441]) -> [PASS][152] +2 similar issues
   [151]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb6/igt@kms_psr@psr2_cursor_mmap_cpu.html
   [152]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb2/igt@kms_psr@psr2_cursor_mmap_cpu.html

  * igt@kms_universal_plane@universal-plane-gen9-features-pipe-a:
    - {shard-rkl}:        [SKIP][153] ([i915#1845] / [i915#4070] / [i915#4098]) -> [PASS][154] +1 similar issue
   [153]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-1/igt@kms_universal_plane@universal-plane-gen9-features-pipe-a.html
   [154]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@kms_universal_plane@universal-plane-gen9-features-pipe-a.html

  * igt@perf@blocking:
    - shard-skl:          [FAIL][155] ([i915#1542]) -> [PASS][156]
   [155]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl9/igt@perf@blocking.html
   [156]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl6/igt@perf@blocking.html

  * igt@perf@gen12-mi-rpc:
    - {shard-rkl}:        [SKIP][157] ([fdo#109289]) -> [PASS][158]
   [157]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@perf@gen12-mi-rpc.html
   [158]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@perf@gen12-mi-rpc.html

  * igt@prime_vgem@basic-read:
    - {shard-rkl}:        [SKIP][159] ([fdo#109295] / [i915#3291] / [i915#3708]) -> [PASS][160]
   [159]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-2/igt@prime_vgem@basic-read.html
   [160]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-5/igt@prime_vgem@basic-read.html

  * igt@sysfs_timeslice_duration@timeout@vecs0:
    - {shard-rkl}:        [FAIL][161] ([i915#1755]) -> [PASS][162]
   [161]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-rkl-5/igt@sysfs_timeslice_duration@timeout@vecs0.html
   [162]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-rkl-6/igt@sysfs_timeslice_duration@timeout@vecs0.html

  
#### Warnings ####

  * igt@gem_exec_balancer@parallel-ordering:
    - shard-iclb:         [FAIL][163] ([i915#6117]) -> [SKIP][164] ([i915#4525])
   [163]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb2/igt@gem_exec_balancer@parallel-ordering.html
   [164]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb8/igt@gem_exec_balancer@parallel-ordering.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-tglb:         [FAIL][165] ([i915#2849]) -> [FAIL][166] ([i915#2842])
   [165]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-tglb5/igt@gem_exec_fair@basic-none-solo@rcs0.html
   [166]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-tglb2/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_render_copy@yf-tiled-mc-ccs-to-vebox-y-tiled:
    - shard-skl:          [SKIP][167] ([fdo#109271] / [i915#1888]) -> [SKIP][168] ([fdo#109271])
   [167]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl6/igt@gem_render_copy@yf-tiled-mc-ccs-to-vebox-y-tiled.html
   [168]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@gem_render_copy@yf-tiled-mc-ccs-to-vebox-y-tiled.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-skl:          [INCOMPLETE][169] -> [FAIL][170] ([i915#454])
   [169]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl1/igt@i915_pm_dc@dc6-psr.html
   [170]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl7/igt@i915_pm_dc@dc6-psr.html

  * igt@kms_ccs@pipe-d-random-ccs-data-4_tiled_dg2_mc_ccs:
    - shard-skl:          [SKIP][171] ([fdo#109271]) -> [SKIP][172] ([fdo#109271] / [i915#1888])
   [171]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl6/igt@kms_ccs@pipe-d-random-ccs-data-4_tiled_dg2_mc_ccs.html
   [172]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl2/igt@kms_ccs@pipe-d-random-ccs-data-4_tiled_dg2_mc_ccs.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-fully-sf:
    - shard-iclb:         [SKIP][173] ([i915#658]) -> [SKIP][174] ([i915#2920])
   [173]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb6/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-fully-sf.html
   [174]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb2/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-fully-sf.html

  * igt@kms_psr2_sf@cursor-plane-update-sf:
    - shard-iclb:         [SKIP][175] ([fdo#111068] / [i915#658]) -> [SKIP][176] ([i915#2920])
   [175]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb7/igt@kms_psr2_sf@cursor-plane-update-sf.html
   [176]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb2/igt@kms_psr2_sf@cursor-plane-update-sf.html

  * igt@kms_psr2_sf@overlay-plane-move-continuous-exceed-sf:
    - shard-iclb:         [SKIP][177] ([i915#2920]) -> [SKIP][178] ([i915#658])
   [177]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb2/igt@kms_psr2_sf@overlay-plane-move-continuous-exceed-sf.html
   [178]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb1/igt@kms_psr2_sf@overlay-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area:
    - shard-iclb:         [SKIP][179] ([i915#2920]) -> [SKIP][180] ([fdo#111068] / [i915#658]) +1 similar issue
   [179]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb2/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area.html
   [180]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb6/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area.html

  * igt@kms_psr2_su@page_flip-p010:
    - shard-iclb:         [FAIL][181] ([i915#5939]) -> [SKIP][182] ([fdo#109642] / [fdo#111068] / [i915#658])
   [181]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-iclb2/igt@kms_psr2_su@page_flip-p010.html
   [182]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-iclb8/igt@kms_psr2_su@page_flip-p010.html

  * igt@runner@aborted:
    - shard-skl:          ([FAIL][183], [FAIL][184]) ([i915#3002] / [i915#4312] / [i915#5257]) -> ([FAIL][185], [FAIL][186]) ([i915#2029] / [i915#3002] / [i915#4312] / [i915#5257])
   [183]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl10/igt@runner@aborted.html
   [184]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-skl4/igt@runner@aborted.html
   [185]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl5/igt@runner@aborted.html
   [186]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-skl1/igt@runner@aborted.html
    - shard-apl:          ([FAIL][187], [FAIL][188], [FAIL][189], [FAIL][190], [FAIL][191]) ([fdo#109271] / [i915#180] / [i915#3002] / [i915#4312] / [i915#5257]) -> ([FAIL][192], [FAIL][193], [FAIL][194]) ([i915#3002] / [i915#4312] / [i915#5257])
   [187]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl4/igt@runner@aborted.html
   [188]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl1/igt@runner@aborted.html
   [189]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl2/igt@runner@aborted.html
   [190]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl6/igt@runner@aborted.html
   [191]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11877/shard-apl2/igt@runner@aborted.html
   [192]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl1/igt@runner@aborted.html
   [193]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl4/igt@runner@aborted.html
   [194]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/shard-apl6/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109284]: https://bugs.freedesktop.org/show_bug.cgi?id=109284
  [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289
  [fdo#109291]: https://bugs.freedesktop.org/show_bug.cgi?id=109291
  [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295
  [fdo#109300]: https://bugs.freedesktop.org/show_bug.cgi?id=109300
  [fdo#109308]: https://bugs.freedesktop.org/show_bug.cgi?id=109308
  [fdo#109309]: https://bugs.freedesktop.org/show_bug.cgi?id=109309
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109506]: https://bugs.freedesktop.org/show_bug.cgi?id=109506
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110254]: https://bugs.freedesktop.org/show_bug.cgi?id=110254
  [fdo#110723]: https://bugs.freedesktop.org/show_bug.cgi?id=110723
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111314]: https://bugs.freedesktop.org/show_bug.cgi?id=111314
  [fdo#111614]: https://bugs.freedesktop.org/show_bug.cgi?id=111614
  [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615
  [fdo#111644]: https://bugs.freedesktop.org/show_bug.cgi?id=111644
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#132]: https://gitlab.freedesktop.org/drm/intel/issues/132
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#1542]: https://gitlab.freedesktop.org/drm/intel/issues/1542
  [i915#1755]: https://gitlab.freedesktop.org/drm/intel/issues/1755
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1825]: https://gitlab.freedesktop.org/drm/intel/issues/1825
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#1849]: https://gitlab.freedesktop.org/drm/intel/issues/1849
  [i915#1850]: https://gitlab.freedesktop.org/drm/intel/issues/1850
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2232]: https://gitlab.freedesktop.org/drm/intel/issues/2232
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2410]: https://gitlab.freedesktop.org/drm/intel/issues/2410
  [i915#2411]: https://gitlab.freedesktop.org/drm/intel/issues/2411
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527
  [i915#2530]: https://gitlab.freedesktop.org/drm/intel/issues/2530
  [i915#2546]: https://gitlab.freedesktop.org/drm/intel/issues/2546
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#2658]: https://gitlab.freedesktop.org/drm/intel/issues/2658
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2681]: https://gitlab.freedesktop.org/drm/intel/issues/2681
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2849]: https://gitlab.freedesktop.org/drm/intel/issues/2849
  [i915#2898]: https://gitlab.freedesktop.org/drm/intel/issues/2898
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012
  [i915#3116]: https://gitlab.freedesktop.org/drm/intel/issues/3116
  [i915#3281]: https://gitlab.freedesktop.org/drm/intel/issues/3281
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3297]: https://gitlab.freedesktop.org/drm/intel/issues/3297
  [i915#3299]: https://gitlab.freedesktop.org/drm/intel/issues/3299
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3318]: https://gitlab.freedesktop.org/drm/intel/issues/3318
  [i915#3323]: https://gitlab.freedesktop.org/drm/intel/issues/3323
  [i915#3359]: https://gitlab.freedesktop.org/drm/intel/issues/3359
  [i915#3467]: https://gitlab.freedesktop.org/drm/intel/issues/3467
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3558]: https://gitlab.freedesktop.org/drm/intel/issues/3558
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3638]: https://gitlab.freedesktop.org/drm/intel/issues/3638
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#3734]: https://gitlab.freedesktop.org/drm/intel/issues/3734
  [i915#3743]: https://gitlab.freedesktop.org/drm/intel/issues/3743
  [i915#3778]: https://gitlab.freedesktop.org/drm/intel/issues/3778
  [i915#3810]: https://gitlab.freedesktop.org/drm/intel/issues/3810
  [i915#3828]: https://gitlab.freedesktop.org/drm/intel/issues/3828
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3987]: https://gitlab.freedesktop.org/drm/intel/issues/3987
  [i915#4016]: https://gitlab.freedesktop.org/drm/intel/issues/4016
  [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4281]: https://gitlab.freedesktop.org/drm/intel/issues/4281
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4369]: https://gitlab.freedesktop.org/drm/intel/issues/4369
  [i915#4462]: https://gitlab.freedesktop.org/drm/intel/issues/4462
  [i915#4525]: https://gitlab.freedesktop.org/drm/intel/issues/4525
  [i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4939]: https://gitlab.freedesktop.org/drm/intel/issues/4939
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5257]: https://gitlab.freedesktop.org/drm/intel/issues/5257
  [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286
  [i915#5287]: https://gitlab.freedesktop.org/drm/intel/issues/5287
  [i915#5325]: https://gitlab.freedesktop.org/drm/intel/issues/5325
  [i915#5327]: https://gitlab.freedesktop.org/drm/intel/issues/5327
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#5461]: https://gitlab.freedesktop.org/drm/intel/issues/5461
  [i915#5784]: https://gitlab.freedesktop.org/drm/intel/issues/5784
  [i915#5903]: https://gitlab.freedesktop.org/drm/intel/issues/5903
  [i915#5939]: https://gitlab.freedesktop.org/drm/intel/issues/5939
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6117]: https://gitlab.freedesktop.org/drm/intel/issues/6117
  [i915#6227]: https://gitlab.freedesktop.org/drm/intel/issues/6227
  [i915#6248]: https://gitlab.freedesktop.org/drm/intel/issues/6248
  [i915#6252]: https://gitlab.freedesktop.org/drm/intel/issues/6252
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79


Build changes
-------------

  * Linux: CI_DRM_11877 -> Patchwork_106272v2

  CI-20190529: 20190529
  CI_DRM_11877: e55cefc370de5a38ee848aa96082d9d9f4cacdb9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6578: 7d289d89131ec37c1145bcdb86149914587d7406 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_106272v2: e55cefc370de5a38ee848aa96082d9d9f4cacdb9 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_106272v2/index.html


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/12] drm/i915/guc: Route semaphores to GuC for Gen12+
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
@ 2022-07-15 17:21     ` Ceraolo Spurio, Daniele
  -1 siblings, 0 replies; 56+ messages in thread
From: Ceraolo Spurio, Daniele @ 2022-07-15 17:21 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: Michał Winiarski, DRI-Devel



On 7/12/2022 4:31 PM, John.C.Harrison@Intel.com wrote:
> From: Michał Winiarski <michal.winiarski@intel.com>
>
> Since we're going to use semaphores in selftests (and eventually in
> regular GuC submission), let's route semaphores to GuC.

I'd specify that this interrupt is only relevant for semaphores that 
context switch out when their condition is not satisfied, which is not 
something we currently allow (but we do plan to as you mentioned). Also, 
the routing only happens when in GuC submission mode.
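
To make the register programming concrete, here is a minimal sketch of the
value the quoted patch writes on enable and on disable. The bit definitions
are copied from the patch; the helper name sem_intr_reg_value() and its
parameters are hypothetical and exist only for illustration. It does not
touch hardware, it only models the two code paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout copied from the quoted patch; BIT(31) is spelled out here. */
#define GUC_SEM_INTR_ROUTE_TO_GUC	(UINT32_C(1) << 31)
#define GUC_SEM_INTR_ENABLE_ALL		UINT32_C(0xff)

static_assert((GUC_SEM_INTR_ROUTE_TO_GUC | GUC_SEM_INTR_ENABLE_ALL) ==
	      UINT32_C(0x800000ff), "route bit plus all eight enable bits");

/*
 * Hypothetical helper, not part of i915. Models the value written to
 * GEN12_GUC_SEM_INTR_ENABLES by the submission enable/disable paths.
 * Returns false when the patch would not write the register at all.
 */
static bool sem_intr_reg_value(int graphics_ver, bool enable, uint32_t *val)
{
	if (graphics_ver < 12)
		return false;	/* register is left untouched before Gen12 */

	/* Enable: route to GuC with all sources on. Disable: route to host. */
	*val = enable ? (GUC_SEM_INTR_ROUTE_TO_GUC | GUC_SEM_INTR_ENABLE_ALL) : 0;
	return true;
}
```

Since the write sits in intel_guc_submission_enable(), the routing is only
set up when GuC submission is in use, which matches the note above.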

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele

>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h        |  4 ++++
>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 14 ++++++++++++++
>   2 files changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> index 8dc063f087eb1..a7092f711e9cd 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h
> @@ -102,6 +102,10 @@
>   #define   GUC_SEND_TRIGGER		  (1<<0)
>   #define GEN11_GUC_HOST_INTERRUPT	_MMIO(0x1901f0)
>   
> +#define GEN12_GUC_SEM_INTR_ENABLES	_MMIO(0xc71c)
> +#define   GUC_SEM_INTR_ROUTE_TO_GUC	BIT(31)
> +#define   GUC_SEM_INTR_ENABLE_ALL	(0xff)
> +
>   #define GUC_NUM_DOORBELLS		256
>   
>   /* format of the HW-monitored doorbell cacheline */
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 40f726c61e951..7537459080278 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3953,13 +3953,27 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine)
>   
>   void intel_guc_submission_enable(struct intel_guc *guc)
>   {
> +	struct intel_gt *gt = guc_to_gt(guc);
> +
> +	/* Enable and route to GuC */
> +	if (GRAPHICS_VER(gt->i915) >= 12)
> +		intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES,
> +				   GUC_SEM_INTR_ROUTE_TO_GUC |
> +				   GUC_SEM_INTR_ENABLE_ALL);
> +
>   	guc_init_lrc_mapping(guc);
>   	guc_init_engine_stats(guc);
>   }
>   
>   void intel_guc_submission_disable(struct intel_guc *guc)
>   {
> +	struct intel_gt *gt = guc_to_gt(guc);
> +
>   	/* Note: By the time we're here, GuC may have already been reset */
> +
> +	/* Disable and route to host */
> +	if (GRAPHICS_VER(gt->i915) >= 12)
> +		intel_uncore_write(gt->uncore, GEN12_GUC_SEM_INTR_ENABLES, 0x0);
>   }
>   
>   static bool __guc_submission_supported(struct intel_guc *guc)


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
  (?)
@ 2022-07-18 12:15   ` Tvrtko Ursulin
  2022-07-19  0:05     ` John Harrison
  -1 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-18 12:15 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel


On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
> From: Matthew Brost <matthew.brost@intel.com>
> 
> Remove bogus GEM_BUG_ON which compared kernel context timeline seqno to
> seqno in memory on engine PM unpark. If a GT reset occurred these values
> might not match as a kernel context could be skipped. This bug was
> hidden by always switching to a kernel context on park (execlists
> requirement).

Reset of the kernel context? Under which circumstances does that happen?

It is unclear whether the claim is that this is a general problem or that 
the assert is only invalid with the GuC. The lack of a CI-reported issue 
suggests it is not a generic problem?

Regards,

Tvrtko

> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index b0a4a2dbe3ee9..fb3e1599d04ec 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
>   			 ce->timeline->seqno,
>   			 READ_ONCE(*ce->timeline->hwsp_seqno),
>   			 ce->ring->emit);
> -		GEM_BUG_ON(ce->timeline->seqno !=
> -			   READ_ONCE(*ce->timeline->hwsp_seqno));
>   	}
>   
>   	if (engine->unpark)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Intel-gfx] [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
  (?)
@ 2022-07-18 12:26   ` Tvrtko Ursulin
  2022-07-19  0:09     ` John Harrison
  -1 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-18 12:26 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel


On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
> From: Matthew Brost <matthew.brost@intel.com>
> 
> The engine registers really shouldn't be touched during GuC submission
> as the GuC owns the registers. Don't call ring_is_idle and tie

"Touch" here being just a read; is that somehow harmful?

> intel_engine_is_idle strictly to the engine pm.

"Strictly" seems wrong - it is just the ring_is_idle check that is 
replaced, not the whole implementation of intel_engine_is_idle.

> Because intel_engine_is_idle is tied to the engine pm, retire requests
> before checking intel_engines_are_idle in gt_drop_caches, and lastly

Why is the re-ordering important? I at least can't understand it. I hope 
it's not working around IGT failures.

> increase the timeout in gt_drop_caches for the intel_engines_are_idle
> check.

Same here - why?
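
For reference, the decision order being discussed can be sketched as
below. This is a simplified model, not the driver code: struct
engine_state and engine_is_idle() are hypothetical names, the real
function performs further checks, and the early "PM not awake means idle"
step is an assumption based on the commit message rather than on the
quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the state the idle check looks at. */
struct engine_state {
	bool pm_awake;		/* engine PM wakeref currently held */
	bool sched_empty;	/* nothing queued in the scheduler */
	bool uses_guc;		/* GuC submission mode */
	bool ring_idle;		/* what an MMIO HEAD/TAIL read would report */
};

static bool engine_is_idle(const struct engine_state *e)
{
	/* Assumed early exit: a parked engine counts as idle. */
	if (!e->pm_awake)
		return true;

	if (!e->sched_empty)
		return false;

	/*
	 * With GuC submission the registers are not read; an awake engine
	 * is conservatively reported as busy, as in the quoted patch.
	 */
	if (e->uses_guc)
		return false;

	/* Execlists path: fall back to the ring check. */
	return e->ring_idle;
}
```

In this model a GuC engine only reports idle once it has parked, which
is one plausible reading of why the patch retires requests before the
wait and lengthens the timeout.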

Regards,

Tvrtko

> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++
>   drivers/gpu/drm/i915/i915_debugfs.c       |  6 +++---
>   drivers/gpu/drm/i915/i915_drv.h           |  2 +-
>   3 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 283870c659911..959a7c92e8f4d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
>   {
>   	bool idle = true;
>   
> +	/* GuC submission shouldn't access HEAD & TAIL via MMIO */
> +	GEM_BUG_ON(intel_engine_uses_guc(engine));
> +
>   	if (I915_SELFTEST_ONLY(!engine->mmio_base))
>   		return true;
>   
> @@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>   	if (!i915_sched_engine_is_empty(engine->sched_engine))
>   		return false;
>   
> +	/*
> +	 * We shouldn't touch engine registers with GuC submission as the GuC
> +	 * owns the registers. Let's tie the idle to engine pm, at worst this
> +	 * function sometimes will falsely report non-idle when idle during the
> +	 * delay to retire requests or with virtual engines and a request
> +	 * running on another instance within the same class / submit mask.
> +	 */
> +	if (intel_engine_uses_guc(engine))
> +		return false;
> +
>   	/* Ring stopped? */
>   	return ring_is_idle(engine);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 94e5c29d2ee3a..ee5334840e9cb 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
>   {
>   	int ret;
>   
> +	if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE)
> +		intel_gt_retire_requests(gt);
> +
>   	if (val & DROP_RESET_ACTIVE &&
>   	    wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT))
>   		intel_gt_set_wedged(gt);
>   
> -	if (val & DROP_RETIRE)
> -		intel_gt_retire_requests(gt);
> -
>   	if (val & (DROP_IDLE | DROP_ACTIVE)) {
>   		ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
>   		if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c22f29c3faa0e..53c7474dde495 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -278,7 +278,7 @@ struct i915_gem_mm {
>   	u32 shrink_count;
>   };
>   
> -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
> +#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */
>   
>   unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915,
>   					 u64 context);

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Intel-gfx] [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
  (?)
@ 2022-07-18 12:35   ` Tvrtko Ursulin
  2022-07-19  0:13     ` John Harrison
  -1 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-18 12:35 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel


On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
> From: Matthew Brost <matthew.brost@intel.com>
> 
> The GuC needs a copy of a golden context for implementing watchdog
> resets (aka media resets). This context is larger on newer platforms.
> So adjust the size being allocated/copied accordingly.

What were the consequences of this being too small? Was the media 
watchdog reset broken in a way that impacts userspace? On which 
platforms? Do we have an IGT testcase? Do we need a Fixes: tag? Should 
it be copied to stable?
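
As a worked illustration of the size difference, the sketch below
evaluates the skip size for both cases. The 80 and 96 dword counts and
the 12.50 cut-off come from the quoted patch; the 4 KiB page size, the
single PPHWSP page (standing in for LRC_PPHWSP_SZ) and the helper name
lrc_skip_size() are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES		4096u	/* assumption: 4 KiB pages */
#define LRC_PPHWSP_PAGES	1u	/* assumption: value of LRC_PPHWSP_SZ */

#define LR_HW_CONTEXT_SIZE	(80 * sizeof(uint32_t))
#define XEHP_LR_HW_CONTEXT_SIZE	(96 * sizeof(uint32_t))

static_assert(XEHP_LR_HW_CONTEXT_SIZE - LR_HW_CONTEXT_SIZE == 64,
	      "the larger context adds 16 dwords, i.e. 64 bytes");

/* Hypothetical stand-in for LRC_SKIP_SIZE(i915); takes version.release. */
static size_t lrc_skip_size(unsigned int ver, unsigned int rel)
{
	/* Same ordering as GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50). */
	bool xehp = ver > 12 || (ver == 12 && rel >= 50);

	return LRC_PPHWSP_PAGES * PAGE_SIZE_BYTES +
	       (xehp ? XEHP_LR_HW_CONTEXT_SIZE : LR_HW_CONTEXT_SIZE);
}
```

Under these assumptions the skipped prefix grows from 4416 to 4480
bytes, so eng_state_size computed with the old constant would be 64
bytes too large on 12.50+ hardware.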

Regards,

Tvrtko

> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index ba7541f3ca610..74cbe8eaf5318 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -464,7 +464,11 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
>   }
>   
>   #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
> -#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
> +#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32))
> +#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50) ? \
> +				    XEHP_LR_HW_CONTEXT_SIZE : \
> +				    LR_HW_CONTEXT_SIZE)
> +#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SZ(i915))
>   static int guc_prep_golden_context(struct intel_guc *guc)
>   {
>   	struct intel_gt *gt = guc_to_gt(guc);
> @@ -525,7 +529,7 @@ static int guc_prep_golden_context(struct intel_guc *guc)
>   		 * on all engines).
>   		 */
>   		ads_blob_write(guc, ads.eng_state_size[guc_class],
> -			       real_size - LRC_SKIP_SIZE);
> +			       real_size - LRC_SKIP_SIZE(gt->i915));
>   		ads_blob_write(guc, ads.golden_context_lrca[guc_class],
>   			       addr_ggtt);
>   
> @@ -599,7 +603,7 @@ static void guc_init_golden_context(struct intel_guc *guc)
>   		}
>   
>   		GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
> -			   real_size - LRC_SKIP_SIZE);
> +			   real_size - LRC_SKIP_SIZE(gt->i915));
>   		GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt);
>   
>   		addr_ggtt += alloc_size;

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Intel-gfx] [PATCH 11/12] drm/i915/guc: Don't abort on CTB_UNUSED status
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
  (?)
@ 2022-07-18 12:36   ` Tvrtko Ursulin
  2022-07-19  0:16     ` John Harrison
  -1 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-18 12:36 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel


On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> When the KMD sends a CLIENT_RESET request to GuC (as part of the
> suspend sequence), GuC will mark the CTB buffer as 'UNUSED'. If the
> KMD then checked the CTB queue, it would see a non-zero status value
> and report the buffer as corrupted.
> 
> Technically, no G2H messages should be received once the CLIENT_RESET
> has been sent. However, if a context was outstanding on an engine then
> it would get reset and a reset notification would be sent. So, don't
> actually treat UNUSED as a catastrophic error. Just flag it up as
> unexpected and keep going.

Does it need a Fixes: tag?
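
The status handling in the quoted ct_read() hunk can be sketched in
isolation as follows. The status bits are taken from the patch;
ctb_status_is_fatal() is a hypothetical helper written for this
example, and the logging done by CT_ERROR() is reduced to an output
flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status bits as defined in the quoted ABI header change. */
#define GUC_CTB_STATUS_NO_ERROR		0u
#define GUC_CTB_STATUS_OVERFLOW		(1u << 0)
#define GUC_CTB_STATUS_UNDERFLOW	(1u << 1)
#define GUC_CTB_STATUS_MISMATCH		(1u << 2)
#define GUC_CTB_STATUS_UNUSED		(1u << 3)

/*
 * Hypothetical helper: returns true when the descriptor status should
 * be treated as corruption. *saw_unused reports whether the UNUSED bit
 * was set, which the patch logs as unexpected but does not abort on.
 */
static bool ctb_status_is_fatal(uint32_t status, bool *saw_unused)
{
	*saw_unused = (status & GUC_CTB_STATUS_UNUSED) != 0;

	/* Mask out UNUSED; any remaining bit is still a real error. */
	status &= ~GUC_CTB_STATUS_UNUSED;

	return status != GUC_CTB_STATUS_NO_ERROR;
}
```

Note that UNUSED combined with any other bit still takes the corrupted
path; only UNUSED on its own is tolerated.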

Regards,

Tvrtko

> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   .../i915/gt/uc/abi/guc_communication_ctb_abi.h |  8 +++++---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c      | 18 ++++++++++++++++--
>   2 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> index df83c1cc7c7a6..28b8387f97b77 100644
> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> @@ -37,6 +37,7 @@
>    *  |   |       |   - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too large)     |
>    *  |   |       |   - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated message)      |
>    *  |   |       |   - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail modified)      |
> + *  |   |       |   - _`GUC_CTB_STATUS_UNUSED` = 8 (CTB is not in use)         |
>    *  +---+-------+--------------------------------------------------------------+
>    *  |...|       | RESERVED = MBZ                                               |
>    *  +---+-------+--------------------------------------------------------------+
> @@ -49,9 +50,10 @@ struct guc_ct_buffer_desc {
>   	u32 tail;
>   	u32 status;
>   #define GUC_CTB_STATUS_NO_ERROR				0
> -#define GUC_CTB_STATUS_OVERFLOW				(1 << 0)
> -#define GUC_CTB_STATUS_UNDERFLOW			(1 << 1)
> -#define GUC_CTB_STATUS_MISMATCH				(1 << 2)
> +#define GUC_CTB_STATUS_OVERFLOW				BIT(0)
> +#define GUC_CTB_STATUS_UNDERFLOW			BIT(1)
> +#define GUC_CTB_STATUS_MISMATCH				BIT(2)
> +#define GUC_CTB_STATUS_UNUSED				BIT(3)
>   	u32 reserved[13];
>   } __packed;
>   static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index f01325cd1b625..11b5d4ddb19ce 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -816,8 +816,22 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   	if (unlikely(ctb->broken))
>   		return -EPIPE;
>   
> -	if (unlikely(desc->status))
> -		goto corrupted;
> +	if (unlikely(desc->status)) {
> +		u32 status = desc->status;
> +
> +		if (status & GUC_CTB_STATUS_UNUSED) {
> +			/*
> +			 * Potentially valid if a CLIENT_RESET request resulted in
> +			 * contexts/engines being reset. But should never happen as
> +			 * no contexts should be active when CLIENT_RESET is sent.
> +			 */
> +			CT_ERROR(ct, "Unexpected G2H after GuC has stopped!\n");
> +			status &= ~GUC_CTB_STATUS_UNUSED;
> +		}
> +
> +		if (status)
> +			goto corrupted;
> +	}
>   
>   	GEM_BUG_ON(head > size);
>   

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-18 12:15   ` Tvrtko Ursulin
@ 2022-07-19  0:05     ` John Harrison
  2022-07-19  9:42       ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: John Harrison @ 2022-07-19  0:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 7/18/2022 05:15, Tvrtko Ursulin wrote:
>
> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>> From: Matthew Brost <matthew.brost@intel.com>
>>
>> Remove bogus GEM_BUG_ON which compared kernel context timeline seqno to
>> seqno in memory on engine PM unpark. If a GT reset occurred these values
>> might not match as a kernel context could be skipped. This bug was
>> hidden by always switching to a kernel context on park (execlists
>> requirement).
>
> Reset of the kernel context? Under which circumstances does that happen?
As per the description, the issue is with a full GT reset.

>
> It is unclear if the claim is this to be a general problem or the 
> assert is only invalid with the GuC. Lack of a CI reported issue 
> suggests it is not a generic problem?
Currently it is not an issue because we always switch to the kernel 
context because that's how execlists works and the entire driver is 
fundamentally based on execlist operation. When we stop using the kernel 
context as a (non-functional) barrier when using GuC submission, then 
you would see an issue without this fix.

John.


>
> Regards,
>
> Tvrtko
>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>> index b0a4a2dbe3ee9..fb3e1599d04ec 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>> @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
>>                ce->timeline->seqno,
>>                READ_ONCE(*ce->timeline->hwsp_seqno),
>>                ce->ring->emit);
>> -        GEM_BUG_ON(ce->timeline->seqno !=
>> -               READ_ONCE(*ce->timeline->hwsp_seqno));
>>       }
>>         if (engine->unpark)



* Re: [Intel-gfx] [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission
  2022-07-18 12:26   ` Tvrtko Ursulin
@ 2022-07-19  0:09     ` John Harrison
  2022-07-19  9:49       ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: John Harrison @ 2022-07-19  0:09 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX, Matthew Brost; +Cc: DRI-Devel

On 7/18/2022 05:26, Tvrtko Ursulin wrote:
>
> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>> From: Matthew Brost <matthew.brost@intel.com>
>>
>> The engine registers really shouldn't be touched during GuC submission
>> as the GuC owns the registers. Don't call ring_is_idle and tie
>
> Touch being just read and it is somehow harmful?
The registers are meaningless when GuC is controlling the submission. 
The i915 driver has no knowledge of what context is or is not executing 
on any given engine at any given time. So reading the ring 
registers is incorrect - it can lead to bad assumptions about what state 
the hardware is in.

>
>> intel_engine_is_idle strictly to the engine pm.
>
> Strictly seems wrong - it is just ring_is_idle check that is replaced 
> and not the whole implementation of intel_engine_is_idle.
>
>> Because intel_engine_is_idle tied to the engine pm, retire requests
>> before checking intel_engines_are_idle in gt_drop_caches, and lastly
> Why is re-ordering important? I at least can't understand it. I hope 
> it's not working around IGT failures.
If requests are physically completed but not retired then they will be 
holding unnecessary PM references. So we need to flush those out before 
checking for idle.
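
As a rough illustration of the ordering argument, here is a toy model (not 
i915 code). It assumes that every request which has not yet been retired 
holds one engine PM reference and that, with GuC submission, "idle" simply 
means "no PM references held". The toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: each unretired request is assumed to hold one PM reference. */
struct toy_engine {
	unsigned int running;	/* requests still executing on the hardware */
	unsigned int completed;	/* finished on the hardware but not yet retired */
};

/* Idle is tied to engine PM: any outstanding reference means "not idle". */
static bool toy_engine_is_idle(const struct toy_engine *e)
{
	assert(e);
	return e->running + e->completed == 0;
}

/* Retiring drops the references held by requests that have already completed. */
static void toy_retire_requests(struct toy_engine *e)
{
	assert(e);
	e->completed = 0;
}
```

In this model an engine with three completed but unretired requests reports 
non-idle until toy_retire_requests() has run, which is why the retire is 
moved ahead of the idle check. An engine with a request still running stays 
non-idle either way.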

>
>> increase the timeout in gt_drop_caches for the intel_engines_are_idle
>> check.
>
> Same here - why?
@Matthew Brost - do you recall which particular tests were hitting an 
issue? I'm guessing gem_ctx_create? I believe that's the one that 
creates and destroys thousands of contexts. That is much slower with GuC 
(GuC communication required) than with execlists (i915 internal state 
change only).

John.



>
> Regards,
>
> Tvrtko
>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++
>>   drivers/gpu/drm/i915/i915_debugfs.c       |  6 +++---
>>   drivers/gpu/drm/i915/i915_drv.h           |  2 +-
>>   3 files changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> index 283870c659911..959a7c92e8f4d 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> @@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs 
>> *engine)
>>   {
>>       bool idle = true;
>>   +    /* GuC submission shouldn't access HEAD & TAIL via MMIO */
>> +    GEM_BUG_ON(intel_engine_uses_guc(engine));
>> +
>>       if (I915_SELFTEST_ONLY(!engine->mmio_base))
>>           return true;
>>   @@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct 
>> intel_engine_cs *engine)
>>       if (!i915_sched_engine_is_empty(engine->sched_engine))
>>           return false;
>>   +    /*
>> +     * We shouldn't touch engine registers with GuC submission as 
>> the GuC
>> +     * owns the registers. Let's tie the idle to engine pm, at worst 
>> this
>> +     * function sometimes will falsely report non-idle when idle 
>> during the
>> +     * delay to retire requests or with virtual engines and a request
>> +     * running on another instance within the same class / submit mask.
>> +     */
>> +    if (intel_engine_uses_guc(engine))
>> +        return false;
>> +
>>       /* Ring stopped? */
>>       return ring_is_idle(engine);
>>   }
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>> b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 94e5c29d2ee3a..ee5334840e9cb 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
>>   {
>>       int ret;
>>   +    if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE)
>> +        intel_gt_retire_requests(gt);
>> +
>>       if (val & DROP_RESET_ACTIVE &&
>>           wait_for(intel_engines_are_idle(gt), 
>> I915_IDLE_ENGINES_TIMEOUT))
>>           intel_gt_set_wedged(gt);
>>   -    if (val & DROP_RETIRE)
>> -        intel_gt_retire_requests(gt);
>> -
>>       if (val & (DROP_IDLE | DROP_ACTIVE)) {
>>           ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
>>           if (ret)
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index c22f29c3faa0e..53c7474dde495 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -278,7 +278,7 @@ struct i915_gem_mm {
>>       u32 shrink_count;
>>   };
>>   -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
>> +#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */
>>     unsigned long i915_fence_context_timeout(const struct 
>> drm_i915_private *i915,
>>                        u64 context);



* Re: [Intel-gfx] [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware
  2022-07-18 12:35   ` Tvrtko Ursulin
@ 2022-07-19  0:13     ` John Harrison
  2022-07-19  9:56       ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: John Harrison @ 2022-07-19  0:13 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>
> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>> From: Matthew Brost <matthew.brost@intel.com>
>>
>> The GuC needs a copy of a golden context for implementing watchdog
>> resets (aka media resets). This context is larger on newer platforms.
>> So adjust the size being allocated/copied accordingly.
>
> What were the consequences of this being too small? Media watchdog 
> reset broken impacting userspace? Platforms? Do we have an IGT 
> testcase? Do we need a Fixes: tag? Copy stable?
Yes. Not sure if we have an IGT for the media watchdog. I recall writing 
something a long time back but I don't think it ever got merged due to 
push back that I don't recall right now. And no because it only affects 
DG2 onwards which is still forceprobed.
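
For reference, the size arithmetic from the patch can be sketched in 
isolation as below. The dword counts (80 and 96) and the 12.50 version 
threshold come from the patch. The 4 KiB page size, the one-page PPHWSP and 
the IP_VER encoding are assumptions made for the sketch, and the SKETCH_* 
and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages and a one-page PPHWSP. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_LRC_PPHWSP_SZ	1u

/* Dword counts taken from the patch. */
#define LR_HW_CONTEXT_SIZE	(80 * sizeof(uint32_t))
#define XEHP_LR_HW_CONTEXT_SIZE	(96 * sizeof(uint32_t))

/* Assumed encoding: major version in the high byte, release in the low byte. */
#define SKETCH_IP_VER(ver, rel)	(((unsigned int)(ver) << 8) | (unsigned int)(rel))

/* Bytes at the start of the context image that are skipped. */
static size_t lrc_skip_size(unsigned int graphics_ver_full)
{
	size_t hw_ctx = graphics_ver_full >= SKETCH_IP_VER(12, 50) ?
			XEHP_LR_HW_CONTEXT_SIZE : LR_HW_CONTEXT_SIZE;

	return SKETCH_LRC_PPHWSP_SZ * SKETCH_PAGE_SIZE + hw_ctx;
}

/* Engine state size as written to the ADS: real size minus the skipped part. */
static size_t eng_state_size(size_t real_size, unsigned int graphics_ver_full)
{
	size_t skip = lrc_skip_size(graphics_ver_full);

	assert(real_size >= skip);
	return real_size - skip;
}
```

Under these assumptions the skip size is 4416 bytes before 12.50 and 4480 
bytes from 12.50 onwards, so the two differ by 16 dwords (64 bytes).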

John.


>
> Regards,
>
> Tvrtko
>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>> index ba7541f3ca610..74cbe8eaf5318 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>> @@ -464,7 +464,11 @@ static void fill_engine_enable_masks(struct 
>> intel_gt *gt,
>>   }
>>     #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
>> -#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
>> +#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32))
>> +#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= 
>> IP_VER(12, 50) ? \
>> +                    XEHP_LR_HW_CONTEXT_SIZE : \
>> +                    LR_HW_CONTEXT_SIZE)
>> +#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + 
>> LR_HW_CONTEXT_SZ(i915))
>>   static int guc_prep_golden_context(struct intel_guc *guc)
>>   {
>>       struct intel_gt *gt = guc_to_gt(guc);
>> @@ -525,7 +529,7 @@ static int guc_prep_golden_context(struct 
>> intel_guc *guc)
>>            * on all engines).
>>            */
>>           ads_blob_write(guc, ads.eng_state_size[guc_class],
>> -                   real_size - LRC_SKIP_SIZE);
>> +                   real_size - LRC_SKIP_SIZE(gt->i915));
>>           ads_blob_write(guc, ads.golden_context_lrca[guc_class],
>>                      addr_ggtt);
>>   @@ -599,7 +603,7 @@ static void guc_init_golden_context(struct 
>> intel_guc *guc)
>>           }
>>             GEM_BUG_ON(ads_blob_read(guc, 
>> ads.eng_state_size[guc_class]) !=
>> -               real_size - LRC_SKIP_SIZE);
>> +               real_size - LRC_SKIP_SIZE(gt->i915));
>>           GEM_BUG_ON(ads_blob_read(guc, 
>> ads.golden_context_lrca[guc_class]) != addr_ggtt);
>>             addr_ggtt += alloc_size;



* Re: [Intel-gfx] [PATCH 11/12] drm/i915/guc: Don't abort on CTB_UNUSED status
  2022-07-18 12:36   ` Tvrtko Ursulin
@ 2022-07-19  0:16     ` John Harrison
  0 siblings, 0 replies; 56+ messages in thread
From: John Harrison @ 2022-07-19  0:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 7/18/2022 05:36, Tvrtko Ursulin wrote:
> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> When the KMD sends a CLIENT_RESET request to GuC (as part of the
>> suspend sequence), GuC will mark the CTB buffer as 'UNUSED'. If the
>> KMD then checked the CTB queue, it would see a non-zero status value
>> and report the buffer as corrupted.
>>
>> Technically, no G2H messages should be received once the CLIENT_RESET
>> has been sent. However, if a context was outstanding on an engine then
>> it would get reset and a reset notification would be sent. So, don't
>> actually treat UNUSED as a catastrophic error. Just flag it up as
>> unexpected and keep going.
>
> Does it need a Fixes: tag?
Not really. It's a theoretical hole only. It was exposed during POC work 
for a nasty w/a that ultimately didn't need to go forwards. The POC was 
generating fake interrupts and causing us to check the CTB channel after 
the CLIENT_RESET had been processed. We have never had an actual 
instance of an outstanding request during CLIENT_RESET. That would be a 
much more serious bug - trying to suspend while an engine is still 
processing a request.
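
The status handling in the patch boils down to masking out the UNUSED bit 
and treating anything left over as corruption. A standalone sketch of that 
logic follows. The bit values match the patch, while ctb_status_is_fatal() 
is a hypothetical helper written for illustration and is not a function in 
the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status bits as defined in the patch, with BIT(n) expanded by hand. */
#define GUC_CTB_STATUS_NO_ERROR		0u
#define GUC_CTB_STATUS_OVERFLOW		(1u << 0)
#define GUC_CTB_STATUS_UNDERFLOW	(1u << 1)
#define GUC_CTB_STATUS_MISMATCH		(1u << 2)
#define GUC_CTB_STATUS_UNUSED		(1u << 3)

/*
 * Hypothetical helper: returns true if the descriptor status should be
 * treated as corruption. *warn is set when the UNUSED bit was seen, which
 * is the point where the patch logs a CT_ERROR and carries on.
 */
static bool ctb_status_is_fatal(uint32_t status, bool *warn)
{
	assert(warn);

	*warn = (status & GUC_CTB_STATUS_UNUSED) != 0;
	status &= ~GUC_CTB_STATUS_UNUSED;

	return status != 0;
}
```

So a status of UNUSED alone produces a warning but is not fatal, whereas 
UNUSED combined with, say, OVERFLOW still takes the corrupted path.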

John.


>
> Regards,
>
> Tvrtko
>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   .../i915/gt/uc/abi/guc_communication_ctb_abi.h |  8 +++++---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c      | 18 ++++++++++++++++--
>>   2 files changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git 
>> a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h 
>> b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> index df83c1cc7c7a6..28b8387f97b77 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
>> @@ -37,6 +37,7 @@
>>    *  |   |       |   - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too 
>> large)     |
>>    *  |   |       |   - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated 
>> message)      |
>>    *  |   |       |   - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail 
>> modified)      |
>> + *  |   |       |   - _`GUC_CTB_STATUS_UNUSED` = 8 (CTB is not in 
>> use)         |
>>    * 
>> +---+-------+--------------------------------------------------------------+
>>    *  |...|       | RESERVED = 
>> MBZ                                               |
>>    * 
>> +---+-------+--------------------------------------------------------------+
>> @@ -49,9 +50,10 @@ struct guc_ct_buffer_desc {
>>       u32 tail;
>>       u32 status;
>>   #define GUC_CTB_STATUS_NO_ERROR                0
>> -#define GUC_CTB_STATUS_OVERFLOW                (1 << 0)
>> -#define GUC_CTB_STATUS_UNDERFLOW            (1 << 1)
>> -#define GUC_CTB_STATUS_MISMATCH                (1 << 2)
>> +#define GUC_CTB_STATUS_OVERFLOW                BIT(0)
>> +#define GUC_CTB_STATUS_UNDERFLOW            BIT(1)
>> +#define GUC_CTB_STATUS_MISMATCH                BIT(2)
>> +#define GUC_CTB_STATUS_UNUSED                BIT(3)
>>       u32 reserved[13];
>>   } __packed;
>>   static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> index f01325cd1b625..11b5d4ddb19ce 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> @@ -816,8 +816,22 @@ static int ct_read(struct intel_guc_ct *ct, 
>> struct ct_incoming_msg **msg)
>>       if (unlikely(ctb->broken))
>>           return -EPIPE;
>>   -    if (unlikely(desc->status))
>> -        goto corrupted;
>> +    if (unlikely(desc->status)) {
>> +        u32 status = desc->status;
>> +
>> +        if (status & GUC_CTB_STATUS_UNUSED) {
>> +            /*
>> +             * Potentially valid if a CLIENT_RESET request resulted in
>> +             * contexts/engines being reset. But should never happen as
>> +             * no contexts should be active when CLIENT_RESET is sent.
>> +             */
>> +            CT_ERROR(ct, "Unexpected G2H after GuC has stopped!\n");
>> +            status &= ~GUC_CTB_STATUS_UNUSED;
>> +        }
>> +
>> +        if (status)
>> +            goto corrupted;
>> +    }
>>         GEM_BUG_ON(head > size);



* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-19  0:05     ` John Harrison
@ 2022-07-19  9:42       ` Tvrtko Ursulin
  2022-07-21  0:54         ` John Harrison
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-19  9:42 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 19/07/2022 01:05, John Harrison wrote:
> On 7/18/2022 05:15, Tvrtko Ursulin wrote:
>>
>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>> From: Matthew Brost <matthew.brost@intel.com>
>>>
>>> Remove bogus GEM_BUG_ON which compared kernel context timeline seqno to
>>> seqno in memory on engine PM unpark. If a GT reset occurred these values
>>> might not match as a kernel context could be skipped. This bug was
>>> hidden by always switching to a kernel context on park (execlists
>>> requirement).
>>
>> Reset of the kernel context? Under which circumstances does that happen?
> As per description, the issue is with full GT reset.
> 
>>
>> It is unclear if the claim is this to be a general problem or the 
>> assert is only invalid with the GuC. Lack of a CI reported issue 
>> suggests it is not a generic problem?
> Currently it is not an issue because we always switch to the kernel 
> context because that's how execlists works and the entire driver is 
> fundamentally based on execlist operation. When we stop using the kernel 
> context as a (non-functional) barrier when using GuC submission, then 
> you would see an issue without this fix.

Is the issue with GuC, with GuC and full reset, or with full reset 
regardless of the backend?

If the issue is only with GuC, the patch should have the drm/i915/guc 
prefix as a minimum. But if it actually only becomes a problem when the GuC 
backend stops parking with the kernel context, then I think the whole 
unpark code should be refactored in a cleaner way than just removing the 
one assert. Otherwise what is the point of leaving everything else in there?

Or if the issue is backend agnostic, *if* a full reset happens to hit 
during parking, then it is different. Wouldn't that be a race between 
parking and reset, which probably shouldn't happen to start with?

Regards,

Tvrtko

> 
> John.
> 
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
>>>   1 file changed, 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> index b0a4a2dbe3ee9..fb3e1599d04ec 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
>>>                ce->timeline->seqno,
>>>                READ_ONCE(*ce->timeline->hwsp_seqno),
>>>                ce->ring->emit);
>>> -        GEM_BUG_ON(ce->timeline->seqno !=
>>> -               READ_ONCE(*ce->timeline->hwsp_seqno));
>>>       }
>>>         if (engine->unpark)
> 


* Re: [Intel-gfx] [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission
  2022-07-19  0:09     ` John Harrison
@ 2022-07-19  9:49       ` Tvrtko Ursulin
  2022-07-19 10:14         ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-19  9:49 UTC (permalink / raw)
  To: John Harrison, Intel-GFX, Matthew Brost; +Cc: DRI-Devel


On 19/07/2022 01:09, John Harrison wrote:
> On 7/18/2022 05:26, Tvrtko Ursulin wrote:
>>
>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>> From: Matthew Brost <matthew.brost@intel.com>
>>>
>>> The engine registers really shouldn't be touched during GuC submission
>>> as the GuC owns the registers. Don't call ring_is_idle and tie
>>
>> Touch being just read and it is somehow harmful?
> The registers are meaningless when GuC is controlling the submission. 
> The i915 driver has no knowledge of what context is or is not executing 
> on any given engine at any given time. So reading the ring 
> registers is incorrect - it can lead to bad assumptions about what state 
> the hardware is in.

Same is actually true with the execlists backend. The code in 
ring_is_idle is not concerning itself with which context is running or 
not. Just that the head/tail/ctl appear idle.

The problem/motivation appears to be at a higher level than simply the ring 
registers.

I am not claiming it makes sense with GuC and that it has to remain, but 
just suggesting a clearer commit message as a minimum.

>>> intel_engine_is_idle strictly to the engine pm.
>>
>> Strictly seems wrong - it is just ring_is_idle check that is replaced 
>> and not the whole implementation of intel_engine_is_idle.
>>
>>> Because intel_engine_is_idle tied to the engine pm, retire requests
>>> before checking intel_engines_are_idle in gt_drop_caches, and lastly
>> Why is re-ordering important? I at least can't understand it. I hope 
>> it's not working around IGT failures.
> If requests are physically completed but not retired then they will be 
> holding unnecessary PM references. So we need to flush those out before 
> checking for idle.

And if they are not as someone passes in DROP_RESET_ACTIVE? They will 
not retire and code still enters intel_engines_are_idle so that has to 
work, no? Something does not align for me still.

>>> increase the timeout in gt_drop_caches for the intel_engines_are_idle
>>> check.
>>
>> Same here - why?
> @Matthew Brost - do you recall which particular tests were hitting an 
> issue? I'm guessing gem_ctx_create? I believe that's the one that 
> creates and destroys thousands of contexts. That is much slower with GuC 
> (GuC communication required) than with execlists (i915 internal state 
> change only).

And if that is a logically separate change please split the patch up.

Regards,

Tvrtko

> 
> John.
> 
> 
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++
>>>   drivers/gpu/drm/i915/i915_debugfs.c       |  6 +++---
>>>   drivers/gpu/drm/i915/i915_drv.h           |  2 +-
>>>   3 files changed, 17 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index 283870c659911..959a7c92e8f4d 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs 
>>> *engine)
>>>   {
>>>       bool idle = true;
>>>   +    /* GuC submission shouldn't access HEAD & TAIL via MMIO */
>>> +    GEM_BUG_ON(intel_engine_uses_guc(engine));
>>> +
>>>       if (I915_SELFTEST_ONLY(!engine->mmio_base))
>>>           return true;
>>>   @@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct 
>>> intel_engine_cs *engine)
>>>       if (!i915_sched_engine_is_empty(engine->sched_engine))
>>>           return false;
>>>   +    /*
>>> +     * We shouldn't touch engine registers with GuC submission as 
>>> the GuC
>>> +     * owns the registers. Let's tie the idle to engine pm, at worst 
>>> this
>>> +     * function sometimes will falsely report non-idle when idle 
>>> during the
>>> +     * delay to retire requests or with virtual engines and a request
>>> +     * running on another instance within the same class / submit mask.
>>> +     */
>>> +    if (intel_engine_uses_guc(engine))
>>> +        return false;
>>> +
>>>       /* Ring stopped? */
>>>       return ring_is_idle(engine);
>>>   }
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 94e5c29d2ee3a..ee5334840e9cb 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
>>>   {
>>>       int ret;
>>>   +    if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE)
>>> +        intel_gt_retire_requests(gt);
>>> +
>>>       if (val & DROP_RESET_ACTIVE &&
>>>           wait_for(intel_engines_are_idle(gt), 
>>> I915_IDLE_ENGINES_TIMEOUT))
>>>           intel_gt_set_wedged(gt);
>>>   -    if (val & DROP_RETIRE)
>>> -        intel_gt_retire_requests(gt);
>>> -
>>>       if (val & (DROP_IDLE | DROP_ACTIVE)) {
>>>           ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
>>>           if (ret)
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index c22f29c3faa0e..53c7474dde495 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -278,7 +278,7 @@ struct i915_gem_mm {
>>>       u32 shrink_count;
>>>   };
>>>   -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
>>> +#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */
>>>     unsigned long i915_fence_context_timeout(const struct 
>>> drm_i915_private *i915,
>>>                        u64 context);
> 


* Re: [Intel-gfx] [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware
  2022-07-19  0:13     ` John Harrison
@ 2022-07-19  9:56       ` Tvrtko Ursulin
  2022-07-22 19:32         ` John Harrison
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-19  9:56 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 19/07/2022 01:13, John Harrison wrote:
> On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>>
>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>> From: Matthew Brost <matthew.brost@intel.com>
>>>
>>> The GuC needs a copy of a golden context for implementing watchdog
>>> resets (aka media resets). This context is larger on newer platforms.
>>> So adjust the size being allocated/copied accordingly.
>>
>> What were the consequences of this being too small? Media watchdog 
>> reset broken impacting userspace? Platforms? Do we have an IGT 
>> testcase? Do we need a Fixes: tag? Copy stable?
> Yes. Not sure if we have an IGT for the media watchdog. I recall writing 
> something a long time back but I don't think it ever got merged due to 
> push back that I don't recall right now. And no because it only affects 
> DG2 onwards which is still forceprobed.

Right, hm, I don't know if the MBD SKU promise for DG2 relies on force 
probe removal or not. My impression certainly was that a bunch of uapi 
we recently merged made people happy in that respect - that we satisfied 
the commitment to deliver that support with 5.19. Maybe I am wrong, or 
perhaps to err on the side of safety you could add the right Fixes: tag 
regardless? Pick some patch which enables GuC for DG2 if there isn't 
anything better I guess. Or you could check with James.

Regards,

Tvrtko

> John.
> 
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++---
>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>>> index ba7541f3ca610..74cbe8eaf5318 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>>> @@ -464,7 +464,11 @@ static void fill_engine_enable_masks(struct 
>>> intel_gt *gt,
>>>   }
>>>     #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
>>> -#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
>>> +#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32))
>>> +#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= 
>>> IP_VER(12, 50) ? \
>>> +                    XEHP_LR_HW_CONTEXT_SIZE : \
>>> +                    LR_HW_CONTEXT_SIZE)
>>> +#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + 
>>> LR_HW_CONTEXT_SZ(i915))
>>>   static int guc_prep_golden_context(struct intel_guc *guc)
>>>   {
>>>       struct intel_gt *gt = guc_to_gt(guc);
>>> @@ -525,7 +529,7 @@ static int guc_prep_golden_context(struct 
>>> intel_guc *guc)
>>>            * on all engines).
>>>            */
>>>           ads_blob_write(guc, ads.eng_state_size[guc_class],
>>> -                   real_size - LRC_SKIP_SIZE);
>>> +                   real_size - LRC_SKIP_SIZE(gt->i915));
>>>           ads_blob_write(guc, ads.golden_context_lrca[guc_class],
>>>                      addr_ggtt);
>>>   @@ -599,7 +603,7 @@ static void guc_init_golden_context(struct 
>>> intel_guc *guc)
>>>           }
>>>             GEM_BUG_ON(ads_blob_read(guc, 
>>> ads.eng_state_size[guc_class]) !=
>>> -               real_size - LRC_SKIP_SIZE);
>>> +               real_size - LRC_SKIP_SIZE(gt->i915));
>>>           GEM_BUG_ON(ads_blob_read(guc, 
>>> ads.golden_context_lrca[guc_class]) != addr_ggtt);
>>>             addr_ggtt += alloc_size;
> 


* Re: [Intel-gfx] [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission
  2022-07-19  9:49       ` Tvrtko Ursulin
@ 2022-07-19 10:14         ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-19 10:14 UTC (permalink / raw)
  To: John Harrison, Intel-GFX, Matthew Brost; +Cc: DRI-Devel


On 19/07/2022 10:49, Tvrtko Ursulin wrote:
> 
> On 19/07/2022 01:09, John Harrison wrote:
>> On 7/18/2022 05:26, Tvrtko Ursulin wrote:
>>>
>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>
>>>> The engine registers really shouldn't be touched during GuC submission
>>>> as the GuC owns the registers. Don't call ring_is_idle and tie
>>>
>>> Touch being just read and it is somehow harmful?
>> The registers are meaningless when GuC is controlling the submission. 
>> The i915 driver has no knowledge of what context is or is not 
>> executing on any given engine at any given time. So reading 
>> the ring registers is incorrect - it can lead to bad assumptions about 
>> what state the hardware is in.
> 
> Same is actually true with the execlists backend. The code in 
> ring_is_idle is not concerning itself with which context is running or 
> not. Just that the head/tail/ctl appear idle.
> 
> The problem/motivation appears to be at a higher level than simply the 
> ring registers.
> 
> I am not claiming it makes sense with GuC and that it has to remain, but 
> just suggesting a clearer commit message as a minimum.
> 
>>>> intel_engine_is_idle strictly to the engine pm.
>>>
>>> Strictly seems wrong - it is just ring_is_idle check that is replaced 
>>> and not the whole implementation of intel_engine_is_idle.
>>>
>>>> Because intel_engine_is_idle tied to the engine pm, retire requests
>>>> before checking intel_engines_are_idle in gt_drop_caches, and lastly
>>> Why is re-ordering important? I at least can't understand it. I hope 
>>> it's not working around IGT failures.
>> If requests are physically completed but not retired then they will be 
>> holding unnecessary PM references. So we need to flush those out 
>> before checking for idle.
> 
> And if they are not as someone passes in DROP_RESET_ACTIVE? They will 
> not retire and code still enters intel_engines_are_idle so that has to 
> work, no? Something does not align for me still.

With "not retire" I meant potentially not retire within 
I915_IDLE_ENGINES_TIMEOUT. I guess the hack happens to work for some or all 
IGTs which use DROP_RESET_ACTIVE.

Does it also mean the patch would fix that problem without touching 
intel_engine_is_idle/ring_is_idle, with just the re-ordering in 
gt_drop_caches?

Regards,

Tvrtko

> 
>>>> increase the timeout in gt_drop_caches for the intel_engines_are_idle
>>>> check.
>>>
>>> Same here - why?
>> @Matthew Brost - do you recall which particular tests were hitting an 
>> issue? I'm guessing gem_ctx_create? I believe that's the one that 
>> creates and destroys thousands of contexts. That is much slower with 
>> GuC (GuC communication required) than with execlists (i915 internal 
>> state change only).
> 
> And if that is a logically separate change please split the patch up.
> 
> Regards,
> 
> Tvrtko
> 
>>
>> John.
>>
>>
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 13 +++++++++++++
>>>>   drivers/gpu/drm/i915/i915_debugfs.c       |  6 +++---
>>>>   drivers/gpu/drm/i915/i915_drv.h           |  2 +-
>>>>   3 files changed, 17 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>> index 283870c659911..959a7c92e8f4d 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>> @@ -1602,6 +1602,9 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
>>>>   {
>>>>       bool idle = true;
>>>>
>>>> +    /* GuC submission shouldn't access HEAD & TAIL via MMIO */
>>>> +    GEM_BUG_ON(intel_engine_uses_guc(engine));
>>>> +
>>>>       if (I915_SELFTEST_ONLY(!engine->mmio_base))
>>>>           return true;
>>>>
>>>> @@ -1668,6 +1671,16 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>>>>       if (!i915_sched_engine_is_empty(engine->sched_engine))
>>>>           return false;
>>>>
>>>> +    /*
>>>> +     * We shouldn't touch engine registers with GuC submission as the GuC
>>>> +     * owns the registers. Let's tie the idle to engine pm, at worst this
>>>> +     * function sometimes will falsely report non-idle when idle during the
>>>> +     * delay to retire requests or with virtual engines and a request
>>>> +     * running on another instance within the same class / submit mask.
>>>> +     */
>>>> +    if (intel_engine_uses_guc(engine))
>>>> +        return false;
>>>> +
>>>>       /* Ring stopped? */
>>>>       return ring_is_idle(engine);
>>>>   }
>>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>>> index 94e5c29d2ee3a..ee5334840e9cb 100644
>>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>>> @@ -654,13 +654,13 @@ gt_drop_caches(struct intel_gt *gt, u64 val)
>>>>   {
>>>>       int ret;
>>>>
>>>> +    if (val & DROP_RETIRE || val & DROP_RESET_ACTIVE)
>>>> +        intel_gt_retire_requests(gt);
>>>> +
>>>>       if (val & DROP_RESET_ACTIVE &&
>>>>           wait_for(intel_engines_are_idle(gt), I915_IDLE_ENGINES_TIMEOUT))
>>>>           intel_gt_set_wedged(gt);
>>>>
>>>> -    if (val & DROP_RETIRE)
>>>> -        intel_gt_retire_requests(gt);
>>>> -
>>>>       if (val & (DROP_IDLE | DROP_ACTIVE)) {
>>>>           ret = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
>>>>           if (ret)
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index c22f29c3faa0e..53c7474dde495 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -278,7 +278,7 @@ struct i915_gem_mm {
>>>>       u32 shrink_count;
>>>>   };
>>>>
>>>> -#define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
>>>> +#define I915_IDLE_ENGINES_TIMEOUT (500) /* in ms */
>>>>
>>>>   unsigned long i915_fence_context_timeout(const struct drm_i915_private *i915,
>>>>                        u64 context);
>>


* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-19  9:42       ` Tvrtko Ursulin
@ 2022-07-21  0:54         ` John Harrison
  2022-07-21  9:24           ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: John Harrison @ 2022-07-21  0:54 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel



On 7/19/2022 02:42, Tvrtko Ursulin wrote:
>
> On 19/07/2022 01:05, John Harrison wrote:
>> On 7/18/2022 05:15, Tvrtko Ursulin wrote:
>>>
>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>
>>>> Remove bogus GEM_BUG_ON which compared kernel context timeline 
>>>> seqno to
>>>> seqno in memory on engine PM unpark. If a GT reset occurred these 
>>>> values
>>>> might not match as a kernel context could be skipped. This bug was
>>>> hidden by always switching to a kernel context on park (execlists
>>>> requirement).
>>>
>>> Reset of the kernel context? Under which circumstances does that 
>>> happen?
>> As per description, the issue is with full GT reset.
>>
>>>
>>> It is unclear if the claim is this to be a general problem or the 
>>> assert is only invalid with the GuC. Lack of a CI reported issue 
>>> suggests it is not a generic problem?
>> Currently it is not an issue because we always switch to the kernel 
>> context because that's how execlists works and the entire driver is 
>> fundamentally based on execlist operation. When we stop using the 
>> kernel context as a (non-functional) barrier when using GuC 
>> submission, then you would see an issue without this fix.
>
> Issue is with GuC, GuC and full reset, or with full reset regardless 
> of the backend?
The issue is with code making invalid assumptions. The assumption is 
currently not failing because the execlist backend requires the use of a 
barrier context for a bunch of operations. The GuC backend does not 
require this. In fact, the barrier context does not function as a 
barrier when the scheduler is external to i915. Hence the desire to 
remove the use of the barrier context from generic i915 operation and 
make it only used when in execlist mode. At that point, the invalid 
assumption will no longer work and the BUG will fire.

>
> If issue is only with GuC patch should have drm/i915/guc prefix as
> minimum. But if it actually only becomes a problem when GuC backend
> stops parking with the kernel context then I think the whole unpark
> code should be refactored in a cleaner way than just removing the one
> assert. Otherwise what is the point of leaving everything else in there?
>
> Or if the issue is backend agnostic, *if* full reset happens to hit 
> during parking, then it is different. Wouldn't that be a race with 
> parking and reset which probably shouldn't happen to start with.
>
The issue is neither with GuC nor with resets, GT or otherwise. The 
issue is with generic i915 code making assumptions about backend 
implementations that are only correct for the execlist implementation.

John.


> Regards,
>
> Tvrtko
>
>>
>> John.
>>
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
>>>>   1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>> index b0a4a2dbe3ee9..fb3e1599d04ec 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>> @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
>>>>                ce->timeline->seqno,
>>>>                READ_ONCE(*ce->timeline->hwsp_seqno),
>>>>                ce->ring->emit);
>>>> -        GEM_BUG_ON(ce->timeline->seqno !=
>>>> -               READ_ONCE(*ce->timeline->hwsp_seqno));
>>>>       }
>>>>         if (engine->unpark)
>>



* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-21  0:54         ` John Harrison
@ 2022-07-21  9:24           ` Tvrtko Ursulin
  2022-07-22 19:09             ` John Harrison
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-21  9:24 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel



On 21/07/2022 01:54, John Harrison wrote:
> On 7/19/2022 02:42, Tvrtko Ursulin wrote:
>> On 19/07/2022 01:05, John Harrison wrote:
>>> On 7/18/2022 05:15, Tvrtko Ursulin wrote:
>>>>
>>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>>
>>>>> Remove bogus GEM_BUG_ON which compared kernel context timeline 
>>>>> seqno to
>>>>> seqno in memory on engine PM unpark. If a GT reset occurred these 
>>>>> values
>>>>> might not match as a kernel context could be skipped. This bug was
>>>>> hidden by always switching to a kernel context on park (execlists
>>>>> requirement).
>>>>
>>>> Reset of the kernel context? Under which circumstances does that 
>>>> happen?
>>> As per description, the issue is with full GT reset.
>>>
>>>>
>>>> It is unclear if the claim is this to be a general problem or the 
>>>> assert is only invalid with the GuC. Lack of a CI reported issue 
>>>> suggests it is not a generic problem?
>>> Currently it is not an issue because we always switch to the kernel 
>>> context because that's how execlists works and the entire driver is 
>>> fundamentally based on execlist operation. When we stop using the 
>>> kernel context as a (non-functional) barrier when using GuC 
>>> submission, then you would see an issue without this fix.

Let me pick this point to try again - I am simply asking for a clear 
description of steps which lead to the problem, instead of, what I find 
are, generic and hard to penetrate statements of invalid assumptions etc.

I picked this spot because of this: "When we stop using the kernel 
context as a (non-functional) barrier when using GuC submission, then 
you would see an issue without this fix."

I point to 363324292710 ("drm/i915/guc: Don't call 
switch_to_kernel_context with GuC submission"). Hence it is not when but 
it already happened. Which in my mind then does not compute - I can't 
grok the explanation which appears to fall over on the first claim.

Or perhaps the bug on is already firing today on every GuC enabled 
machine in the CI? In which case there is a Fixes: link to be added?

I have asked: if we have 363324292710, and if this patch is removing the
seqno bug on, why is it not removing something more in __engine_unpark,
gated on "is guc"? Like, is there a point to sanitizing the context
which wasn't lost, because it wasn't used to park the engine with?

Or if the problem can't be hit with execlists (in case the reset claim
from the commit message is misleading), why shouldn't the bug on be
changed to contain the !guc condition instead of being removed?

I am simply asking for a clear explanation of the conditions and steps 
which lead to the bug on incorrectly firing. It doesn't have to be long 
text or anything like that, just clear so we can close this and move on.

Regards,

Tvrtko
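
For reference, the alternative raised above (keep the seqno check but gate it on the backend) might look roughly like the following user-space sketch. This is not what the patch does; the patch removes the assert unconditionally. All sketch_* names are hypothetical stand-ins for the real i915 structures, and whether the mismatch can actually occur with execlists is left open by this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel context timeline. */
struct sketch_timeline {
	uint32_t seqno;      /* last seqno the driver handed out */
	uint32_t hwsp_seqno; /* last seqno written back to memory */
};

/* Hypothetical stand-in for an engine; uses_guc models intel_engine_uses_guc(). */
struct sketch_engine {
	bool uses_guc;
	struct sketch_timeline kernel_timeline;
};

/* Returns true when the gated consistency check would trip. */
static bool sketch_unpark_check_trips(const struct sketch_engine *e)
{
	/* With GuC submission the engine is not parked via the kernel context. */
	if (e->uses_guc)
		return false;

	return e->kernel_timeline.seqno != e->kernel_timeline.hwsp_seqno;
}

/* Models the unpark path with the assert kept for the execlists case only. */
static void sketch_unpark(const struct sketch_engine *e)
{
	assert(!sketch_unpark_check_trips(e));
}
```

With this shape the check stays meaningful for execlists and is simply skipped when the GuC owns scheduling.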

>>
>> Issue is with GuC, GuC and full reset, or with full reset regardless 
>> of the backend?
> The issue is with code making invalid assumptions. The assumption is 
> currently not failing because the execlist backend requires the use of a 
> barrier context for a bunch of operations. The GuC backend does not 
> require this. In fact, the barrier context does not function as a 
> barrier when the scheduler is external to i915. Hence the desire to 
> remove the use of the barrier context from generic i915 operation and 
> make it only used when in execlist mode. At that point, the invalid 
> assumption will no longer work and the BUG will fire.
> 
>>
>> If issue is only with GuC patch should have drm/i915/guc prefix as
>> minimum. But if it actually only becomes a problem when GuC backend
>> stops parking with the kernel context then I think the whole unpark
>> code should be refactored in a cleaner way than just removing the one
>> assert. Otherwise what is the point of leaving everything else in there?
>>
>> Or if the issue is backend agnostic, *if* full reset happens to hit 
>> during parking, then it is different. Wouldn't that be a race with 
>> parking and reset which probably shouldn't happen to start with.
>>
> The issue is neither with GuC nor with resets, GT or otherwise. The 
> issue is with generic i915 code making assumptions about backend 
> implementations that are only correct for the execlist implementation.
> 
> John.
> 
> 
>> Regards,
>>
>> Tvrtko
>>
>>>
>>> John.
>>>
>>>
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
>>>>>   1 file changed, 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> index b0a4a2dbe3ee9..fb3e1599d04ec 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
>>>>>                ce->timeline->seqno,
>>>>>                READ_ONCE(*ce->timeline->hwsp_seqno),
>>>>>                ce->ring->emit);
>>>>> -        GEM_BUG_ON(ce->timeline->seqno !=
>>>>> -               READ_ONCE(*ce->timeline->hwsp_seqno));
>>>>>       }
>>>>>         if (engine->unpark)
>>>
> 


* Re: [Intel-gfx] [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark
  2022-07-21  9:24           ` Tvrtko Ursulin
@ 2022-07-22 19:09             ` John Harrison
  0 siblings, 0 replies; 56+ messages in thread
From: John Harrison @ 2022-07-22 19:09 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX, Matthew Brost; +Cc: DRI-Devel

On 7/21/2022 02:24, Tvrtko Ursulin wrote:
> On 21/07/2022 01:54, John Harrison wrote:
>> On 7/19/2022 02:42, Tvrtko Ursulin wrote:
>>> On 19/07/2022 01:05, John Harrison wrote:
>>>> On 7/18/2022 05:15, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>>>
>>>>>> Remove bogus GEM_BUG_ON which compared kernel context timeline 
>>>>>> seqno to
>>>>>> seqno in memory on engine PM unpark. If a GT reset occurred these 
>>>>>> values
>>>>>> might not match as a kernel context could be skipped. This bug was
>>>>>> hidden by always switching to a kernel context on park (execlists
>>>>>> requirement).
>>>>>
>>>>> Reset of the kernel context? Under which circumstances does that 
>>>>> happen?
>>>> As per description, the issue is with full GT reset.
>>>>
>>>>>
>>>>> It is unclear if the claim is this to be a general problem or the 
>>>>> assert is only invalid with the GuC. Lack of a CI reported issue 
>>>>> suggests it is not a generic problem?
>>>> Currently it is not an issue because we always switch to the kernel 
>>>> context because that's how execlists works and the entire driver is 
>>>> fundamentally based on execlist operation. When we stop using the 
>>>> kernel context as a (non-functional) barrier when using GuC 
>>>> submission, then you would see an issue without this fix.
>
> Let me pick this point to try again - I am simply asking for a clear 
> description of steps which lead to the problem, instead of, what I 
> find are, generic and hard to penetrate statements of invalid 
> assumptions etc.
>
> I picked this spot because of this: "When we stop using the kernel 
> context as a (non-functional) barrier when using GuC submission, then 
> you would see an issue without this fix."
>
> I point to 363324292710 ("drm/i915/guc: Don't call 
> switch_to_kernel_context with GuC submission"). Hence it is not when 
> but it already happened. Which in my mind then does not compute - I 
> can't grok the explanation which appears to fall over on the first claim.
>
> Or perhaps the bug on is already firing today on every GuC enabled 
> machine in the CI? In which case there is a Fixes: link to be added?
>
> I have asked: if we have 363324292710, and if this patch is removing
> the seqno bug on, why is it not removing something more in
> __engine_unpark, gated on "is guc"? Like, is there a point to
> sanitizing the context which wasn't lost, because it wasn't used to
> park the engine with?
>
> Or if the problem can't be hit with execlists (in case the reset claim
> from the commit message is misleading), why shouldn't the bug on be
> changed to contain the !guc condition instead of being removed?
>
> I am simply asking for a clear explanation of the conditions and steps 
> which lead to the bug on incorrectly firing. It doesn't have to be 
> long text or anything like that, just clear so we can close this and 
> move on.
>
> Regards,
>
> Tvrtko
@Matthew Brost, it's your patch, do you recall the details of when it 
was going bang? I vaguely recall something about it being hit in local 
testing pre-merge rather than by CI post-merge.

John.

>
>>>
>>> Issue is with GuC, GuC and full reset, or with full reset regardless 
>>> of the backend?
>> The issue is with code making invalid assumptions. The assumption is 
>> currently not failing because the execlist backend requires the use 
>> of a barrier context for a bunch of operations. The GuC backend does 
>> not require this. In fact, the barrier context does not function as a 
>> barrier when the scheduler is external to i915. Hence the desire to 
>> remove the use of the barrier context from generic i915 operation and 
>> make it only used when in execlist mode. At that point, the invalid 
>> assumption will no longer work and the BUG will fire.
>>
>>>
>>> If issue is only with GuC patch should have drm/i915/guc prefix as
>>> minimum. But if it actually only becomes a problem when GuC backend
>>> stops parking with the kernel context then I think the whole unpark
>>> code should be refactored in a cleaner way than just removing the
>>> one assert. Otherwise what is the point of leaving everything else
>>> in there?
>>>
>>> Or if the issue is backend agnostic, *if* full reset happens to hit 
>>> during parking, then it is different. Wouldn't that be a race with 
>>> parking and reset which probably shouldn't happen to start with.
>>>
>> The issue is neither with GuC nor with resets, GT or otherwise. The 
>> issue is with generic i915 code making assumptions about backend 
>> implementations that are only correct for the execlist implementation.
>>
>> John.
>>
>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>>
>>>> John.
>>>>
>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Tvrtko
>>>>>
>>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 --
>>>>>>   1 file changed, 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> index b0a4a2dbe3ee9..fb3e1599d04ec 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>>> @@ -68,8 +68,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
>>>>>>                ce->timeline->seqno,
>>>>>>                READ_ONCE(*ce->timeline->hwsp_seqno),
>>>>>>                ce->ring->emit);
>>>>>> -        GEM_BUG_ON(ce->timeline->seqno !=
>>>>>> -               READ_ONCE(*ce->timeline->hwsp_seqno));
>>>>>>       }
>>>>>>         if (engine->unpark)
>>>>
>>



* Re: [Intel-gfx] [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware
  2022-07-19  9:56       ` Tvrtko Ursulin
@ 2022-07-22 19:32         ` John Harrison
  2022-07-25 11:24           ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: John Harrison @ 2022-07-22 19:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 7/19/2022 02:56, Tvrtko Ursulin wrote:
> On 19/07/2022 01:13, John Harrison wrote:
>> On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>>>
>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>
>>>> The GuC needs a copy of a golden context for implementing watchdog
>>>> resets (aka media resets). This context is larger on newer platforms.
>>>> So adjust the size being allocated/copied accordingly.
>>>
>>> What were the consequences of this being too small? Media watchdog 
>>> reset broken impacting userspace? Platforms? Do we have an IGT 
>>> testcase? Do we need a Fixes: tag? Copy stable?
>> Yes. Not sure if we have an IGT for the media watchdog. I recall 
>> writing something a long time back but I don't think it ever got 
>> merged due to push back that I don't recall right now. And no because 
>> it only affects DG2 onwards which is still forceprobed.
>
> Right, hm, I don't know if the MBD SKU promise for DG2 relies on force 
> probe removal or not. My impression certainly was that a bunch of uapi 
> we recently merged made people happy in that respect - that we 
> satisfied the commit to deliver that support with 5.19. Maybe I am 
> wrong, or perhaps to err on the side of safety you could add the right 
> Fixes: tag regardless? Pick some patch which enables GuC for DG2 if 
> there isn't anything better I guess. Or you could check with James.
Adding "Fixes: random patch that is actually irrelevant" seems like the 
wrong thing to do. This is not a bug fix. It is new platform support. 
And it is not the only thing required to support that new platform that 
is not currently in 5.19. E.g. DG2 requires at least GuC v70.4.2 to 
support some hardware w/a's. The guidance for that was to not add Fixes 
tags but to send a manual pull request once everything is ready.

John.


>
> Regards,
>
> Tvrtko
>
>> John.
>>
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++---
>>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>>>> index ba7541f3ca610..74cbe8eaf5318 100644
>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
>>>> @@ -464,7 +464,11 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
>>>>   }
>>>>
>>>>   #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
>>>> -#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
>>>> +#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32))
>>>> +#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50) ? \
>>>> +                    XEHP_LR_HW_CONTEXT_SIZE : \
>>>> +                    LR_HW_CONTEXT_SIZE)
>>>> +#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SZ(i915))
>>>>   static int guc_prep_golden_context(struct intel_guc *guc)
>>>>   {
>>>>       struct intel_gt *gt = guc_to_gt(guc);
>>>> @@ -525,7 +529,7 @@ static int guc_prep_golden_context(struct intel_guc *guc)
>>>>            * on all engines).
>>>>            */
>>>>           ads_blob_write(guc, ads.eng_state_size[guc_class],
>>>> -                   real_size - LRC_SKIP_SIZE);
>>>> +                   real_size - LRC_SKIP_SIZE(gt->i915));
>>>>           ads_blob_write(guc, ads.golden_context_lrca[guc_class],
>>>>                      addr_ggtt);
>>>>
>>>> @@ -599,7 +603,7 @@ static void guc_init_golden_context(struct intel_guc *guc)
>>>>           }
>>>>
>>>>           GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
>>>> -               real_size - LRC_SKIP_SIZE);
>>>> +               real_size - LRC_SKIP_SIZE(gt->i915));
>>>>           GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt);
>>>>
>>>>           addr_ggtt += alloc_size;
>>
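
To make the size arithmetic in the quoted macros concrete, here is a small user-space sketch of the same calculation. The 80 and 96 dword counts and the 12.50 cut-off come from the patch; the 4096-byte page size, the one-page PPHWSP, and every sketch_* / SKETCH_* name are assumptions for illustration only, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the real ones come from the kernel and i915 headers. */
#define SKETCH_PAGE_SIZE     4096u
#define SKETCH_LRC_PPHWSP_SZ 1u /* per-process HWSP size in pages (assumption) */

/* Hypothetical version encoding, only needs to order versions correctly. */
#define SKETCH_IP_VER(ver, rel) (((ver) << 8) | (rel))

/* Bytes of HW context state skipped: 96 dwords from 12.50 on, 80 before. */
static size_t sketch_lr_hw_context_size(unsigned int graphics_ver_full)
{
	size_t dwords = graphics_ver_full >= SKETCH_IP_VER(12, 50) ? 96 : 80;

	return dwords * sizeof(uint32_t);
}

/* Bytes skipped at the start of the context image: PPHWSP plus HW context. */
static size_t sketch_lrc_skip_size(unsigned int graphics_ver_full)
{
	return SKETCH_LRC_PPHWSP_SZ * SKETCH_PAGE_SIZE +
	       sketch_lr_hw_context_size(graphics_ver_full);
}

/* The engine state size written to the ADS is what remains after the skip. */
static size_t sketch_eng_state_size(size_t real_size,
				    unsigned int graphics_ver_full)
{
	assert(real_size >= sketch_lrc_skip_size(graphics_ver_full));

	return real_size - sketch_lrc_skip_size(graphics_ver_full);
}
```

Under these assumptions the skip grows from 4096 + 320 bytes to 4096 + 384 bytes on 12.50 and later, so the engine state size reported to the GuC shrinks by 64 bytes for the same real_size.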



* Re: [Intel-gfx] [PATCH 06/12] drm/i915/guc: Use streaming loads to speed up dumping the guc log
  2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
  (?)
@ 2022-07-22 20:05   ` John Harrison
  -1 siblings, 0 replies; 56+ messages in thread
From: John Harrison @ 2022-07-22 20:05 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel, Chris Wilson

On 7/12/2022 16:31, John.C.Harrison@Intel.com wrote:
> From: Chris Wilson <chris.p.wilson@intel.com>
>
> Use a temporary page and memcpy_from_wc to reduce the time it takes to
> dump the guc log to debugfs.
>
> Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 24 ++++++++++++++++------
>   1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> index 45f62cdabe356..ff091adb56096 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
> @@ -749,8 +749,9 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
>   	struct intel_guc *guc = log_to_guc(log);
>   	struct intel_uc *uc = container_of(guc, struct intel_uc, guc);
>   	struct drm_i915_gem_object *obj = NULL;
> -	u32 *map;
> -	int i = 0;
> +	void *map;
> +	u32 *page;
> +	int i, j;
>   
>   	if (!intel_guc_is_supported(guc))
>   		return -ENODEV;
> @@ -763,23 +764,34 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
>   	if (!obj)
>   		return 0;
>   
> +	page = (u32 *)__get_free_page(GFP_KERNEL);
> +	if (!page)
> +		return -ENOMEM;
> +
>   	intel_guc_dump_time_info(guc, p);
>   
>   	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
>   	if (IS_ERR(map)) {
>   		DRM_DEBUG("Failed to pin object\n");
>   		drm_puts(p, "(log data unaccessible)\n");
> +		free_page((unsigned long)page);
>   		return PTR_ERR(map);
>   	}
>   
> -	for (i = 0; i < obj->base.size / sizeof(u32); i += 4)
> -		drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n",
> -			   *(map + i), *(map + i + 1),
> -			   *(map + i + 2), *(map + i + 3));
> +	for (i = 0; i < obj->base.size; i += PAGE_SIZE) {
> +		if (!i915_memcpy_from_wc(page, map + i, PAGE_SIZE))
> +			memcpy(page, map + i, PAGE_SIZE);
> +
> +		for (j = 0; j < PAGE_SIZE / sizeof(u32); j += 4)
> +			drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n",
> +				   *(page + j + 0), *(page + j + 1),
> +				   *(page + j + 2), *(page + j + 3));
> +	}
>   
>   	drm_puts(p, "\n");
>   
>   	i915_gem_object_unpin_map(obj);
> +	free_page((unsigned long)page);
>   
>   	return 0;
>   }
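
The loop structure of the patch above, copy one page into an ordinary buffer and then format from that buffer, can be sketched in user space as follows. This is only a model under stated assumptions: plain memcpy() stands in for i915_memcpy_from_wc() and its fallback, a stack array stands in for the page from __get_free_page(), and SKETCH_PAGE_SIZE and sketch_dump_log() are hypothetical names. As far as I understand it, the point of the real change is that reading write-combined memory one dword at a time is slow, while a bulk streaming copy into cached memory is much cheaper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, illustration only */

/*
 * Dump `size` bytes at `map` as rows of four 32-bit words, going through a
 * page-sized bounce buffer. Returns the number of rows printed, or -1 if
 * `size` is not a whole number of pages.
 */
static long sketch_dump_log(const void *map, size_t size, FILE *out)
{
	const unsigned char *src = map;
	uint32_t page[SKETCH_PAGE_SIZE / sizeof(uint32_t)];
	long rows = 0;
	size_t i, j;

	assert(map != NULL && out != NULL);

	if (size % SKETCH_PAGE_SIZE)
		return -1;

	for (i = 0; i < size; i += SKETCH_PAGE_SIZE) {
		/* One bulk copy per page instead of one slow read per word. */
		memcpy(page, src + i, SKETCH_PAGE_SIZE);

		/* Format from the bounce buffer, four words per row. */
		for (j = 0; j < SKETCH_PAGE_SIZE / sizeof(uint32_t); j += 4) {
			fprintf(out, "0x%08x 0x%08x 0x%08x 0x%08x\n",
				(unsigned int)page[j + 0],
				(unsigned int)page[j + 1],
				(unsigned int)page[j + 2],
				(unsigned int)page[j + 3]);
			rows++;
		}
	}

	return rows;
}
```

A 4 KiB stack array is acceptable in user space; the kernel patch allocates a separate page instead, presumably because kernel stacks are small.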



* Re: [Intel-gfx] [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware
  2022-07-22 19:32         ` John Harrison
@ 2022-07-25 11:24           ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2022-07-25 11:24 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 22/07/2022 20:32, John Harrison wrote:
> On 7/19/2022 02:56, Tvrtko Ursulin wrote:
>> On 19/07/2022 01:13, John Harrison wrote:
>>> On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>>>>
>>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>>
>>>>> The GuC needs a copy of a golden context for implementing watchdog
>>>>> resets (aka media resets). This context is larger on newer platforms.
>>>>> So adjust the size being allocated/copied accordingly.
>>>>
>>>> What were the consequences of this being too small? Media watchdog 
>>>> reset broken impacting userspace? Platforms? Do we have an IGT 
>>>> testcase? Do we need a Fixes: tag? Copy stable?
>>> Yes. Not sure if we have an IGT for the media watchdog. I recall 
>>> writing something a long time back but I don't think it ever got 
>>> merged due to push back that I don't recall right now. And no because 
>>> it only affects DG2 onwards which is still forceprobed.
>>
>> Right, hm, I don't know if the MBD SKU promise for DG2 relies on force 
>> probe removal or not. My impression certainly was that a bunch of uapi 
>> we recently merged made people happy in that respect - that we 
>> satisfied the commit to deliver that support with 5.19. Maybe I am 
>> wrong, or perhaps to err on the side of safety you could add the right 
>> Fixes: tag regardless? Pick some patch which enables GuC for DG2 if 
>> there isn't anything better I guess. Or you could check with James.
> Adding "Fixes: random patch that is actually irrelevant" seems like the 
> wrong thing to do. This is not a bug fix. It is new platform support. 
> And it is not the only thing required to support that new platform that 
> is not currently in 5.19. E.g. DG2 requires at least GuC v70.4.2 to 
> support some hardware w/a's. The guidance for that was to not add Fixes 
> tags but to send a manual pull request once everything is ready.

All I know is that some people were really interested(*) that 5.19 
contains everything needed for DG2. Hence I suggested to err on the side 
of safety, or at least check with folks.

Bottom line is, if you want this fix to be in 5.19, or even 5.20, you 
should add a Fixes: tag. Otherwise it will be in 5.21 at the earliest. 
Your call, I only tried to be helpful and avoid another failure.

Regards,

Tvrtko

*) To the point of actively pinging the maintainers to ensure patches do
not miss the merge window.


end of thread, other threads:[~2022-07-25 11:24 UTC | newest]

Thread overview: 56+ messages
2022-07-12 23:31 [PATCH 00/12] Random assortment of (mostly) GuC related patches John.C.Harrison
2022-07-12 23:31 ` [Intel-gfx] " John.C.Harrison
2022-07-12 23:31 ` [PATCH 01/12] drm/i915: Remove bogus GEM_BUG_ON in unpark John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-18 12:15   ` Tvrtko Ursulin
2022-07-19  0:05     ` John Harrison
2022-07-19  9:42       ` Tvrtko Ursulin
2022-07-21  0:54         ` John Harrison
2022-07-21  9:24           ` Tvrtko Ursulin
2022-07-22 19:09             ` John Harrison
2022-07-12 23:31 ` [PATCH 02/12] drm/i915/guc: Don't call ring_is_idle in GuC submission John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-18 12:26   ` Tvrtko Ursulin
2022-07-19  0:09     ` John Harrison
2022-07-19  9:49       ` Tvrtko Ursulin
2022-07-19 10:14         ` Tvrtko Ursulin
2022-07-12 23:31 ` [PATCH 03/12] drm/i915/guc: Fix issues with live_preempt_cancel John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-12 23:31 ` [PATCH 04/12] drm/i915/guc: Add GuC <-> kernel time stamp translation information John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-12 23:31 ` [PATCH 05/12] drm/i915/guc: Record CTB info in error logs John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-12 23:31 ` [PATCH 06/12] drm/i915/guc: Use streaming loads to speed up dumping the guc log John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-22 20:05   ` John Harrison
2022-07-12 23:31 ` [PATCH 07/12] drm/i915/guc: Route semaphores to GuC for Gen12+ John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-13  0:51   ` Matthew Brost
2022-07-13  0:51     ` [Intel-gfx] " Matthew Brost
2022-07-15 17:21   ` Ceraolo Spurio, Daniele
2022-07-15 17:21     ` [Intel-gfx] " Ceraolo Spurio, Daniele
2022-07-12 23:31 ` [PATCH 08/12] drm/i915/guc: Add selftest for a hung GuC John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-12 23:31 ` [PATCH 09/12] drm/i915/selftest: Cope with not having an RCS engine John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-13  0:48   ` Matthew Brost
2022-07-13  0:48     ` [Intel-gfx] " Matthew Brost
2022-07-12 23:31 ` [PATCH 10/12] drm/i915/guc: Support larger contexts on newer hardware John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-18 12:35   ` Tvrtko Ursulin
2022-07-19  0:13     ` John Harrison
2022-07-19  9:56       ` Tvrtko Ursulin
2022-07-22 19:32         ` John Harrison
2022-07-25 11:24           ` Tvrtko Ursulin
2022-07-12 23:31 ` [PATCH 11/12] drm/i915/guc: Don't abort on CTB_UNUSED status John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-18 12:36   ` Tvrtko Ursulin
2022-07-19  0:16     ` John Harrison
2022-07-12 23:31 ` [PATCH 12/12] drm/i915/guc: Add a helper for log buffer size John.C.Harrison
2022-07-12 23:31   ` [Intel-gfx] " John.C.Harrison
2022-07-13  0:46   ` Matthew Brost
2022-07-13  0:46     ` [Intel-gfx] " Matthew Brost
2022-07-13  0:31 ` [Intel-gfx] ✗ Fi.CI.BUILD: warning for Random assortment of (mostly) GuC related patches Patchwork
2022-07-13 20:09 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Random assortment of (mostly) GuC related patches (rev2) Patchwork
2022-07-13 20:27 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-07-14  1:41 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
