dri-devel.lists.freedesktop.org archive mirror
* [PATCH v4 0/4] Improve anti-pre-emption w/a for compute workloads
@ 2022-09-29  2:18 John.C.Harrison
  2022-09-29  2:18 ` [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: John.C.Harrison @ 2022-09-29  2:18 UTC
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Compute workloads are inherently not pre-emptible on current hardware.
Thus the pre-emption timeout was disabled as a workaround to prevent
unwanted resets. Instead, hang detection was left to the heartbeat and
its (longer) timeout. This is undesirable with GuC submission as an
expired heartbeat results in a full GT reset rather than a per engine
reset and so is much more destructive. So, rather than disabling the
pre-emption timeout entirely, bump it to a large value. Also, update
the heartbeat to allow such a long pre-emption delay in the final
heartbeat period.
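
For reference, the concrete numbers involved (assuming otherwise
unmodified defaults; all values come from the patches below):
- compute pre-emption timeout: 7500ms, via the new
  CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE option (patch 4)
- final heartbeat period: extended to at least twice the pre-emption
  timeout, i.e. 15000ms with the above default (patch 3)
- upper bound on user supplied timeouts under GuC submission: 100s, to
  stay clear of 32bit clock tick overflow (patch 1)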

v2: Add clamping helpers.
v3: Remove long timeout algorithm and replace with hard coded value
(review feedback from Tvrtko). Also, fix execlist selftest failure and
fix bug in compute enabling patch related to pre-emption timeouts.
v4: Add multiple BUG_ONs to re-check already range checked values (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


John Harrison (4):
  drm/i915/guc: Limit scheduling properties to avoid overflow
  drm/i915: Fix compute pre-emption w/a to apply to compute engines
  drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  drm/i915: Improve long running compute w/a for GuC submission

 drivers/gpu/drm/i915/Kconfig.profile          |  26 ++++-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 102 +++++++++++++++---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |  19 ++++
 drivers/gpu/drm/i915/gt/sysfs_engines.c       |  25 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  21 ++++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   8 ++
 7 files changed, 179 insertions(+), 28 deletions(-)

-- 
2.37.3



* [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
  2022-09-29  2:18 [PATCH v4 0/4] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
@ 2022-09-29  2:18 ` John.C.Harrison
  2022-09-29  7:39   ` [Intel-gfx] " Tvrtko Ursulin
  2022-09-29  2:18 ` [PATCH v4 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines John.C.Harrison
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: John.C.Harrison @ 2022-09-29  2:18 UTC
  To: Intel-GFX; +Cc: Daniele Ceraolo Spurio, John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That conversion significantly lowers the value
at which 32bit overflow occurs. On current platforms, the worst case
is approximately 110 seconds. Rather than allowing the user to set
higher values and then be confused by early timeouts, add limits when
setting these values.
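
For a rough feel of where the overflow comes from (the actual GuC
clock rate is platform specific; the 38.4 MHz used here is purely an
assumed example, not a statement about any particular product):

  2^32 ticks / 38,400,000 ticks/s ~= 111.8 seconds

So a timeout much above 110s would silently wrap once converted to
ticks and fire far earlier than requested.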

v2: Add helper functions for clamping (review feedback from Tvrtko).
v3: Add a bunch of BUG_ON range checks in addition to the checks
already in the clamping functions (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
---
 drivers/gpu/drm/i915/gt/intel_engine.h        |  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 69 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/sysfs_engines.c       | 25 ++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   | 21 ++++++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 +++
 5 files changed, 119 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 04e435bce79bd..cbc8b857d5f7a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -348,4 +348,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine)
 	return engine->hung_ce;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 2ddcad497fa30..8f16955f0821e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -512,6 +512,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
 	}
 
+	/* Cap properties according to any system limits */
+#define CLAMP_PROP(field) \
+	do { \
+		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+		if (clamp != engine->props.field) { \
+			drm_notice(&engine->i915->drm, \
+				   "Warning, clamping %s to %lld to prevent overflow\n", \
+				   #field, clamp); \
+			engine->props.field = clamp; \
+		} \
+	} while (0)
+
+	CLAMP_PROP(heartbeat_interval_ms);
+	CLAMP_PROP(max_busywait_duration_ns);
+	CLAMP_PROP(preempt_timeout_ms);
+	CLAMP_PROP(stop_timeout_ms);
+	CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
 	engine->defaults = engine->props; /* never to change again */
 
 	engine->context_size = intel_engine_context_size(gt, engine->class);
@@ -534,6 +554,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	return 0;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value)
+{
+	value = min(value, jiffies_to_nsecs(2));
+
+	return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, guc_policy_max_preempt_timeout_ms());
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, guc_policy_max_exec_quantum_ms());
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
 static void __setup_engine_capabilities(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c
index 9670310562029..f2d9858d827c2 100644
--- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
+++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
@@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
 	       const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_nsecs(2))
+	clamped = intel_clamp_max_busywait_duration_ns(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
@@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_timeslice_duration_ms(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
@@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
 	   const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_stop_timeout_ms(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
@@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
 		      const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long timeout;
+	unsigned long long timeout, clamped;
 	int err;
 
 	/*
@@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_preempt_timeout_ms(engine, timeout);
+	if (timeout != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
@@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long delay;
+	unsigned long long delay, clamped;
 	int err;
 
 	/*
@@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_heartbeat_interval_ms(engine, delay);
+	if (delay != clamped)
 		return -EINVAL;
 
 	err = intel_engine_set_heartbeat(engine, delay);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index e7a7fb450f442..968ebd79dce70 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -327,6 +327,27 @@ struct guc_update_scheduling_policy {
 
 #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
 
+/*
+ * GuC converts the timeout to clock ticks internally. Different platforms have
+ * different GuC clocks. Thus, the maximum value before overflow is platform
+ * dependent. Current worst case scenario is about 110s. So, the spec says to
+ * limit to 100s to be safe.
+ */
+#define GUC_POLICY_MAX_EXEC_QUANTUM_US		(100 * 1000 * 1000UL)
+#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_US	(100 * 1000 * 1000UL)
+
+static inline u32 guc_policy_max_exec_quantum_ms(void)
+{
+	BUILD_BUG_ON(GUC_POLICY_MAX_EXEC_QUANTUM_US >= UINT_MAX);
+	return GUC_POLICY_MAX_EXEC_QUANTUM_US / 1000;
+}
+
+static inline u32 guc_policy_max_preempt_timeout_ms(void)
+{
+	BUILD_BUG_ON(GUC_POLICY_MAX_PREEMPT_TIMEOUT_US >= UINT_MAX);
+	return GUC_POLICY_MAX_PREEMPT_TIMEOUT_US / 1000;
+}
+
 struct guc_policies {
 	u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
 	/* In micro seconds. How much time to allow before DPC processing is
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0ef295a94060c..039c187ce570d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2430,6 +2430,10 @@ static int guc_context_policy_init_v70(struct intel_context *ce, bool loop)
 	int ret;
 
 	/* NB: For both of these, zero means disabled. */
+	GEM_BUG_ON(overflows_type(engine->props.timeslice_duration_ms * 1000,
+				  execution_quantum));
+	GEM_BUG_ON(overflows_type(engine->props.preempt_timeout_ms * 1000,
+				  preemption_timeout));
 	execution_quantum = engine->props.timeslice_duration_ms * 1000;
 	preemption_timeout = engine->props.preempt_timeout_ms * 1000;
 
@@ -2463,6 +2467,10 @@ static void guc_context_policy_init_v69(struct intel_engine_cs *engine,
 		desc->policy_flags |= CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE_V69;
 
 	/* NB: For both of these, zero means disabled. */
+	GEM_BUG_ON(overflows_type(engine->props.timeslice_duration_ms * 1000,
+				  desc->execution_quantum));
+	GEM_BUG_ON(overflows_type(engine->props.preempt_timeout_ms * 1000,
+				  desc->preemption_timeout));
 	desc->execution_quantum = engine->props.timeslice_duration_ms * 1000;
 	desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
 }
-- 
2.37.3



* [PATCH v4 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
  2022-09-29  2:18 [PATCH v4 0/4] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
  2022-09-29  2:18 ` [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
@ 2022-09-29  2:18 ` John.C.Harrison
  2022-09-29  2:18 ` [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts John.C.Harrison
  2022-09-29  2:18 ` [PATCH v4 4/4] drm/i915: Improve long running compute w/a for GuC submission John.C.Harrison
  3 siblings, 0 replies; 15+ messages in thread
From: John.C.Harrison @ 2022-09-29  2:18 UTC
  To: Intel-GFX
  Cc: Daniel Vetter, DRI-Devel, Chris Wilson, Stuart Summers,
	Daniele Ceraolo Spurio, Jason Ekstrand, Akeem G Abodunrin,
	Matthew Brost, Matthew Auld, Ramalingam C, Thomas Hellström,
	Jani Nikula, Lucas De Marchi, Aravind Iddamsetty, Tvrtko Ursulin,
	Michał Winiarski, Tvrtko Ursulin, Tejas Upadhyay,
	Umesh Nerlige Ramappa, John Harrison

From: John Harrison <John.C.Harrison@Intel.com>

An earlier patch added support for compute engines. However, it missed
enabling the anti-pre-emption w/a for the new engine class. So move
the 'compute capable' flag earlier and use it for the pre-emption w/a
test.

Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 24 +++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 8f16955f0821e..fcbccd8d244e9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -486,6 +486,17 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->logical_mask = BIT(logical_instance);
 	__sprint_engine_name(engine);
 
+	if ((engine->class == COMPUTE_CLASS && !RCS_MASK(engine->gt) &&
+	     __ffs(CCS_MASK(engine->gt)) == engine->instance) ||
+	     engine->class == RENDER_CLASS)
+		engine->flags |= I915_ENGINE_FIRST_RENDER_COMPUTE;
+
+	/* features common between engines sharing EUs */
+	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
+		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
+		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
+	}
+
 	engine->props.heartbeat_interval_ms =
 		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
 	engine->props.max_busywait_duration_ns =
@@ -498,20 +509,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
 	/* Override to uninterruptible for OpenCL workloads. */
-	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
+	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
 		engine->props.preempt_timeout_ms = 0;
 
-	if ((engine->class == COMPUTE_CLASS && !RCS_MASK(engine->gt) &&
-	     __ffs(CCS_MASK(engine->gt)) == engine->instance) ||
-	     engine->class == RENDER_CLASS)
-		engine->flags |= I915_ENGINE_FIRST_RENDER_COMPUTE;
-
-	/* features common between engines sharing EUs */
-	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
-		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
-		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
-	}
-
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
 	do { \
-- 
2.37.3



* [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-09-29  2:18 [PATCH v4 0/4] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
  2022-09-29  2:18 ` [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
  2022-09-29  2:18 ` [PATCH v4 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines John.C.Harrison
@ 2022-09-29  2:18 ` John.C.Harrison
  2022-09-29  7:42   ` [Intel-gfx] " Tvrtko Ursulin
  2022-09-29  2:18 ` [PATCH v4 4/4] drm/i915: Improve long running compute w/a for GuC submission John.C.Harrison
  3 siblings, 1 reply; 15+ messages in thread
From: John.C.Harrison @ 2022-09-29  2:18 UTC
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Compute workloads are inherently not pre-emptible for long periods on
current hardware. As a workaround for this, the pre-emption timeout
for compute capable engines was disabled. This is undesirable with GuC
submission as it prevents per engine reset of hung contexts. Hence the
next patch will re-enable the timeout but bumped up by an order of
magnitude.

However, the heartbeat might not respect that. Depending upon current
activity, a pre-emption to the heartbeat pulse might not even be
attempted until the last heartbeat period. Which means that only one
period is granted for the pre-emption to occur. With the aforesaid
bump, the pre-emption timeout could be significantly larger than this
heartbeat period.

So adjust the heartbeat code to take the pre-emption timeout into
account. When it reaches the final (high priority) period, it now
ensures the delay before hitting reset is bigger than the pre-emption
timeout.
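
As a worked example, assume stock defaults: a 2500ms heartbeat
(CONFIG_DRM_I915_HEARTBEAT_INTERVAL) and the 7500ms compute
pre-emption timeout added by the next patch. Previously, once the
heartbeat pulse reached barrier priority the engine could be declared
hung just one 2500ms period later, well inside the 7500ms window the
context is allowed for pre-empting. With this patch the final period
becomes max(2500, 2 * 7500) = 15000ms, so the pre-emption timeout (and
hence GuC's per engine reset) always gets the chance to fire before
the heartbeat escalates to a full GT reset.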

v2: Fix for selftests which adjust the heartbeat period manually.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a3698f611f457..823a790a0e2ae 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -22,9 +22,28 @@
 
 static bool next_heartbeat(struct intel_engine_cs *engine)
 {
+	struct i915_request *rq;
 	long delay;
 
 	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
+
+	rq = engine->heartbeat.systole;
+
+	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
+	    delay == engine->defaults.heartbeat_interval_ms) {
+		long longer;
+
+		/*
+		 * The final try is at the highest priority possible. Up until now
+		 * a pre-emption might not even have been attempted. So make sure
+		 * this last attempt allows enough time for a pre-emption to occur.
+		 */
+		longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
+		longer = intel_clamp_heartbeat_interval_ms(engine, longer);
+		if (longer > delay)
+			delay = longer;
+	}
+
 	if (!delay)
 		return false;
 
-- 
2.37.3



* [PATCH v4 4/4] drm/i915: Improve long running compute w/a for GuC submission
  2022-09-29  2:18 [PATCH v4 0/4] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
                   ` (2 preceding siblings ...)
  2022-09-29  2:18 ` [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts John.C.Harrison
@ 2022-09-29  2:18 ` John.C.Harrison
  2022-09-29  7:44   ` [Intel-gfx] " Tvrtko Ursulin
  3 siblings, 1 reply; 15+ messages in thread
From: John.C.Harrison @ 2022-09-29  2:18 UTC
  To: Intel-GFX; +Cc: Michal Mrozek, Daniele Ceraolo Spurio, John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

A workaround was added to the driver to allow compute workloads to run
'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbounded as the heartbeat will kick in eventually
and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode,
the pre-emption timeout is how GuC detects hung contexts and triggers
a per engine reset. Thus, disabling the timeout means also losing all
per engine reset ability. A full GT reset will still occur when the
heartbeat finally expires, but that is a much more destructive and
undesirable mechanism.

The purpose of the workaround is actually to give compute tasks longer
to reach a pre-emption point after a pre-emption request has been
issued. This is necessary because Gen12 does not support mid-thread
pre-emption and compute tasks can have long running threads.

So, rather than disabling the timeout completely, just set it to a
'long' value.
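
As a sanity check of the new default against the limits from the first
patch in this series: GuC policies take these values in microseconds,
so 7500ms becomes 7,500,000us, comfortably below both the
100,000,000us (100s) clamp and U32_MAX. The
GEM_BUG_ON(overflows_type(...)) checks added to
guc_context_policy_init_*() re-assert that the ms-to-us multiply
cannot wrap.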

v2: Review feedback from Tvrtko - must hard code the 'long' value
instead of determining it algorithmically. So make it an extra CONFIG
definition. Also, remove the execlist centric comment from the
existing pre-emption timeout CONFIG option given that it applies to
more than just execlists.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c2007..7cc38d25ee5c8 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
 	default 640 # milliseconds
 	help
 	  How long to wait (in milliseconds) for a preemption event to occur
-	  when submitting a new context via execlists. If the current context
-	  does not hit an arbitration point and yield to HW before the timer
-	  expires, the HW will be reset to allow the more important context
-	  to execute.
+	  when submitting a new context. If the current context does not hit
+	  an arbitration point and yield to HW before the timer expires, the
+	  HW will be reset to allow the more important context to execute.
+
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+	  May be 0 to disable the timeout.
+
+	  The compiled in default may get overridden at driver probe time on
+	  certain platforms and certain engines which will be reflected in the
+	  sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+	int "Preempt timeout for compute engines (ms, jiffy granularity)"
+	default 7500 # milliseconds
+	help
+	  How long to wait (in milliseconds) for a preemption event to occur
+	  when submitting a new context to a compute capable engine. If the
+	  current context does not hit an arbitration point and yield to HW
+	  before the timer expires, the HW will be reset to allow the more
+	  important context to execute.
 
 	  This is adjustable via
 	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index fcbccd8d244e9..c1257723d1949 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -508,9 +508,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-	/* Override to uninterruptible for OpenCL workloads. */
+	/*
+	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
+	 * some compute workloads run quite long threads. That means they get
+	 * reset due to not pre-empting in a timely manner. So, bump the
+	 * pre-emption timeout value to be much higher for compute engines.
+	 */
 	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
-		engine->props.preempt_timeout_ms = 0;
+		engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
 
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
-- 
2.37.3



* Re: [Intel-gfx] [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
  2022-09-29  2:18 ` [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
@ 2022-09-29  7:39   ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-09-29  7:39 UTC
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel


On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> GuC converts the pre-emption timeout and timeslice quantum values into
> clock ticks internally. That conversion significantly lowers the value
> at which 32bit overflow occurs. On current platforms, the worst case
> is approximately 110 seconds. Rather than allowing the user to set
> higher values and then be confused by early timeouts, add limits when
> setting these values.
> 
> v2: Add helper functions for clamping (review feedback from Tvrtko).
> v3: Add a bunch of BUG_ON range checks in addition to the checks
> already in the clamping functions (Tvrtko)
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> ---
>   drivers/gpu/drm/i915/gt/intel_engine.h        |  6 ++
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 69 +++++++++++++++++++
>   drivers/gpu/drm/i915/gt/sysfs_engines.c       | 25 ++++---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   | 21 ++++++
>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 +++
>   5 files changed, 119 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 04e435bce79bd..cbc8b857d5f7a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -348,4 +348,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine)
>   	return engine->hung_ce;
>   }
>   
> +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
> +
>   #endif /* _INTEL_RINGBUFFER_H_ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 2ddcad497fa30..8f16955f0821e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -512,6 +512,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>   		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
>   	}
>   
> +	/* Cap properties according to any system limits */
> +#define CLAMP_PROP(field) \
> +	do { \
> +		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
> +		if (clamp != engine->props.field) { \
> +			drm_notice(&engine->i915->drm, \
> +				   "Warning, clamping %s to %lld to prevent overflow\n", \
> +				   #field, clamp); \
> +			engine->props.field = clamp; \
> +		} \
> +	} while (0)
> +
> +	CLAMP_PROP(heartbeat_interval_ms);
> +	CLAMP_PROP(max_busywait_duration_ns);
> +	CLAMP_PROP(preempt_timeout_ms);
> +	CLAMP_PROP(stop_timeout_ms);
> +	CLAMP_PROP(timeslice_duration_ms);
> +
> +#undef CLAMP_PROP
> +
>   	engine->defaults = engine->props; /* never to change again */
>   
>   	engine->context_size = intel_engine_context_size(gt, engine->class);
> @@ -534,6 +554,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>   	return 0;
>   }
>   
> +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value)
> +{
> +	value = min(value, jiffies_to_nsecs(2));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	/*
> +	 * NB: The GuC API only supports 32bit values. However, the limit is further
> +	 * reduced due to internal calculations which would otherwise overflow.
> +	 */
> +	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
> +		value = min_t(u64, value, guc_policy_max_preempt_timeout_ms());
> +
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	/*
> +	 * NB: The GuC API only supports 32bit values. However, the limit is further
> +	 * reduced due to internal calculations which would otherwise overflow.
> +	 */
> +	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
> +		value = min_t(u64, value, guc_policy_max_exec_quantum_ms());
> +
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
>   static void __setup_engine_capabilities(struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *i915 = engine->i915;
> diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c
> index 9670310562029..f2d9858d827c2 100644
> --- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
> +++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
> @@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	       const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long duration;
> +	unsigned long long duration, clamped;
>   	int err;
>   
>   	/*
> @@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (duration > jiffies_to_nsecs(2))
> +	clamped = intel_clamp_max_busywait_duration_ns(engine, duration);
> +	if (duration != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
> @@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
>   		const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long duration;
> +	unsigned long long duration, clamped;
>   	int err;
>   
>   	/*
> @@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_timeslice_duration_ms(engine, duration);
> +	if (duration != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
> @@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	   const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long duration;
> +	unsigned long long duration, clamped;
>   	int err;
>   
>   	/*
> @@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_stop_timeout_ms(engine, duration);
> +	if (duration != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
> @@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
>   		      const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long timeout;
> +	unsigned long long timeout, clamped;
>   	int err;
>   
>   	/*
> @@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_preempt_timeout_ms(engine, timeout);
> +	if (timeout != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
> @@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
>   		const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long delay;
> +	unsigned long long delay, clamped;
>   	int err;
>   
>   	/*
> @@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_heartbeat_interval_ms(engine, delay);
> +	if (delay != clamped)
>   		return -EINVAL;
>   
>   	err = intel_engine_set_heartbeat(engine, delay);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> index e7a7fb450f442..968ebd79dce70 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> @@ -327,6 +327,27 @@ struct guc_update_scheduling_policy {
>   
>   #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
>   
> +/*
> + * GuC converts the timeout to clock ticks internally. Different platforms have
> + * different GuC clocks. Thus, the maximum value before overflow is platform
> + * dependent. Current worst case scenario is about 110s. So, the spec says to
> + * limit to 100s to be safe.
> + */
> +#define GUC_POLICY_MAX_EXEC_QUANTUM_US		(100 * 1000 * 1000UL)
> +#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_US	(100 * 1000 * 1000UL)
> +
> +static inline u32 guc_policy_max_exec_quantum_ms(void)
> +{
> +	BUILD_BUG_ON(GUC_POLICY_MAX_EXEC_QUANTUM_US >= UINT_MAX);
> +	return GUC_POLICY_MAX_EXEC_QUANTUM_US / 1000;
> +}
> +
> +static inline u32 guc_policy_max_preempt_timeout_ms(void)
> +{
> +	BUILD_BUG_ON(GUC_POLICY_MAX_PREEMPT_TIMEOUT_US >= UINT_MAX);
> +	return GUC_POLICY_MAX_PREEMPT_TIMEOUT_US / 1000;
> +}
> +
>   struct guc_policies {
>   	u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
>   	/* In micro seconds. How much time to allow before DPC processing is
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 0ef295a94060c..039c187ce570d 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -2430,6 +2430,10 @@ static int guc_context_policy_init_v70(struct intel_context *ce, bool loop)
>   	int ret;
>   
>   	/* NB: For both of these, zero means disabled. */
> +	GEM_BUG_ON(overflows_type(engine->props.timeslice_duration_ms * 1000,
> +				  execution_quantum));
> +	GEM_BUG_ON(overflows_type(engine->props.preempt_timeout_ms * 1000,
> +				  preemption_timeout));
>   	execution_quantum = engine->props.timeslice_duration_ms * 1000;
>   	preemption_timeout = engine->props.preempt_timeout_ms * 1000;
>   
> @@ -2463,6 +2467,10 @@ static void guc_context_policy_init_v69(struct intel_engine_cs *engine,
>   		desc->policy_flags |= CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE_V69;
>   
>   	/* NB: For both of these, zero means disabled. */
> +	GEM_BUG_ON(overflows_type(engine->props.timeslice_duration_ms * 1000,
> +				  desc->execution_quantum));
> +	GEM_BUG_ON(overflows_type(engine->props.preempt_timeout_ms * 1000,
> +				  desc->preemption_timeout));
>   	desc->execution_quantum = engine->props.timeslice_duration_ms * 1000;
>   	desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
>   }


* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-09-29  2:18 ` [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts John.C.Harrison
@ 2022-09-29  7:42   ` Tvrtko Ursulin
  2022-09-29 16:21     ` John Harrison
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-09-29  7:42 UTC
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel


On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> Compute workloads are inherently not pre-emptible for long periods on
> current hardware. As a workaround for this, the pre-emption timeout
> for compute capable engines was disabled. This is undesirable with GuC
> submission as it prevents per engine reset of hung contexts. Hence the
> next patch will re-enable the timeout but bumped up by an order of
> magnitude.
> 
> However, the heartbeat might not respect that. Depending upon current
> activity, a pre-emption to the heartbeat pulse might not even be
> attempted until the last heartbeat period. Which means that only one
> period is granted for the pre-emption to occur. With the aforesaid
> bump, the pre-emption timeout could be significantly larger than this
> heartbeat period.
> 
> So adjust the heartbeat code to take the pre-emption timeout into
> account. When it reaches the final (high priority) period, it now
> ensures the delay before hitting reset is bigger than the pre-emption
> timeout.
> 
> v2: Fix for selftests which adjust the heartbeat period manually.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> index a3698f611f457..823a790a0e2ae 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> @@ -22,9 +22,28 @@
>   
>   static bool next_heartbeat(struct intel_engine_cs *engine)
>   {
> +	struct i915_request *rq;
>   	long delay;
>   
>   	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
> +
> +	rq = engine->heartbeat.systole;
> +
> +	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
> +	    delay == engine->defaults.heartbeat_interval_ms) {

Maybe I forgot but what is the reason for the check against the default 
heartbeat interval?

Regards,

Tvrtko

> +		long longer;
> +
> +		/*
> +		 * The final try is at the highest priority possible. Up until now
> +		 * a pre-emption might not even have been attempted. So make sure
> +		 * this last attempt allows enough time for a pre-emption to occur.
> +		 */
> +		longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
> +		longer = intel_clamp_heartbeat_interval_ms(engine, longer);
> +		if (longer > delay)
> +			delay = longer;
> +	}
> +
>   	if (!delay)
>   		return false;
>   


* Re: [Intel-gfx] [PATCH v4 4/4] drm/i915: Improve long running compute w/a for GuC submission
  2022-09-29  2:18 ` [PATCH v4 4/4] drm/i915: Improve long running compute w/a for GuC submission John.C.Harrison
@ 2022-09-29  7:44   ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-09-29  7:44 UTC
  To: John.C.Harrison, Intel-GFX; +Cc: Michal Mrozek, DRI-Devel


On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> A workaround was added to the driver to allow compute workloads to run
> 'forever' by disabling pre-emption on the RCS engine for Gen12.
> It is not totally unbounded as the heartbeat will kick in eventually
> and cause a reset of the hung engine.
> 
> However, this does not work well in GuC submission mode. In GuC mode,
> the pre-emption timeout is how GuC detects hung contexts and triggers
> a per engine reset. Thus, disabling the timeout means also losing all
> per engine reset ability. A full GT reset will still occur when the
> heartbeat finally expires, but that is a much more destructive and
> undesirable mechanism.
> 
> The purpose of the workaround is actually to give compute tasks longer
> to reach a pre-emption point after a pre-emption request has been
> issued. This is necessary because Gen12 does not support mid-thread
> pre-emption and compute tasks can have long running threads.
> 
> So, rather than disabling the timeout completely, just set it to a
> 'long' value.
> 
> v2: Review feedback from Tvrtko - must hard code the 'long' value
> instead of determining it algorithmically. So make it an extra CONFIG
> definition. Also, remove the execlist centric comment from the
> existing pre-emption timeout CONFIG option given that it applies to
> more than just execlists.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
> Acked-by: Michal Mrozek <michal.mrozek@intel.com>

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> ---
>   drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>   2 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index 39328567c2007..7cc38d25ee5c8 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>   	default 640 # milliseconds
>   	help
>   	  How long to wait (in milliseconds) for a preemption event to occur
> -	  when submitting a new context via execlists. If the current context
> -	  does not hit an arbitration point and yield to HW before the timer
> -	  expires, the HW will be reset to allow the more important context
> -	  to execute.
> +	  when submitting a new context. If the current context does not hit
> +	  an arbitration point and yield to HW before the timer expires, the
> +	  HW will be reset to allow the more important context to execute.
> +
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
> +
> +	  May be 0 to disable the timeout.
> +
> +	  The compiled in default may get overridden at driver probe time on
> +	  certain platforms and certain engines which will be reflected in the
> +	  sysfs control.
> +
> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
> +	int "Preempt timeout for compute engines (ms, jiffy granularity)"
> +	default 7500 # milliseconds
> +	help
> +	  How long to wait (in milliseconds) for a preemption event to occur
> +	  when submitting a new context to a compute capable engine. If the
> +	  current context does not hit an arbitration point and yield to HW
> +	  before the timer expires, the HW will be reset to allow the more
> +	  important context to execute.
>   
>   	  This is adjustable via
>   	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index fcbccd8d244e9..c1257723d1949 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -508,9 +508,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>   	engine->props.timeslice_duration_ms =
>   		CONFIG_DRM_I915_TIMESLICE_DURATION;
>   
> -	/* Override to uninterruptible for OpenCL workloads. */
> +	/*
> +	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
> +	 * some compute workloads run quite long threads. That means they get
> +	 * reset due to not pre-empting in a timely manner. So, bump the
> +	 * pre-emption timeout value to be much higher for compute engines.
> +	 */
>   	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
> -		engine->props.preempt_timeout_ms = 0;
> +		engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
>   
>   	/* Cap properties according to any system limits */
>   #define CLAMP_PROP(field) \


* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-09-29  7:42   ` [Intel-gfx] " Tvrtko Ursulin
@ 2022-09-29 16:21     ` John Harrison
  2022-09-30  9:22       ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: John Harrison @ 2022-09-29 16:21 UTC
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 9/29/2022 00:42, Tvrtko Ursulin wrote:
> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> Compute workloads are inherently not pre-emptible for long periods on
>> current hardware. As a workaround for this, the pre-emption timeout
>> for compute capable engines was disabled. This is undesirable with GuC
>> submission as it prevents per engine reset of hung contexts. Hence the
>> next patch will re-enable the timeout but bumped up by an order of
>> magnitude.
>>
>> However, the heartbeat might not respect that. Depending upon current
>> activity, a pre-emption to the heartbeat pulse might not even be
>> attempted until the last heartbeat period. Which means that only one
>> period is granted for the pre-emption to occur. With the aforesaid
>> bump, the pre-emption timeout could be significantly larger than this
>> heartbeat period.
>>
>> So adjust the heartbeat code to take the pre-emption timeout into
>> account. When it reaches the final (high priority) period, it now
>> ensures the delay before hitting reset is bigger than the pre-emption
>> timeout.
>>
>> v2: Fix for selftests which adjust the heartbeat period manually.
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
>> b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>> index a3698f611f457..823a790a0e2ae 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>> @@ -22,9 +22,28 @@
>>     static bool next_heartbeat(struct intel_engine_cs *engine)
>>   {
>> +    struct i915_request *rq;
>>       long delay;
>>         delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>> +
>> +    rq = engine->heartbeat.systole;
>> +
>> +    if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>> +        delay == engine->defaults.heartbeat_interval_ms) {
>
> Maybe I forgot but what is the reason for the check against the 
> default heartbeat interval?
That's the 'v2: fix for selftests that manually adjust the heartbeat'. 
If something (or someone) has explicitly set an override of the 
heartbeat then it has to be assumed that they know what they are doing, 
and if things don't work any more that's their problem. But if we don't 
respect their override then they won't get the timings they expect and 
the selftest will fail.

John.

>
> Regards,
>
> Tvrtko
>
>> +        long longer;
>> +
>> +        /*
>> +         * The final try is at the highest priority possible. Up 
>> until now
>> +         * a pre-emption might not even have been attempted. So make 
>> sure
>> +         * this last attempt allows enough time for a pre-emption to 
>> occur.
>> +         */
>> +        longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
>> +        longer = intel_clamp_heartbeat_interval_ms(engine, longer);
>> +        if (longer > delay)
>> +            delay = longer;
>> +    }
>> +
>>       if (!delay)
>>           return false;



* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-09-29 16:21     ` John Harrison
@ 2022-09-30  9:22       ` Tvrtko Ursulin
  2022-09-30 17:44         ` John Harrison
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-09-30  9:22 UTC
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 29/09/2022 17:21, John Harrison wrote:
> On 9/29/2022 00:42, Tvrtko Ursulin wrote:
>> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> Compute workloads are inherently not pre-emptible for long periods on
>>> current hardware. As a workaround for this, the pre-emption timeout
>>> for compute capable engines was disabled. This is undesirable with GuC
>>> submission as it prevents per engine reset of hung contexts. Hence the
>>> next patch will re-enable the timeout but bumped up by an order of
>>> magnitude.
>>>
>>> However, the heartbeat might not respect that. Depending upon current
>>> activity, a pre-emption to the heartbeat pulse might not even be
>>> attempted until the last heartbeat period. Which means that only one
>>> period is granted for the pre-emption to occur. With the aforesaid
>>> bump, the pre-emption timeout could be significantly larger than this
>>> heartbeat period.
>>>
>>> So adjust the heartbeat code to take the pre-emption timeout into
>>> account. When it reaches the final (high priority) period, it now
>>> ensures the delay before hitting reset is bigger than the pre-emption
>>> timeout.
>>>
>>> v2: Fix for selftests which adjust the heartbeat period manually.
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> ---
>>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>>>   1 file changed, 19 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
>>> b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>> index a3698f611f457..823a790a0e2ae 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>> @@ -22,9 +22,28 @@
>>>     static bool next_heartbeat(struct intel_engine_cs *engine)
>>>   {
>>> +    struct i915_request *rq;
>>>       long delay;
>>>         delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>>> +
>>> +    rq = engine->heartbeat.systole;
>>> +
>>> +    if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>>> +        delay == engine->defaults.heartbeat_interval_ms) {
>>
>> Maybe I forgot but what is the reason for the check against the 
>> default heartbeat interval?
> That's the 'v2: fix for selftests that manually adjust the heartbeat'. 
> If something (or someone) has explicitly set an override of the 
> heartbeat then it has to be assumed that they know what they are doing, 
> and if things don't work any more that's their problem. But if we don't 
> respect their override then they won't get the timings they expect and 
> the selftest will fail.

Isn't this a bit too strict for the non-selftest case? If the new 
concept is extending the last pulse to guarantee preemption, then I 
think we could allow tweaking of the heartbeat period. Like what if user 
wants 1s, or 10s instead of 2.5s - why would that need to break the 
improvement from this patch?

In what ways do the selftests fail? Are they trying to guess the time 
to reset based on the heartbeat period set? If so, perhaps add a helper 
to query it based on the last pulse extension.
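
Something along the lines of the below sketch is what I mean (the name
and placement are illustrative only, this is not a tested patch):

static long intel_engine_last_heartbeat_delay_ms(struct intel_engine_cs *engine)
{
	/*
	 * Mirror of the extension logic in next_heartbeat(). NB: this
	 * sketch deliberately ignores the barrier priority and default
	 * interval checks which gate the extension there.
	 */
	long delay = READ_ONCE(engine->props.heartbeat_interval_ms);
	long longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;

	longer = intel_clamp_heartbeat_interval_ms(engine, longer);

	return max(delay, longer);
}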

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-09-30  9:22       ` Tvrtko Ursulin
@ 2022-09-30 17:44         ` John Harrison
  2022-10-03  7:53           ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: John Harrison @ 2022-09-30 17:44 UTC
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 9/30/2022 02:22, Tvrtko Ursulin wrote:
> On 29/09/2022 17:21, John Harrison wrote:
>> On 9/29/2022 00:42, Tvrtko Ursulin wrote:
>>> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>
>>>> Compute workloads are inherently not pre-emptible for long periods on
>>>> current hardware. As a workaround for this, the pre-emption timeout
>>>> for compute capable engines was disabled. This is undesirable with GuC
>>>> submission as it prevents per engine reset of hung contexts. Hence the
>>>> next patch will re-enable the timeout but bumped up by an order of
>>>> magnitude.
>>>>
>>>> However, the heartbeat might not respect that. Depending upon current
>>>> activity, a pre-emption to the heartbeat pulse might not even be
>>>> attempted until the last heartbeat period. Which means that only one
>>>> period is granted for the pre-emption to occur. With the aforesaid
>>>> bump, the pre-emption timeout could be significantly larger than this
>>>> heartbeat period.
>>>>
>>>> So adjust the heartbeat code to take the pre-emption timeout into
>>>> account. When it reaches the final (high priority) period, it now
>>>> ensures the delay before hitting reset is bigger than the pre-emption
>>>> timeout.
>>>>
>>>> v2: Fix for selftests which adjust the heartbeat period manually.
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>> ---
>>>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 
>>>> +++++++++++++++++++
>>>>   1 file changed, 19 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
>>>> b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>> index a3698f611f457..823a790a0e2ae 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>> @@ -22,9 +22,28 @@
>>>>     static bool next_heartbeat(struct intel_engine_cs *engine)
>>>>   {
>>>> +    struct i915_request *rq;
>>>>       long delay;
>>>>         delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>>>> +
>>>> +    rq = engine->heartbeat.systole;
>>>> +
>>>> +    if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>>>> +        delay == engine->defaults.heartbeat_interval_ms) {
>>>
>>> Maybe I forgot but what is the reason for the check against the 
>>> default heartbeat interval?
>> That's the 'v2: fix for selftests that manually adjust the 
>> heartbeat'. If something (or someone) has explicitly set an override 
>> of the heartbeat then it has to be assumed that they know what they 
>> are doing, and if things don't work any more that's their problem. 
>> But if we don't respect their override then they won't get the 
>> timings they expect and the selftest will fail.
>
> Isn't this a bit too strict for the non-selftest case? If the new 
> concept is extending the last pulse to guarantee preemption, then I 
> think we could allow tweaking of the heartbeat period. Like what if 
> user wants 1s, or 10s instead of 2.5s - why would that need to break 
> the improvement from this patch?
Then the user is back to where they were before this patch.

>
> In what ways do the selftests fail? Are they trying to guess the time 
> to reset based on the heartbeat period set? If so, perhaps add a 
> helper to query it based on the last pulse extension.

I don't recall. It was six months ago when I was actually working on 
this. And right now I do not have the time to go back and re-run all 
the testing and rewrite a bunch of selftests with whole new helpers 
and algorithms and whatever else might be necessary to polish this to 
perfection. And in the meantime, all the existing issues are still 
present: there is no range checking on any of this stuff; it is very 
possible for a driver with default settings to break a legal workload 
because the heartbeat and pre-emption are fighting with each other; we 
don't even have per engine resets enabled; etc.

Maybe it could be even better with a follow up patch. Feel free to do 
that. But as it stands, this patch set significantly improves the 
situation without making anything worse.
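
For reference, since the quoted diff above is cut short, the remainder 
of the extension logic looks roughly like this (a sketch based on the 
commit description; the temporary variable and the exact factor of two 
are assumptions, not necessarily the final code):

	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
	    delay == engine->defaults.heartbeat_interval_ms) {
		long longer;

		/*
		 * The final attempt runs at barrier priority and a
		 * pre-emption might not even have been attempted before
		 * it. So stretch this last period far enough for a full
		 * pre-emption timeout to elapse before the reset fires.
		 */
		longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
		if (longer > delay)
			delay = longer;
	}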

John.

>
> Regards,
>
> Tvrtko


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-09-30 17:44         ` John Harrison
@ 2022-10-03  7:53           ` Tvrtko Ursulin
  2022-10-03 12:00             ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-10-03  7:53 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 30/09/2022 18:44, John Harrison wrote:
> On 9/30/2022 02:22, Tvrtko Ursulin wrote:
>> On 29/09/2022 17:21, John Harrison wrote:
>>> On 9/29/2022 00:42, Tvrtko Ursulin wrote:
>>>> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> Compute workloads are inherently not pre-emptible for long periods on
>>>>> current hardware. As a workaround for this, the pre-emption timeout
>>>>> for compute capable engines was disabled. This is undesirable with GuC
>>>>> submission as it prevents per engine reset of hung contexts. Hence the
>>>>> next patch will re-enable the timeout but bumped up by an order of
>>>>> magnitude.
>>>>>
>>>>> However, the heartbeat might not respect that. Depending upon current
>>>>> activity, a pre-emption to the heartbeat pulse might not even be
>>>>> attempted until the last heartbeat period. Which means that only one
>>>>> period is granted for the pre-emption to occur. With the aforesaid
>>>>> bump, the pre-emption timeout could be significantly larger than this
>>>>> heartbeat period.
>>>>>
>>>>> So adjust the heartbeat code to take the pre-emption timeout into
>>>>> account. When it reaches the final (high priority) period, it now
>>>>> ensures the delay before hitting reset is bigger than the pre-emption
>>>>> timeout.
>>>>>
>>>>> v2: Fix for selftests which adjust the heartbeat period manually.
>>>>>
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> ---
>>>>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>>>>>   1 file changed, 19 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>> index a3698f611f457..823a790a0e2ae 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>> @@ -22,9 +22,28 @@
>>>>>  
>>>>>  static bool next_heartbeat(struct intel_engine_cs *engine)
>>>>>  {
>>>>> +	struct i915_request *rq;
>>>>>  	long delay;
>>>>>  
>>>>>  	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>>>>> +
>>>>> +	rq = engine->heartbeat.systole;
>>>>> +
>>>>> +	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>>>>> +	    delay == engine->defaults.heartbeat_interval_ms) {
>>>>
>>>> Maybe I forgot but what is the reason for the check against the 
>>>> default heartbeat interval?
>>> That's the 'v2: fix for selftests that manually adjust the 
>>> heartbeat'. If something (or someone) has explicitly set an override 
>>> of the heartbeat then it has to be assumed that they know what they 
>>> are doing, and if things don't work any more that's their problem. 
>>> But if we don't respect their override then they won't get the 
>>> timings they expect and the selftest will fail.
>>
>> Isn't this a bit too strict for the non-selftest case? If the new 
>> concept is extending the last pulse to guarantee preemption, then I 
>> think we could allow tweaking of the heartbeat period. Like what if 
>> user wants 1s, or 10s instead of 2.5s - why would that need to break 
>> the improvement from this patch?
> Then the user is back to where they were before this patch.
> 
>>
>> In what ways do the selftests fail? Are they trying to guess the time 
>> to reset based on the heartbeat period set? If so, perhaps add a 
>> helper to query it based on the last pulse extension.
> 
> I don't recall. It was six months ago when I was actually working on 
> this. And right now I do not have the time to go back and re-run all 
> the testing and rewrite a bunch of selftests with whole new helpers 
> and algorithms and whatever else might be necessary to polish this to 
> perfection. And in the meantime, all the existing issues are still 
> present: there is no range checking on any of this stuff; it is very 
> possible for a driver with default settings to break a legal workload 
> because the heartbeat and pre-emption are fighting with each other; 
> we don't even have per engine resets enabled; etc.
> 
> Maybe it could be even better with a follow up patch. Feel free to do 
> that. But as it stands, this patch set significantly improves the 
> situation without making anything worse.

As we seem to be in agreement that the check against the default 
heartbeat is a hack whose only purpose is to work around assumptions 
made by the selftests, please file a Jira about removing it (this 
hack). Then the work can be assigned to someone to clean it up. With 
that done I would agree the series is indeed an improvement and it 
would have my ack.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-10-03  7:53           ` Tvrtko Ursulin
@ 2022-10-03 12:00             ` Tvrtko Ursulin
  2022-10-05 18:48               ` John Harrison
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-10-03 12:00 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 03/10/2022 08:53, Tvrtko Ursulin wrote:
> 
> On 30/09/2022 18:44, John Harrison wrote:
>> On 9/30/2022 02:22, Tvrtko Ursulin wrote:
>>> On 29/09/2022 17:21, John Harrison wrote:
>>>> On 9/29/2022 00:42, Tvrtko Ursulin wrote:
>>>>> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>
>>>>>> Compute workloads are inherently not pre-emptible for long periods on
>>>>>> current hardware. As a workaround for this, the pre-emption timeout
>>>>>> for compute capable engines was disabled. This is undesirable with 
>>>>>> GuC
>>>>>> submission as it prevents per engine reset of hung contexts. Hence 
>>>>>> the
>>>>>> next patch will re-enable the timeout but bumped up by an order of
>>>>>> magnitude.
>>>>>>
>>>>>> However, the heartbeat might not respect that. Depending upon current
>>>>>> activity, a pre-emption to the heartbeat pulse might not even be
>>>>>> attempted until the last heartbeat period. Which means that only one
>>>>>> period is granted for the pre-emption to occur. With the aforesaid
>>>>>> bump, the pre-emption timeout could be significantly larger than this
>>>>>> heartbeat period.
>>>>>>
>>>>>> So adjust the heartbeat code to take the pre-emption timeout into
>>>>>> account. When it reaches the final (high priority) period, it now
>>>>>> ensures the delay before hitting reset is bigger than the pre-emption
>>>>>> timeout.
>>>>>>
>>>>>> v2: Fix for selftests which adjust the heartbeat period manually.
>>>>>>
>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>>> ---
>>>>>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>>>>>>   1 file changed, 19 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>> index a3698f611f457..823a790a0e2ae 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>> @@ -22,9 +22,28 @@
>>>>>>  
>>>>>>  static bool next_heartbeat(struct intel_engine_cs *engine)
>>>>>>  {
>>>>>> +	struct i915_request *rq;
>>>>>>  	long delay;
>>>>>>  
>>>>>>  	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>>>>>> +
>>>>>> +	rq = engine->heartbeat.systole;
>>>>>> +
>>>>>> +	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>>>>>> +	    delay == engine->defaults.heartbeat_interval_ms) {
>>>>>
>>>>> Maybe I forgot but what is the reason for the check against the 
>>>>> default heartbeat interval?
>>>> That's the 'v2: fix for selftests that manually adjust the 
>>>> heartbeat'. If something (or someone) has explicitly set an override 
>>>> of the heartbeat then it has to be assumed that they know what they 
>>>> are doing, and if things don't work any more that's their problem. 
>>>> But if we don't respect their override then they won't get the 
>>>> timings they expect and the selftest will fail.
>>>
>>> Isn't this a bit too strict for the non-selftest case? If the new 
>>> concept is extending the last pulse to guarantee preemption, then I 
>>> think we could allow tweaking of the heartbeat period. Like what if 
>>> user wants 1s, or 10s instead of 2.5s - why would that need to break 
>>> the improvement from this patch?
>> Then the user is back to where they were before this patch.
>>
>>>
>>> In what ways do the selftests fail? Are they trying to guess the 
>>> time to reset based on the heartbeat period set? If so, perhaps add 
>>> a helper to query it based on the last pulse extension.
>>
>> I don't recall. It was six months ago when I was actually working on 
>> this. And right now I do not have the time to go back and re-run all 
>> the testing and rewrite a bunch of selftests with whole new helpers 
>> and algorithms and whatever else might be necessary to polish this to 
>> perfection. And in the meantime, all the existing issues are still 
>> present: there is no range checking on any of this stuff; it is very 
>> possible for a driver with default settings to break a legal workload 
>> because the heartbeat and pre-emption are fighting with each other; 
>> we don't even have per engine resets enabled; etc.
>>
>> Maybe it could be even better with a follow up patch. Feel free to do 
>> that. But as it stands, this patch set significantly improves the 
>> situation without making anything worse.
> 
> As we seem to be in agreement that the check against the default 
> heartbeat is a hack whose only purpose is to work around assumptions 
> made by the selftests, please file a Jira about removing it (this 
> hack). Then the work can be assigned to someone to clean it up. With 
> that done I would agree the series is indeed an improvement and it 
> would have my ack.

One more thing - put a comment in the code along the lines of 
"FIXME/HACK: Work around selftest assumptions by only extending the 
last heartbeat if the period is at the default value". Then the Jira 
can be associated with that comment.

Until that is resolved, it may also be worth emitting a drm_notice if 
the heartbeat is changed via sysfs, informing users that things will 
not work as expected if they fiddle with it. Whether as a blanket 
warning, or by first checking the heartbeat against the 3-4x preempt 
timeout ratio. That message should then go away once the follow-up 
work to fix the selftests is done. See what the other reviewers will 
think.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-10-03 12:00             ` Tvrtko Ursulin
@ 2022-10-05 18:48               ` John Harrison
  2022-10-06 10:03                 ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: John Harrison @ 2022-10-05 18:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 10/3/2022 05:00, Tvrtko Ursulin wrote:
> On 03/10/2022 08:53, Tvrtko Ursulin wrote:
>> On 30/09/2022 18:44, John Harrison wrote:
>>> On 9/30/2022 02:22, Tvrtko Ursulin wrote:
>>>> On 29/09/2022 17:21, John Harrison wrote:
>>>>> On 9/29/2022 00:42, Tvrtko Ursulin wrote:
>>>>>> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>
>>>>>>> Compute workloads are inherently not pre-emptible for long 
>>>>>>> periods on
>>>>>>> current hardware. As a workaround for this, the pre-emption timeout
>>>>>>> for compute capable engines was disabled. This is undesirable 
>>>>>>> with GuC
>>>>>>> submission as it prevents per engine reset of hung contexts. 
>>>>>>> Hence the
>>>>>>> next patch will re-enable the timeout but bumped up by an order of
>>>>>>> magnitude.
>>>>>>>
>>>>>>> However, the heartbeat might not respect that. Depending upon 
>>>>>>> current
>>>>>>> activity, a pre-emption to the heartbeat pulse might not even be
>>>>>>> attempted until the last heartbeat period. Which means that only 
>>>>>>> one
>>>>>>> period is granted for the pre-emption to occur. With the aforesaid
>>>>>>> bump, the pre-emption timeout could be significantly larger than 
>>>>>>> this
>>>>>>> heartbeat period.
>>>>>>>
>>>>>>> So adjust the heartbeat code to take the pre-emption timeout into
>>>>>>> account. When it reaches the final (high priority) period, it now
>>>>>>> ensures the delay before hitting reset is bigger than the 
>>>>>>> pre-emption
>>>>>>> timeout.
>>>>>>>
>>>>>>> v2: Fix for selftests which adjust the heartbeat period manually.
>>>>>>>
>>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>>>> ---
>>>>>>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>>>>>>>   1 file changed, 19 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>>> index a3698f611f457..823a790a0e2ae 100644
>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>>> @@ -22,9 +22,28 @@
>>>>>>>  
>>>>>>>  static bool next_heartbeat(struct intel_engine_cs *engine)
>>>>>>>  {
>>>>>>> +	struct i915_request *rq;
>>>>>>>  	long delay;
>>>>>>>  
>>>>>>>  	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>>>>>>> +
>>>>>>> +	rq = engine->heartbeat.systole;
>>>>>>> +
>>>>>>> +	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>>>>>>> +	    delay == engine->defaults.heartbeat_interval_ms) {
>>>>>>
>>>>>> Maybe I forgot but what is the reason for the check against the 
>>>>>> default heartbeat interval?
>>>>> That's the 'v2: fix for selftests that manually adjust the 
>>>>> heartbeat'. If something (or someone) has explicitly set an 
>>>>> override of the heartbeat then it has to be assumed that they know 
>>>>> what they are doing, and if things don't work any more that's 
>>>>> their problem. But if we don't respect their override then they 
>>>>> won't get the timings they expect and the selftest will fail.
>>>>
>>>> Isn't this a bit too strict for the non-selftest case? If the new 
>>>> concept is extending the last pulse to guarantee preemption, then I 
>>>> think we could allow tweaking of the heartbeat period. Like what if 
>>>> user wants 1s, or 10s instead of 2.5s - why would that need to 
>>>> break the improvement from this patch?
>>> Then the user is back to where they were before this patch.
>>>
>>>>
>>>> In what ways do the selftests fail? Are they trying to guess the 
>>>> time to reset based on the heartbeat period set? If so, perhaps 
>>>> add a helper to query it based on the last pulse extension.
>>>
>>> I don't recall. It was six months ago when I was actually working 
>>> on this. And right now I do not have the time to go back and re-run 
>>> all the testing and rewrite a bunch of selftests with whole new 
>>> helpers and algorithms and whatever else might be necessary to 
>>> polish this to perfection. And in the meantime, all the existing 
>>> issues are still present: there is no range checking on any of this 
>>> stuff; it is very possible for a driver with default settings to 
>>> break a legal workload because the heartbeat and pre-emption are 
>>> fighting with each other; we don't even have per engine resets 
>>> enabled; etc.
>>>
>>> Maybe it could be even better with a follow up patch. Feel free to 
>>> do that. But as it stands, this patch set significantly improves the 
>>> situation without making anything worse.
>>
>> As we seem to be in agreement that the check against the default 
>> heartbeat is a hack whose only purpose is to work around assumptions 
>> made by the selftests, please file a Jira about removing it (this 
>> hack). Then the work can be assigned to someone to clean it up. With 
>> that done I would agree the series is indeed an improvement and it 
>> would have my ack.
VLK-39595

>
> One more thing - put a comment in the code along the lines of 
> "FIXME/HACK: Work around selftest assumptions by only extending the 
> last heartbeat if the period is at the default value". Then the Jira 
> can be associated with that comment.
>
> Until that is resolved, it may also be worth emitting a drm_notice if 
> the heartbeat is changed via sysfs, informing users that things will 
> not work as expected if they fiddle with it. Whether as a blanket 
> warning, or by first checking the heartbeat against the 3-4x preempt 
> timeout ratio. That message should then go away once the follow-up 
> work to fix the selftests is done. See what the other reviewers will 
> think.
>
What should the drm_notice say? How can you describe to an innocent end 
user what the symptoms might be in a single, concise line rather than 
the huge email thread that it took to explain to you, an experienced 
i915 kernel developer?

Is there a single end user out there who actually uses the sysfs 
interface for tuning these parameters? AFAIK, the usage is 99.999% 
IGT, the rest is internal developers debugging problems. Maybe someone 
somewhere has noticed them because they have a compute task that takes 
tens of seconds to complete. But the official guidance for compute 
users is to simply disable the heartbeat completely. We never tell 
anyone to try to tune the period or the pre-emption timeout to their 
specific application. That's just too complicated and unpredictable. 
We need defaults that work for the general case, and for compute that 
means the heartbeat is disabled. Manual tuning just isn't useful. 
Unless it's to reduce the stupidly high pre-emption timeout to get a 
more responsive desktop because they never actually need to run long 
compute tasks. And in that case, you don't need an extended last 
period because your pre-emption timeout is already massively smaller 
than the heartbeat period.
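
(For example, with the default 2.5s heartbeat interval, a pre-emption 
timeout cut down to, say, a few hundred milliseconds fits comfortably 
inside a single period, so the extension logic never needs to kick in.)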

John.


> Regards,
>
> Tvrtko


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Intel-gfx] [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-10-05 18:48               ` John Harrison
@ 2022-10-06 10:03                 ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2022-10-06 10:03 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 05/10/2022 19:48, John Harrison wrote:
> On 10/3/2022 05:00, Tvrtko Ursulin wrote:
>> On 03/10/2022 08:53, Tvrtko Ursulin wrote:
>>> On 30/09/2022 18:44, John Harrison wrote:
>>>> On 9/30/2022 02:22, Tvrtko Ursulin wrote:
>>>>> On 29/09/2022 17:21, John Harrison wrote:
>>>>>> On 9/29/2022 00:42, Tvrtko Ursulin wrote:
>>>>>>> On 29/09/2022 03:18, John.C.Harrison@Intel.com wrote:
>>>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>>
>>>>>>>> Compute workloads are inherently not pre-emptible for long 
>>>>>>>> periods on
>>>>>>>> current hardware. As a workaround for this, the pre-emption timeout
>>>>>>>> for compute capable engines was disabled. This is undesirable 
>>>>>>>> with GuC
>>>>>>>> submission as it prevents per engine reset of hung contexts. 
>>>>>>>> Hence the
>>>>>>>> next patch will re-enable the timeout but bumped up by an order of
>>>>>>>> magnitude.
>>>>>>>>
>>>>>>>> However, the heartbeat might not respect that. Depending upon 
>>>>>>>> current
>>>>>>>> activity, a pre-emption to the heartbeat pulse might not even be
>>>>>>>> attempted until the last heartbeat period. Which means that only 
>>>>>>>> one
>>>>>>>> period is granted for the pre-emption to occur. With the aforesaid
>>>>>>>> bump, the pre-emption timeout could be significantly larger than 
>>>>>>>> this
>>>>>>>> heartbeat period.
>>>>>>>>
>>>>>>>> So adjust the heartbeat code to take the pre-emption timeout into
>>>>>>>> account. When it reaches the final (high priority) period, it now
>>>>>>>> ensures the delay before hitting reset is bigger than the 
>>>>>>>> pre-emption
>>>>>>>> timeout.
>>>>>>>>
>>>>>>>> v2: Fix for selftests which adjust the heartbeat period manually.
>>>>>>>>
>>>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>>>>> ---
>>>>>>>>   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 19 +++++++++++++++++++
>>>>>>>>   1 file changed, 19 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>>>> index a3698f611f457..823a790a0e2ae 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
>>>>>>>> @@ -22,9 +22,28 @@
>>>>>>>>  
>>>>>>>>  static bool next_heartbeat(struct intel_engine_cs *engine)
>>>>>>>>  {
>>>>>>>> +	struct i915_request *rq;
>>>>>>>>  	long delay;
>>>>>>>>  
>>>>>>>>  	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
>>>>>>>> +
>>>>>>>> +	rq = engine->heartbeat.systole;
>>>>>>>> +
>>>>>>>> +	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
>>>>>>>> +	    delay == engine->defaults.heartbeat_interval_ms) {
>>>>>>>
>>>>>>> Maybe I forgot but what is the reason for the check against the 
>>>>>>> default heartbeat interval?
>>>>>> That's the 'v2: fix for selftests that manually adjust the 
>>>>>> heartbeat'. If something (or someone) has explicitly set an 
>>>>>> override of the heartbeat then it has to be assumed that they know 
>>>>>> what they are doing, and if things don't work any more that's 
>>>>>> their problem. But if we don't respect their override then they 
>>>>>> won't get the timings they expect and the selftest will fail.
>>>>>
>>>>> Isn't this a bit too strict for the non-selftest case? If the new 
>>>>> concept is extending the last pulse to guarantee preemption, then I 
>>>>> think we could allow tweaking of the heartbeat period. Like what if 
>>>>> user wants 1s, or 10s instead of 2.5s - why would that need to 
>>>>> break the improvement from this patch?
>>>> Then the user is back to where they were before this patch.
>>>>
>>>>>
>>>>> In what ways do the selftests fail? Are they trying to guess the 
>>>>> time to reset based on the heartbeat period set? If so, perhaps 
>>>>> add a helper to query it based on the last pulse extension.
>>>>
>>>> I don't recall. It was six months ago when I was actually working 
>>>> on this. And right now I do not have the time to go back and 
>>>> re-run all the testing and rewrite a bunch of selftests with whole 
>>>> new helpers and algorithms and whatever else might be necessary to 
>>>> polish this to perfection. And in the meantime, all the existing 
>>>> issues are still present: there is no range checking on any of 
>>>> this stuff; it is very possible for a driver with default settings 
>>>> to break a legal workload because the heartbeat and pre-emption 
>>>> are fighting with each other; we don't even have per engine resets 
>>>> enabled; etc.
>>>>
>>>> Maybe it could be even better with a follow up patch. Feel free to 
>>>> do that. But as it stands, this patch set significantly improves the 
>>>> situation without making anything worse.
>>>
>>> As we seem to be in agreement that the check against the default 
>>> heartbeat is a hack whose only purpose is to work around 
>>> assumptions made by the selftests, please file a Jira about 
>>> removing it (this hack). Then the work can be assigned to someone 
>>> to clean it up. With that done I would agree the series is indeed 
>>> an improvement and it would have my ack.
> VLK-39595
> 
>>
>> One more thing - put a comment in the code along the lines of 
>> "FIXME/HACK: Work around selftest assumptions by only extending the 
>> last heartbeat if the period is at the default value". Then the Jira 
>> can be associated with that comment.
>>
>> Until that is resolved, it may also be worth emitting a drm_notice 
>> if the heartbeat is changed via sysfs, informing users that things 
>> will not work as expected if they fiddle with it. Whether as a 
>> blanket warning, or by first checking the heartbeat against the 3-4x 
>> preempt timeout ratio. That message should then go away once the 
>> follow-up work to fix the selftests is done. See what the other 
>> reviewers will think.
>>
> What should the drm_notice say? How can you describe to an innocent end 
> user what the symptoms might be in a single, concise line rather than 
> the huge email thread that it took to explain to you, an experienced 
> i915 kernel developer?

I think what would be useful is to have something like:

In the sysfs store handler (or was it a setter helper?):

/* FIXME: Remove together with the equally marked hack in next_heartbeat. */
if (heartbeat != engine->defaults.heartbeat_interval_ms &&
    heartbeat < 2 * engine->props.preempt_timeout_ms) {
	if (intel_engine_uses_guc(engine))
		drm_notice(&engine->i915->drm,
			   "%s heartbeat interval adjusted to a non-default value which may downgrade individual engine resets to full GPU resets!\n",
			   engine->name);
	else
		drm_notice(&engine->i915->drm,
			   "%s heartbeat interval adjusted to a non-default value which may cause engine resets to target innocent contexts!\n",
			   engine->name);
}

Does that sound correct? Double the preempt timeout, or maybe 1.5x, not 
sure.

I do accept this is something we could have had from the start (of 
heartbeats and the preempt timeout), but now that this patch fixes the 
logic to extend the last pulse, apart from the selftests hack which 
limits it again, I think it is reasonable to add it within the scope 
of this series.

> Is there a single end user out there who actually uses the sysfs 
> interface for tuning these parameters? AFAIK, the usage is 99.999% 
> IGT, the rest is internal developers debugging problems. Maybe 
> someone somewhere has noticed them because they have a compute task 
> that takes tens of seconds to complete. But the official guidance for 
> compute users is to simply disable the heartbeat completely. We never 
> tell anyone to try to tune the period or the pre-emption timeout to 
> their specific application. That's just too complicated and 
> unpredictable. We need defaults that work for the general case, and 
> for compute that means the heartbeat is disabled. Manual tuning just 
> isn't useful. Unless it's to reduce the stupidly high pre-emption 
> timeout to get a more responsive desktop because they never actually 
> need to run long compute tasks. And in that case, you don't need an 
> extended last period because your pre-emption timeout is already 
> massively smaller than the heartbeat period.

I don't know if there are users, but it's out there and it's ABI, so I 
would err on the side of caution. It has happened in the past that we 
were discussing how something was maybe not important, and then, with 
uncanny coincidence, users immediately appeared complaining about that 
very same thing.

Again, I do agree the patch is not making anything worse - on the 
contrary, it is improving things in the default configuration. And as 
I think the principle of last pulse extension is the right approach, 
it feels like the selftests are too weak a reason to give up on it so 
easily.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-10-06 10:03 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-29  2:18 [PATCH v4 0/4] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
2022-09-29  2:18 ` [PATCH v4 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
2022-09-29  7:39   ` [Intel-gfx] " Tvrtko Ursulin
2022-09-29  2:18 ` [PATCH v4 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines John.C.Harrison
2022-09-29  2:18 ` [PATCH v4 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts John.C.Harrison
2022-09-29  7:42   ` [Intel-gfx] " Tvrtko Ursulin
2022-09-29 16:21     ` John Harrison
2022-09-30  9:22       ` Tvrtko Ursulin
2022-09-30 17:44         ` John Harrison
2022-10-03  7:53           ` Tvrtko Ursulin
2022-10-03 12:00             ` Tvrtko Ursulin
2022-10-05 18:48               ` John Harrison
2022-10-06 10:03                 ` Tvrtko Ursulin
2022-09-29  2:18 ` [PATCH v4 4/4] drm/i915: Improve long running compute w/a for GuC submission John.C.Harrison
2022-09-29  7:44   ` [Intel-gfx] " Tvrtko Ursulin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).