All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Improve anti-pre-emption w/a for compute workloads
@ 2022-03-03 22:37 ` John.C.Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Compute workloads are inherently not pre-emptible on current hardware.
Thus the pre-emption timeout was disabled as a workaround to prevent
unwanted resets. Instead, the hang detection was left to the heartbeat
and its (longer) timeout. This is undesirable with GuC submission as
the heartbeat is a full GT reset rather than a per engine reset and so
is much more destructive. Instead, just bump the pre-emption timeout
to a big value. Also, update the heartbeat to allow such a long
pre-emption delay in the final heartbeat period.

v2: Add clamping helpers.
v3: Remove long timeout algorithm and replace with hard coded value
(review feedback from Tvrtko). Also, fix execlist selftest failure and
fix bug in compute enabling patch related to pre-emption timeouts.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


John Harrison (4):
  drm/i915/guc: Limit scheduling properties to avoid overflow
  drm/i915: Fix compute pre-emption w/a to apply to compute engines
  drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  drm/i915: Improve long running OCL w/a for GuC submission

 drivers/gpu/drm/i915/Kconfig.profile          | 26 +++++-
 drivers/gpu/drm/i915/gt/intel_engine.h        |  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 92 +++++++++++++++++--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 18 ++++
 drivers/gpu/drm/i915/gt/sysfs_engines.c       | 25 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  9 ++
 6 files changed, 153 insertions(+), 23 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Intel-gfx] [PATCH v3 0/4] Improve anti-pre-emption w/a for compute workloads
@ 2022-03-03 22:37 ` John.C.Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Compute workloads are inherently not pre-emptible on current hardware.
Thus the pre-emption timeout was disabled as a workaround to prevent
unwanted resets. Instead, the hang detection was left to the heartbeat
and its (longer) timeout. This is undesirable with GuC submission as
the heartbeat is a full GT reset rather than a per engine reset and so
is much more destructive. Instead, just bump the pre-emption timeout
to a big value. Also, update the heartbeat to allow such a long
pre-emption delay in the final heartbeat period.

v2: Add clamping helpers.
v3: Remove long timeout algorithm and replace with hard coded value
(review feedback from Tvrtko). Also, fix execlist selftest failure and
fix bug in compute enabling patch related to pre-emption timeouts.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


John Harrison (4):
  drm/i915/guc: Limit scheduling properties to avoid overflow
  drm/i915: Fix compute pre-emption w/a to apply to compute engines
  drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  drm/i915: Improve long running OCL w/a for GuC submission

 drivers/gpu/drm/i915/Kconfig.profile          | 26 +++++-
 drivers/gpu/drm/i915/gt/intel_engine.h        |  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 92 +++++++++++++++++--
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 18 ++++
 drivers/gpu/drm/i915/gt/sysfs_engines.c       | 25 +++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  9 ++
 6 files changed, 153 insertions(+), 23 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
@ 2022-03-03 22:37   ` John.C.Harrison
  -1 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Daniele Ceraolo Spurio, John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.

v2: Add helper functins for clamping (review feedback from Tvrtko).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
---
 drivers/gpu/drm/i915/gt/intel_engine.h      |  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/sysfs_engines.c     | 25 +++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
 4 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 1c0ab05c3c40..d7044c4e526e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine)
 	return engine->hung_ce;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7447411a5b26..22e70e4e007c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
 	}
 
+	/* Cap properties according to any system limits */
+#define CLAMP_PROP(field) \
+	do { \
+		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+		if (clamp != engine->props.field) { \
+			drm_notice(&engine->i915->drm, \
+				   "Warning, clamping %s to %lld to prevent overflow\n", \
+				   #field, clamp); \
+			engine->props.field = clamp; \
+		} \
+	} while (0)
+
+	CLAMP_PROP(heartbeat_interval_ms);
+	CLAMP_PROP(max_busywait_duration_ns);
+	CLAMP_PROP(preempt_timeout_ms);
+	CLAMP_PROP(stop_timeout_ms);
+	CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
 	engine->defaults = engine->props; /* never to change again */
 
 	engine->context_size = intel_engine_context_size(gt, engine->class);
@@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	return 0;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value)
+{
+	value = min(value, jiffies_to_nsecs(2));
+
+	return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
 static void __setup_engine_capabilities(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c
index 967031056202..f2d9858d827c 100644
--- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
+++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
@@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
 	       const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_nsecs(2))
+	clamped = intel_clamp_max_busywait_duration_ns(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
@@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_timeslice_duration_ms(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
@@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
 	   const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_stop_timeout_ms(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
@@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
 		      const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long timeout;
+	unsigned long long timeout, clamped;
 	int err;
 
 	/*
@@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_preempt_timeout_ms(engine, timeout);
+	if (timeout != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
@@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long delay;
+	unsigned long long delay, clamped;
 	int err;
 
 	/*
@@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_heartbeat_interval_ms(engine, delay);
+	if (delay != clamped)
 		return -EINVAL;
 
 	err = intel_engine_set_heartbeat(engine, delay);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 4b300b6cc0f9..a2d574f2fdd5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -262,6 +262,15 @@ struct guc_lrc_desc {
 
 #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
 
+/*
+ * GuC converts the timeout to clock ticks internally. Different platforms have
+ * different GuC clocks. Thus, the maximum value before overflow is platform
+ * dependent. Current worst case scenario is about 110s. So, limit to 100s to be
+ * safe.
+ */
+#define GUC_POLICY_MAX_EXEC_QUANTUM_MS		(100 * 1000)
+#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS	(100 * 1000)
+
 struct guc_policies {
 	u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
 	/* In micro seconds. How much time to allow before DPC processing is
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Intel-gfx] [PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
@ 2022-03-03 22:37   ` John.C.Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.

v2: Add helper functins for clamping (review feedback from Tvrtko).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
---
 drivers/gpu/drm/i915/gt/intel_engine.h      |  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/sysfs_engines.c     | 25 +++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
 4 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 1c0ab05c3c40..d7044c4e526e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine)
 	return engine->hung_ce;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7447411a5b26..22e70e4e007c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
 	}
 
+	/* Cap properties according to any system limits */
+#define CLAMP_PROP(field) \
+	do { \
+		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+		if (clamp != engine->props.field) { \
+			drm_notice(&engine->i915->drm, \
+				   "Warning, clamping %s to %lld to prevent overflow\n", \
+				   #field, clamp); \
+			engine->props.field = clamp; \
+		} \
+	} while (0)
+
+	CLAMP_PROP(heartbeat_interval_ms);
+	CLAMP_PROP(max_busywait_duration_ns);
+	CLAMP_PROP(preempt_timeout_ms);
+	CLAMP_PROP(stop_timeout_ms);
+	CLAMP_PROP(timeslice_duration_ms);
+
+#undef CLAMP_PROP
+
 	engine->defaults = engine->props; /* never to change again */
 
 	engine->context_size = intel_engine_context_size(gt, engine->class);
@@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	return 0;
 }
 
+u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value)
+{
+	value = min(value, jiffies_to_nsecs(2));
+
+	return value;
+}
+
+u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
+{
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
+u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value)
+{
+	/*
+	 * NB: The GuC API only supports 32bit values. However, the limit is further
+	 * reduced due to internal calculations which would otherwise overflow.
+	 */
+	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
+		value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
+
+	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
+
+	return value;
+}
+
 static void __setup_engine_capabilities(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c
index 967031056202..f2d9858d827c 100644
--- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
+++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
@@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
 	       const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_nsecs(2))
+	clamped = intel_clamp_max_busywait_duration_ns(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
@@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_timeslice_duration_ms(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
@@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
 	   const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long duration;
+	unsigned long long duration, clamped;
 	int err;
 
 	/*
@@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_stop_timeout_ms(engine, duration);
+	if (duration != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
@@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
 		      const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long timeout;
+	unsigned long long timeout, clamped;
 	int err;
 
 	/*
@@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_preempt_timeout_ms(engine, timeout);
+	if (timeout != clamped)
 		return -EINVAL;
 
 	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
@@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct intel_engine_cs *engine = kobj_to_engine(kobj);
-	unsigned long long delay;
+	unsigned long long delay, clamped;
 	int err;
 
 	/*
@@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
 	if (err)
 		return err;
 
-	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
+	clamped = intel_clamp_heartbeat_interval_ms(engine, delay);
+	if (delay != clamped)
 		return -EINVAL;
 
 	err = intel_engine_set_heartbeat(engine, delay);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 4b300b6cc0f9..a2d574f2fdd5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -262,6 +262,15 @@ struct guc_lrc_desc {
 
 #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
 
+/*
+ * GuC converts the timeout to clock ticks internally. Different platforms have
+ * different GuC clocks. Thus, the maximum value before overflow is platform
+ * dependent. Current worst case scenario is about 110s. So, limit to 100s to be
+ * safe.
+ */
+#define GUC_POLICY_MAX_EXEC_QUANTUM_MS		(100 * 1000)
+#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS	(100 * 1000)
+
 struct guc_policies {
 	u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
 	/* In micro seconds. How much time to allow before DPC processing is
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
@ 2022-03-03 22:37   ` John.C.Harrison
  -1 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX
  Cc: Daniel Vetter, DRI-Devel, Chris Wilson, Stuart Summers,
	Daniele Ceraolo Spurio, Jason Ekstrand, Akeem G Abodunrin,
	Matthew Brost, Matthew Auld, Thomas Hellström, Jani Nikula,
	Lucas De Marchi, Aravind Iddamsetty, Tvrtko Ursulin,
	Michał Winiarski, Tvrtko Ursulin, Tejas Upadhyay,
	Umesh Nerlige Ramappa, John Harrison

From: John Harrison <John.C.Harrison@Intel.com>

An earlier patch added support for compute engines. However, it missed
enabling the anti-pre-emption w/a for the new engine class. So move
the 'compute capable' flag earlier and use it for the pre-emption w/a
test.

Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 22e70e4e007c..4185c7338581 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->logical_mask = BIT(logical_instance);
 	__sprint_engine_name(engine);
 
+	/* features common between engines sharing EUs */
+	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
+		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
+		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
+	}
+
 	engine->props.heartbeat_interval_ms =
 		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
 	engine->props.max_busywait_duration_ns =
@@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
 	/* Override to uninterruptible for OpenCL workloads. */
-	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
+	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
 		engine->props.preempt_timeout_ms = 0;
 
-	/* features common between engines sharing EUs */
-	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
-		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
-		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
-	}
-
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
 	do { \
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Intel-gfx] [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
@ 2022-03-03 22:37   ` John.C.Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX
  Cc: Daniel Vetter, DRI-Devel, Chris Wilson, Matthew Auld,
	Thomas Hellström, Jani Nikula, Lucas De Marchi,
	Michał Winiarski, Tejas Upadhyay

From: John Harrison <John.C.Harrison@Intel.com>

An earlier patch added support for compute engines. However, it missed
enabling the anti-pre-emption w/a for the new engine class. So move
the 'compute capable' flag earlier and use it for the pre-emption w/a
test.

Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 22e70e4e007c..4185c7338581 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->logical_mask = BIT(logical_instance);
 	__sprint_engine_name(engine);
 
+	/* features common between engines sharing EUs */
+	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
+		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
+		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
+	}
+
 	engine->props.heartbeat_interval_ms =
 		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
 	engine->props.max_busywait_duration_ns =
@@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
 	/* Override to uninterruptible for OpenCL workloads. */
-	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
+	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
 		engine->props.preempt_timeout_ms = 0;
 
-	/* features common between engines sharing EUs */
-	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
-		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
-		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
-	}
-
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
 	do { \
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
@ 2022-03-03 22:37   ` John.C.Harrison
  -1 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Compute workloads are inherently not pre-emptible for long periods on
current hardware. As a workaround for this, the pre-emption timeout
for compute capable engines was disabled. This is undesirable with GuC
submission as it prevents per engine reset of hung contexts. Hence the
next patch will re-enable the timeout but bumped up by an order of
magnitude.

However, the heartbeat might not respect that. Depending upon current
activity, a pre-emption to the heartbeat pulse might not even be
attempted until the last heartbeat period. Which means that only one
period is granted for the pre-emption to occur. With the aforesaid
bump, the pre-emption timeout could be significantly larger than this
heartbeat period.

So adjust the heartbeat code to take the pre-emption timeout into
account. When it reaches the final (high priority) period, it now
ensures the delay before hitting reset is bigger than the pre-emption
timeout.

v2: Fix for selftests which adjust the heartbeat period manually.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c   | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a3698f611f45..0dc53def8e42 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -22,9 +22,27 @@
 
 static bool next_heartbeat(struct intel_engine_cs *engine)
 {
+	struct i915_request *rq;
 	long delay;
 
 	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
+
+	rq = engine->heartbeat.systole;
+
+	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
+	    delay == engine->defaults.heartbeat_interval_ms) {
+		long longer;
+
+		/*
+		 * The final try is at the highest priority possible. Up until now
+		 * a pre-emption might not even have been attempted. So make sure
+		 * this last attempt allows enough time for a pre-emption to occur.
+		 */
+		longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
+		if (longer > delay)
+			delay = longer;
+	}
+
 	if (!delay)
 		return false;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Intel-gfx] [PATCH v3 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts
@ 2022-03-03 22:37   ` John.C.Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

Compute workloads are inherently not pre-emptible for long periods on
current hardware. As a workaround for this, the pre-emption timeout
for compute capable engines was disabled. This is undesirable with GuC
submission as it prevents per engine reset of hung contexts. Hence the
next patch will re-enable the timeout but bumped up by an order of
magnitude.

However, the heartbeat might not respect that. Depending upon current
activity, a pre-emption to the heartbeat pulse might not even be
attempted until the last heartbeat period. Which means that only one
period is granted for the pre-emption to occur. With the aforesaid
bump, the pre-emption timeout could be significantly larger than this
heartbeat period.

So adjust the heartbeat code to take the pre-emption timeout into
account. When it reaches the final (high priority) period, it now
ensures the delay before hitting reset is bigger than the pre-emption
timeout.

v2: Fix for selftests which adjust the heartbeat period manually.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c   | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a3698f611f45..0dc53def8e42 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -22,9 +22,27 @@
 
 static bool next_heartbeat(struct intel_engine_cs *engine)
 {
+	struct i915_request *rq;
 	long delay;
 
 	delay = READ_ONCE(engine->props.heartbeat_interval_ms);
+
+	rq = engine->heartbeat.systole;
+
+	if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER &&
+	    delay == engine->defaults.heartbeat_interval_ms) {
+		long longer;
+
+		/*
+		 * The final try is at the highest priority possible. Up until now
+		 * a pre-emption might not even have been attempted. So make sure
+		 * this last attempt allows enough time for a pre-emption to occur.
+		 */
+		longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2;
+		if (longer > delay)
+			delay = longer;
+	}
+
 	if (!delay)
 		return false;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
@ 2022-03-03 22:37   ` John.C.Harrison
  -1 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Michal Mrozek, Daniele Ceraolo Spurio, John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

A workaround was added to the driver to allow OpenCL workloads to run
'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbound as the heartbeat will kick in eventually
and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode,
the pre-emption timeout is how GuC detects hung contexts and triggers
a per engine reset. Thus, disabling the timeout means also losing all
per engine reset ability. A full GT reset will still occur when the
heartbeat finally expires, but that is a much more destructive and
undesirable mechanism.

The purpose of the workaround is actually to give OpenCL tasks longer
to reach a pre-emption point after a pre-emption request has been
issued. This is necessary because Gen12 does not support mid-thread
pre-emption and OpenCL can have long running threads.

So, rather than disabling the timeout completely, just set it to a
'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value
instead of determining it algorithmically. So make it an extra CONFIG
definition. Also, remove the execlist centric comment from the
existing pre-emption timeout CONFIG option given that it applies to
more than just execlists.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c200..7cc38d25ee5c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
 	default 640 # milliseconds
 	help
 	  How long to wait (in milliseconds) for a preemption event to occur
-	  when submitting a new context via execlists. If the current context
-	  does not hit an arbitration point and yield to HW before the timer
-	  expires, the HW will be reset to allow the more important context
-	  to execute.
+	  when submitting a new context. If the current context does not hit
+	  an arbitration point and yield to HW before the timer expires, the
+	  HW will be reset to allow the more important context to execute.
+
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+	  May be 0 to disable the timeout.
+
+	  The compiled in default may get overridden at driver probe time on
+	  certain platforms and certain engines which will be reflected in the
+	  sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+	int "Preempt timeout for compute engines (ms, jiffy granularity)"
+	default 7500 # milliseconds
+	help
+	  How long to wait (in milliseconds) for a preemption event to occur
+	  when submitting a new context to a compute capable engine. If the
+	  current context does not hit an arbitration point and yield to HW
+	  before the timer expires, the HW will be reset to allow the more
+	  important context to execute.
 
 	  This is adjustable via
 	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4185c7338581..cc0954ad836a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-	/* Override to uninterruptible for OpenCL workloads. */
+	/*
+	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
+	 * some OpenCL workloads run quite long threads. That means they get
+	 * reset due to not pre-empting in a timely manner. So, bump the
+	 * pre-emption timeout value to be much higher for compute engines.
+	 */
 	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
-		engine->props.preempt_timeout_ms = 0;
+		engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
 
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
@ 2022-03-03 22:37   ` John.C.Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John.C.Harrison @ 2022-03-03 22:37 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Michal Mrozek, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

A workaround was added to the driver to allow OpenCL workloads to run
'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbound as the heartbeat will kick in eventually
and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode,
the pre-emption timeout is how GuC detects hung contexts and triggers
a per engine reset. Thus, disabling the timeout means also losing all
per engine reset ability. A full GT reset will still occur when the
heartbeat finally expires, but that is a much more destructive and
undesirable mechanism.

The purpose of the workaround is actually to give OpenCL tasks longer
to reach a pre-emption point after a pre-emption request has been
issued. This is necessary because Gen12 does not support mid-thread
pre-emption and OpenCL can have long running threads.

So, rather than disabling the timeout completely, just set it to a
'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value
instead of determining it algorithmically. So make it an extra CONFIG
definition. Also, remove the execlist centric comment from the
existing pre-emption timeout CONFIG option given that it applies to
more than just execlists.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c200..7cc38d25ee5c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
 	default 640 # milliseconds
 	help
 	  How long to wait (in milliseconds) for a preemption event to occur
-	  when submitting a new context via execlists. If the current context
-	  does not hit an arbitration point and yield to HW before the timer
-	  expires, the HW will be reset to allow the more important context
-	  to execute.
+	  when submitting a new context. If the current context does not hit
+	  an arbitration point and yield to HW before the timer expires, the
+	  HW will be reset to allow the more important context to execute.
+
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+	  May be 0 to disable the timeout.
+
+	  The compiled in default may get overridden at driver probe time on
+	  certain platforms and certain engines which will be reflected in the
+	  sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+	int "Preempt timeout for compute engines (ms, jiffy granularity)"
+	default 7500 # milliseconds
+	help
+	  How long to wait (in milliseconds) for a preemption event to occur
+	  when submitting a new context to a compute capable engine. If the
+	  current context does not hit an arbitration point and yield to HW
+	  before the timer expires, the HW will be reset to allow the more
+	  important context to execute.
 
 	  This is adjustable via
 	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4185c7338581..cc0954ad836a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-	/* Override to uninterruptible for OpenCL workloads. */
+	/*
+	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
+	 * some OpenCL workloads run quite long threads. That means they get
+	 * reset due to not pre-empting in a timely manner. So, bump the
+	 * pre-emption timeout value to be much higher for compute engines.
+	 */
 	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
-		engine->props.preempt_timeout_ms = 0;
+		engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
 
 	/* Cap properties according to any system limits */
 #define CLAMP_PROP(field) \
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
  2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
@ 2022-03-03 23:16     ` Matt Roper
  -1 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2022-03-03 23:16 UTC (permalink / raw)
  To: John.C.Harrison
  Cc: Lucas De Marchi, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, Jason Ekstrand, Tvrtko Ursulin,
	Jani Nikula, Daniel Vetter, Intel-GFX, DRI-Devel, Chris Wilson,
	Stuart Summers, Daniele Ceraolo Spurio, Matthew Auld,
	Aravind Iddamsetty, Matthew Brost, Akeem G Abodunrin,
	Tejas Upadhyay, Umesh Nerlige Ramappa

On Thu, Mar 03, 2022 at 02:37:35PM -0800, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> An earlier patch added support for compute engines. However, it missed
> enabling the anti-pre-emption w/a for the new engine class. So move
> the 'compute capable' flag earlier and use it for the pre-emption w/a
> test.
> 
> Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: "Michał Winiarski" <michal.winiarski@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> Cc: Stuart Summers <stuart.summers@intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Ramalingam C <ramalingam.c@intel.com>
> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 22e70e4e007c..4185c7338581 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>  	engine->logical_mask = BIT(logical_instance);
>  	__sprint_engine_name(engine);
>  
> +	/* features common between engines sharing EUs */
> +	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
> +		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
> +		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
> +	}
> +
>  	engine->props.heartbeat_interval_ms =
>  		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
>  	engine->props.max_busywait_duration_ns =
> @@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>  		CONFIG_DRM_I915_TIMESLICE_DURATION;
>  
>  	/* Override to uninterruptible for OpenCL workloads. */
> -	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
> +	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
>  		engine->props.preempt_timeout_ms = 0;
>  
> -	/* features common between engines sharing EUs */
> -	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
> -		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
> -		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
> -	}
> -
>  	/* Cap properties according to any system limits */
>  #define CLAMP_PROP(field) \
>  	do { \
> -- 
> 2.25.1
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines
@ 2022-03-03 23:16     ` Matt Roper
  0 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2022-03-03 23:16 UTC (permalink / raw)
  To: John.C.Harrison
  Cc: Lucas De Marchi, Thomas Hellström, Michał Winiarski,
	Jani Nikula, Daniel Vetter, Intel-GFX, DRI-Devel, Chris Wilson,
	Matthew Auld, Tejas Upadhyay

On Thu, Mar 03, 2022 at 02:37:35PM -0800, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> An earlier patch added support for compute engines. However, it missed
> enabling the anti-pre-emption w/a for the new engine class. So move
> the 'compute capable' flag earlier and use it for the pre-emption w/a
> test.
> 
> Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: "Michał Winiarski" <michal.winiarski@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> Cc: Stuart Summers <stuart.summers@intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Ramalingam C <ramalingam.c@intel.com>
> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 22e70e4e007c..4185c7338581 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>  	engine->logical_mask = BIT(logical_instance);
>  	__sprint_engine_name(engine);
>  
> +	/* features common between engines sharing EUs */
> +	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
> +		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
> +		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
> +	}
> +
>  	engine->props.heartbeat_interval_ms =
>  		CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
>  	engine->props.max_busywait_duration_ns =
> @@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>  		CONFIG_DRM_I915_TIMESLICE_DURATION;
>  
>  	/* Override to uninterruptible for OpenCL workloads. */
> -	if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS)
> +	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
>  		engine->props.preempt_timeout_ms = 0;
>  
> -	/* features common between engines sharing EUs */
> -	if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) {
> -		engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE;
> -		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
> -	}
> -
>  	/* Cap properties according to any system limits */
>  #define CLAMP_PROP(field) \
>  	do { \
> -- 
> 2.25.1
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Improve anti-pre-emption w/a for compute workloads (rev4)
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
                   ` (4 preceding siblings ...)
  (?)
@ 2022-03-04  0:55 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2022-03-04  0:55 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Improve anti-pre-emption w/a for compute workloads (rev4)
URL   : https://patchwork.freedesktop.org/series/100428/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
5ef6e2a52bd2 drm/i915/guc: Limit scheduling properties to avoid overflow
-:42: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'field' - possible side-effects?
#42: FILE: drivers/gpu/drm/i915/gt/intel_engine_cs.c:446:
+#define CLAMP_PROP(field) \
+	do { \
+		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
+		if (clamp != engine->props.field) { \
+			drm_notice(&engine->i915->drm, \
+				   "Warning, clamping %s to %lld to prevent overflow\n", \
+				   #field, clamp); \
+			engine->props.field = clamp; \
+		} \
+	} while (0)

total: 0 errors, 0 warnings, 1 checks, 191 lines checked
1f742dfb98cf drm/i915: Fix compute pre-emption w/a to apply to compute engines
de2bbe1afe48 drm/i915: Make the heartbeat play nice with long pre-emption timeouts
53812ed93f89 drm/i915: Improve long running OCL w/a for GuC submission



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Improve anti-pre-emption w/a for compute workloads (rev4)
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
                   ` (5 preceding siblings ...)
  (?)
@ 2022-03-04  0:56 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2022-03-04  0:56 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Improve anti-pre-emption w/a for compute workloads (rev4)
URL   : https://patchwork.freedesktop.org/series/100428/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for Improve anti-pre-emption w/a for compute workloads (rev4)
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
                   ` (6 preceding siblings ...)
  (?)
@ 2022-03-04  1:28 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2022-03-04  1:28 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 4838 bytes --]

== Series Details ==

Series: Improve anti-pre-emption w/a for compute workloads (rev4)
URL   : https://patchwork.freedesktop.org/series/100428/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11322 -> Patchwork_22482
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/index.html

Participating hosts (45 -> 43)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (3): bat-rpls-2 fi-bsw-cyan fi-bdw-samus 

Known issues
------------

  Here are the changes found in Patchwork_22482 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - fi-skl-6600u:       [PASS][1] -> [INCOMPLETE][2] ([i915#4547])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/fi-skl-6600u/igt@gem_exec_suspend@basic-s3@smem.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-skl-6600u/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_selftest@live@hangcheck:
    - fi-bdw-5557u:       NOTRUN -> [INCOMPLETE][3] ([i915#3921])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-bdw-5557u/igt@i915_selftest@live@hangcheck.html

  * igt@kms_chamelium@dp-crc-fast:
    - fi-bdw-5557u:       NOTRUN -> [SKIP][4] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-bdw-5557u/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_psr@cursor_plane_move:
    - fi-bdw-5557u:       NOTRUN -> [SKIP][5] ([fdo#109271]) +13 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-bdw-5557u/igt@kms_psr@cursor_plane_move.html

  * igt@prime_vgem@basic-userptr:
    - fi-pnv-d510:        NOTRUN -> [SKIP][6] ([fdo#109271]) +57 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-pnv-d510/igt@prime_vgem@basic-userptr.html

  * igt@runner@aborted:
    - fi-skl-6600u:       NOTRUN -> [FAIL][7] ([i915#4312])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-skl-6600u/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@perf:
    - {fi-tgl-dsi}:       [DMESG-WARN][8] ([i915#2867]) -> [PASS][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/fi-tgl-dsi/igt@i915_selftest@live@perf.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/fi-tgl-dsi/igt@i915_selftest@live@perf.html

  * igt@kms_busy@basic@flip:
    - {bat-adlp-6}:       [DMESG-WARN][10] ([i915#3576]) -> [PASS][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/bat-adlp-6/igt@kms_busy@basic@flip.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/bat-adlp-6/igt@kms_busy@basic@flip.html

  
#### Warnings ####

  * igt@i915_selftest@live@hangcheck:
    - bat-dg1-5:          [DMESG-FAIL][12] ([i915#4494] / [i915#4957]) -> [DMESG-FAIL][13] ([i915#4957])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/bat-dg1-5/igt@i915_selftest@live@hangcheck.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/bat-dg1-5/igt@i915_selftest@live@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#3576]: https://gitlab.freedesktop.org/drm/intel/issues/3576
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4494]: https://gitlab.freedesktop.org/drm/intel/issues/4494
  [i915#4547]: https://gitlab.freedesktop.org/drm/intel/issues/4547
  [i915#4957]: https://gitlab.freedesktop.org/drm/intel/issues/4957


Build changes
-------------

  * Linux: CI_DRM_11322 -> Patchwork_22482

  CI-20190529: 20190529
  CI_DRM_11322: 7328c1d5b59036c9eee129e43b07b417e6c127e1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6362: 698695136f8ade2391f2d8f45300eae2df02e947 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_22482: 53812ed93f893158ffa88c09603ea2cf8a7d48f5 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

53812ed93f89 drm/i915: Improve long running OCL w/a for GuC submission
de2bbe1afe48 drm/i915: Make the heartbeat play nice with long pre-emption timeouts
1f742dfb98cf drm/i915: Fix compute pre-emption w/a to apply to compute engines
5ef6e2a52bd2 drm/i915/guc: Limit scheduling properties to avoid overflow

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/index.html

[-- Attachment #2: Type: text/html, Size: 5888 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for Improve anti-pre-emption w/a for compute workloads (rev4)
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
                   ` (7 preceding siblings ...)
  (?)
@ 2022-03-04 15:09 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2022-03-04 15:09 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 30281 bytes --]

== Series Details ==

Series: Improve anti-pre-emption w/a for compute workloads (rev4)
URL   : https://patchwork.freedesktop.org/series/100428/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11322_full -> Patchwork_22482_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_22482_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_22482_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_22482_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-upscaling:
    - shard-tglb:         [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-tglb1/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-upscaling.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb8/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-16bpp-ytile-upscaling.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_parallel@basic@vcs0:
    - {shard-rkl}:        NOTRUN -> [DMESG-WARN][3] +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/igt@gem_exec_parallel@basic@vcs0.html

  * {igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-d-hdmi-a-1-downscale-with-pixel-format}:
    - {shard-tglu}:       NOTRUN -> [SKIP][4] +1 similar issue
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglu-5/igt@kms_plane_scaling@downscale-with-pixel-format-factor-0-25@pipe-d-hdmi-a-1-downscale-with-pixel-format.html

  * {igt@kms_plane_scaling@planes-downscale-factor-0-5@pipe-a-edp-1-planes-downscale}:
    - shard-iclb:         [PASS][5] -> [SKIP][6] +2 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-iclb5/igt@kms_plane_scaling@planes-downscale-factor-0-5@pipe-a-edp-1-planes-downscale.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb2/igt@kms_plane_scaling@planes-downscale-factor-0-5@pipe-a-edp-1-planes-downscale.html

  * igt@prime_mmap_coherency@ioctl-errors:
    - {shard-dg1}:        NOTRUN -> [SKIP][7] +2 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-dg1-17/igt@prime_mmap_coherency@ioctl-errors.html

  
Known issues
------------

  Here are the changes found in Patchwork_22482_full that come from known issues:

### CI changes ###

#### Possible fixes ####

  * boot:
    - shard-apl:          ([PASS][8], [FAIL][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25], [PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32]) ([i915#4386]) -> ([PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], [PASS][50], [PASS][51], [PASS][52], [PASS][53], [PASS][54], [PASS][55], [PASS][56], [PASS][57])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl1/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl1/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl1/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl1/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl2/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl2/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl2/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl2/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl2/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl3/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl3/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl3/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl4/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl4/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl4/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl6/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl6/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl6/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl6/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl7/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl7/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl7/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl8/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl8/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl8/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl1/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl1/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl2/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl1/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl2/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl8/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl8/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl8/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl8/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl7/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl7/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl7/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl7/boot.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/boot.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/boot.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/boot.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/boot.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl4/boot.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl4/boot.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl4/boot.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl4/boot.html
    - {shard-rkl}:        ([PASS][58], [PASS][59], [PASS][60], [PASS][61], [FAIL][62], [PASS][63], [PASS][64], [PASS][65], [PASS][66], [PASS][67], [PASS][68], [PASS][69], [PASS][70], [PASS][71], [PASS][72], [PASS][73], [PASS][74], [PASS][75], [PASS][76]) ([i915#5131]) -> ([PASS][77], [PASS][78], [PASS][79], [PASS][80], [PASS][81], [PASS][82], [PASS][83], [PASS][84], [PASS][85], [PASS][86], [PASS][87], [PASS][88], [PASS][89], [PASS][90], [PASS][91], [PASS][92], [PASS][93], [PASS][94], [PASS][95], [PASS][96], [PASS][97], [PASS][98])
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-6/boot.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-6/boot.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-6/boot.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-6/boot.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/boot.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/boot.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/boot.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/boot.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/boot.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/boot.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/boot.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/boot.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/boot.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/boot.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/boot.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/boot.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/boot.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/boot.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/boot.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/boot.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/boot.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/boot.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/boot.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/boot.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/boot.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/boot.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/boot.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/boot.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/boot.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-4/boot.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-4/boot.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/boot.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/boot.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/boot.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/boot.html
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/boot.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-1/boot.html
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-1/boot.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-1/boot.html
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-1/boot.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-1/boot.html

  

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
    - shard-apl:          NOTRUN -> [DMESG-WARN][99] ([i915#4991])
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/igt@gem_create@create-massive.html

  * igt@gem_exec_balancer@parallel:
    - shard-iclb:         NOTRUN -> [DMESG-WARN][100] ([i915#5076])
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb1/igt@gem_exec_balancer@parallel.html
    - shard-tglb:         NOTRUN -> [DMESG-WARN][101] ([i915#5076])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb8/igt@gem_exec_balancer@parallel.html

  * igt@gem_exec_fair@basic-pace@vcs1:
    - shard-kbl:          [PASS][102] -> [FAIL][103] ([i915#2842]) +1 similar issue
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-kbl4/igt@gem_exec_fair@basic-pace@vcs1.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-kbl4/igt@gem_exec_fair@basic-pace@vcs1.html

  * igt@gem_render_copy@linear-to-vebox-y-tiled:
    - shard-apl:          NOTRUN -> [SKIP][104] ([fdo#109271]) +135 similar issues
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/igt@gem_render_copy@linear-to-vebox-y-tiled.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-iclb:         [PASS][105] -> [FAIL][106] ([i915#454])
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-iclb6/igt@i915_pm_dc@dc6-dpms.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb3/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [PASS][107] -> [DMESG-WARN][108] ([i915#180]) +2 similar issues
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl4/igt@i915_suspend@fence-restore-tiled2untiled.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_async_flips@alternate-sync-async-flip:
    - shard-glk:          [PASS][109] -> [FAIL][110] ([i915#2521])
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-glk9/igt@kms_async_flips@alternate-sync-async-flip.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-glk4/igt@kms_async_flips@alternate-sync-async-flip.html

  * igt@kms_big_fb@linear-32bpp-rotate-0:
    - shard-glk:          [PASS][111] -> [DMESG-WARN][112] ([i915#118]) +1 similar issue
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-glk6/igt@kms_big_fb@linear-32bpp-rotate-0.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-glk9/igt@kms_big_fb@linear-32bpp-rotate-0.html

  * igt@kms_big_fb@linear-64bpp-rotate-270:
    - shard-iclb:         NOTRUN -> [SKIP][113] ([fdo#110725] / [fdo#111614])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb1/igt@kms_big_fb@linear-64bpp-rotate-270.html
    - shard-tglb:         NOTRUN -> [SKIP][114] ([fdo#111614])
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb8/igt@kms_big_fb@linear-64bpp-rotate-270.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-hflip-async-flip:
    - shard-apl:          NOTRUN -> [SKIP][115] ([fdo#109271] / [i915#3777])
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-hflip-async-flip.html

  * igt@kms_ccs@pipe-b-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs_cc:
    - shard-apl:          NOTRUN -> [SKIP][116] ([fdo#109271] / [i915#3886]) +8 similar issues
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/igt@kms_ccs@pipe-b-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-d-crc-primary-basic-y_tiled_ccs:
    - shard-iclb:         NOTRUN -> [SKIP][117] ([fdo#109278]) +1 similar issue
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb1/igt@kms_ccs@pipe-d-crc-primary-basic-y_tiled_ccs.html
    - shard-tglb:         NOTRUN -> [SKIP][118] ([i915#3689])
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb8/igt@kms_ccs@pipe-d-crc-primary-basic-y_tiled_ccs.html

  * igt@kms_chamelium@dp-crc-fast:
    - shard-iclb:         NOTRUN -> [SKIP][119] ([fdo#109284] / [fdo#111827])
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb1/igt@kms_chamelium@dp-crc-fast.html
    - shard-tglb:         NOTRUN -> [SKIP][120] ([fdo#109284] / [fdo#111827])
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb8/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_chamelium@vga-hpd-enable-disable-mode:
    - shard-apl:          NOTRUN -> [SKIP][121] ([fdo#109271] / [fdo#111827]) +12 similar issues
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl6/igt@kms_chamelium@vga-hpd-enable-disable-mode.html

  * igt@kms_content_protection@lic:
    - shard-apl:          NOTRUN -> [TIMEOUT][122] ([i915#1319])
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl1/igt@kms_content_protection@lic.html

  * igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy:
    - shard-glk:          [PASS][123] -> [FAIL][124] ([i915#72])
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-glk1/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-glk4/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-spr-indfb-fullscreen:
    - shard-tglb:         NOTRUN -> [SKIP][125] ([fdo#109280] / [fdo#111825])
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb6/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-spr-indfb-fullscreen.html

  * igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes:
    - shard-tglb:         NOTRUN -> [SKIP][126] ([fdo#109289])
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb6/igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence:
    - shard-apl:          NOTRUN -> [SKIP][127] ([fdo#109271] / [i915#533]) +1 similar issue
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/igt@kms_pipe_crc_basic@read-crc-pipe-d-frame-sequence.html

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-b-edp-1-scaler-with-clipping-clamping:
    - shard-iclb:         [PASS][128] -> [SKIP][129] ([i915#5176]) +1 similar issue
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-iclb6/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-b-edp-1-scaler-with-clipping-clamping.html
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb3/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-b-edp-1-scaler-with-clipping-clamping.html

  * igt@kms_psr2_su@page_flip-p010:
    - shard-apl:          NOTRUN -> [SKIP][130] ([fdo#109271] / [i915#658])
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl3/igt@kms_psr2_su@page_flip-p010.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [PASS][131] -> [SKIP][132] ([fdo#109441])
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb1/igt@kms_psr@psr2_cursor_plane_onoff.html

  * igt@kms_sysfs_edid_timing:
    - shard-apl:          NOTRUN -> [FAIL][133] ([IGT#2])
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl7/igt@kms_sysfs_edid_timing.html

  * igt@kms_writeback@writeback-invalid-parameters:
    - shard-apl:          NOTRUN -> [SKIP][134] ([fdo#109271] / [i915#2437]) +1 similar issue
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl1/igt@kms_writeback@writeback-invalid-parameters.html

  * igt@nouveau_crc@pipe-c-source-outp-complete:
    - shard-tglb:         NOTRUN -> [SKIP][135] ([i915#2530])
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb6/igt@nouveau_crc@pipe-c-source-outp-complete.html

  * igt@sysfs_clients@fair-7:
    - shard-apl:          NOTRUN -> [SKIP][136] ([fdo#109271] / [i915#2994]) +2 similar issues
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl7/igt@sysfs_clients@fair-7.html

  
#### Possible fixes ####

  * igt@gem_ctx_persistence@engines-hostile@rcs0:
    - {shard-rkl}:        [FAIL][137] ([i915#2410]) -> [PASS][138]
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/igt@gem_ctx_persistence@engines-hostile@rcs0.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/igt@gem_ctx_persistence@engines-hostile@rcs0.html

  * igt@gem_eio@unwedge-stress:
    - shard-tglb:         [TIMEOUT][139] ([i915#3063] / [i915#3648]) -> [PASS][140]
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-tglb7/igt@gem_eio@unwedge-stress.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglb7/igt@gem_eio@unwedge-stress.html
    - shard-iclb:         [TIMEOUT][141] ([i915#2481] / [i915#3070]) -> [PASS][142]
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-iclb4/igt@gem_eio@unwedge-stress.html
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb2/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-none@vcs0:
    - shard-apl:          [FAIL][143] ([i915#2842]) -> [PASS][144] +1 similar issue
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-apl7/igt@gem_exec_fair@basic-none@vcs0.html
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-apl2/igt@gem_exec_fair@basic-none@vcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - {shard-tglu}:       [FAIL][145] ([i915#2842]) -> [PASS][146] +1 similar issue
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-tglu-1/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglu-6/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-kbl:          [FAIL][147] ([i915#2842]) -> [PASS][148]
   [147]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-kbl4/igt@gem_exec_fair@basic-pace@vecs0.html
   [148]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-kbl4/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_reloc@basic-scanout@vecs0:
    - {shard-rkl}:        [SKIP][149] ([i915#3639]) -> [PASS][150] +3 similar issues
   [149]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@gem_exec_reloc@basic-scanout@vecs0.html
   [150]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@gem_exec_reloc@basic-scanout@vecs0.html

  * igt@gem_exec_whisper@basic-normal:
    - shard-glk:          [DMESG-WARN][151] ([i915#118]) -> [PASS][152]
   [151]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-glk4/igt@gem_exec_whisper@basic-normal.html
   [152]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-glk7/igt@gem_exec_whisper@basic-normal.html

  * igt@gem_mmap_offset@clear:
    - {shard-rkl}:        [INCOMPLETE][153] -> [PASS][154]
   [153]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@gem_mmap_offset@clear.html
   [154]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-2/igt@gem_mmap_offset@clear.html

  * igt@gem_workarounds@suspend-resume-fd:
    - shard-snb:          [TIMEOUT][155] -> [PASS][156]
   [155]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-snb6/igt@gem_workarounds@suspend-resume-fd.html
   [156]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-snb7/igt@gem_workarounds@suspend-resume-fd.html

  * igt@i915_pm_backlight@bad-brightness:
    - {shard-rkl}:        [SKIP][157] ([i915#3012]) -> [PASS][158]
   [157]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/igt@i915_pm_backlight@bad-brightness.html
   [158]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@i915_pm_backlight@bad-brightness.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-iclb:         [FAIL][159] ([i915#454]) -> [PASS][160]
   [159]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-iclb4/igt@i915_pm_dc@dc6-psr.html
   [160]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-iclb1/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_pm_rpm@dpms-mode-unset-lpsp:
    - {shard-rkl}:        [SKIP][161] ([i915#1397]) -> [PASS][162]
   [161]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html
   [162]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html

  * igt@i915_pm_rpm@system-suspend-modeset:
    - {shard-rkl}:        [SKIP][163] ([fdo#109308]) -> [PASS][164] +1 similar issue
   [163]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/igt@i915_pm_rpm@system-suspend-modeset.html
   [164]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@i915_pm_rpm@system-suspend-modeset.html

  * igt@kms_3d:
    - {shard-dg1}:        [SKIP][165] -> [PASS][166] +18 similar issues
   [165]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-dg1-18/igt@kms_3d.html
   [166]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-dg1-15/igt@kms_3d.html

  * igt@kms_atomic@atomic_plane_damage:
    - {shard-rkl}:        [SKIP][167] ([i915#4098]) -> [PASS][168]
   [167]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@kms_atomic@atomic_plane_damage.html
   [168]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_atomic@atomic_plane_damage.html

  * igt@kms_big_fb@linear-16bpp-rotate-180:
    - {shard-tglu}:       [DMESG-WARN][169] ([i915#402]) -> [PASS][170]
   [169]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-tglu-3/igt@kms_big_fb@linear-16bpp-rotate-180.html
   [170]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-tglu-2/igt@kms_big_fb@linear-16bpp-rotate-180.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0:
    - {shard-dg1}:        [FAIL][171] -> [PASS][172] +2 similar issues
   [171]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-dg1-18/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0.html
   [172]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-dg1-15/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0.html

  * igt@kms_ccs@pipe-b-crc-primary-basic-y_tiled_gen12_rc_ccs_cc:
    - {shard-rkl}:        [SKIP][173] ([i915#1845] / [i915#4098]) -> [PASS][174] +2 similar issues
   [173]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/igt@kms_ccs@pipe-b-crc-primary-basic-y_tiled_gen12_rc_ccs_cc.html
   [174]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_ccs@pipe-b-crc-primary-basic-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_color@pipe-a-ctm-green-to-red:
    - {shard-rkl}:        [SKIP][175] ([i915#1149] / [i915#1849]) -> [PASS][176] +1 similar issue
   [175]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@kms_color@pipe-a-ctm-green-to-red.html
   [176]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_color@pipe-a-ctm-green-to-red.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-random:
    - {shard-rkl}:        [SKIP][177] ([fdo#112022] / [i915#4070]) -> [PASS][178] +2 similar issues
   [177]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/igt@kms_cursor_crc@pipe-a-cursor-256x256-random.html
   [178]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_cursor_crc@pipe-a-cursor-256x256-random.html

  * igt@kms_cursor_crc@pipe-b-cursor-64x21-onscreen:
    - {shard-rkl}:        [SKIP][179] ([fdo#112022]) -> [PASS][180] +2 similar issues
   [179]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@kms_cursor_crc@pipe-b-cursor-64x21-onscreen.html
   [180]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_cursor_crc@pipe-b-cursor-64x21-onscreen.html

  * igt@kms_cursor_edge_walk@pipe-a-256x256-right-edge:
    - {shard-rkl}:        [SKIP][181] ([i915#1849]) -> [PASS][182] +17 similar issues
   [181]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@kms_cursor_edge_walk@pipe-a-256x256-right-edge.html
   [182]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_cursor_edge_walk@pipe-a-256x256-right-edge.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-legacy:
    - {shard-rkl}:        [SKIP][183] ([fdo#111825]) -> [PASS][184] +1 similar issue
   [183]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-5/igt@kms_cursor_legacy@basic-flip-after-cursor-legacy.html
   [184]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_cursor_legacy@basic-flip-after-cursor-legacy.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - {shard-rkl}:        [SKIP][185] ([fdo#111825] / [i915#4070]) -> [PASS][186] +2 similar issues
   [185]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [186]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_cursor_legacy@pipe-c-single-move:
    - {shard-rkl}:        [SKIP][187] ([i915#4070]) -> [PASS][188] +1 similar issue
   [187]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11322/shard-rkl-2/igt@kms_cursor_legacy@pipe-c-single-move.html
   [188]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/shard-rkl-5/igt@kms_cursor_legacy@pipe-c-single-move.html

  * igt@kms_draw_crc@draw-method-xrgb8888-pwrite-ytiled:
    - {shard-rkl}:        [SKIP][189] ([fdo#111314]) -> [PASS][190] +3 similar issues
   [189]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DR

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22482/index.html

[-- Attachment #2: Type: text/html, Size: 32813 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
@ 2022-03-08  9:03     ` Mrozek, Michal
  -1 siblings, 0 replies; 27+ messages in thread
From: Mrozek, Michal @ 2022-03-08  9:03 UTC (permalink / raw)
  To: Harrison, John C, Intel-GFX; +Cc: Ceraolo Spurio, Daniele, DRI-Devel

Acked-by: Michal Mrozek <michal.mrozek@intel.com>

-----Original Message-----
From: Harrison, John C <john.c.harrison@intel.com> 
Sent: Thursday, March 3, 2022 11:38 PM
To: Intel-GFX@Lists.FreeDesktop.Org
Cc: DRI-Devel@Lists.FreeDesktop.Org; Harrison, John C <john.c.harrison@intel.com>; Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>
Subject: [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission

From: John Harrison <John.C.Harrison@Intel.com>

A workaround was added to the driver to allow OpenCL workloads to run 'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbound as the heartbeat will kick in eventually and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode, the pre-emption timeout is how GuC detects hung contexts and triggers a per engine reset. Thus, disabling the timeout means also losing all per engine reset ability. A full GT reset will still occur when the heartbeat finally expires, but that is a much more destructive and undesirable mechanism.

The purpose of the workaround is actually to give OpenCL tasks longer to reach a pre-emption point after a pre-emption request has been issued. This is necessary because Gen12 does not support mid-thread pre-emption and OpenCL can have long running threads.

So, rather than disabling the timeout completely, just set it to a 'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value instead of determining it algorithmically. So make it an extra CONFIG definition. Also, remove the execlist centric comment from the existing pre-emption timeout CONFIG option given that it applies to more than just execlists.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c200..7cc38d25ee5c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
 	default 640 # milliseconds
 	help
 	  How long to wait (in milliseconds) for a preemption event to occur
-	  when submitting a new context via execlists. If the current context
-	  does not hit an arbitration point and yield to HW before the timer
-	  expires, the HW will be reset to allow the more important context
-	  to execute.
+	  when submitting a new context. If the current context does not hit
+	  an arbitration point and yield to HW before the timer expires, the
+	  HW will be reset to allow the more important context to execute.
+
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+	  May be 0 to disable the timeout.
+
+	  The compiled in default may get overridden at driver probe time on
+	  certain platforms and certain engines which will be reflected in the
+	  sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+	int "Preempt timeout for compute engines (ms, jiffy granularity)"
+	default 7500 # milliseconds
+	help
+	  How long to wait (in milliseconds) for a preemption event to occur
+	  when submitting a new context to a compute capable engine. If the
+	  current context does not hit an arbitration point and yield to HW
+	  before the timer expires, the HW will be reset to allow the more
+	  important context to execute.
 
 	  This is adjustable via
 	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4185c7338581..cc0954ad836a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-	/* Override to uninterruptible for OpenCL workloads. */
+	/*
+	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
+	 * some OpenCL workloads run quite long threads. That means they get
+	 * reset due to not pre-empting in a timely manner. So, bump the
+	 * pre-emption timeout value to be much higher for compute engines.
+	 */
 	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
-		engine->props.preempt_timeout_ms = 0;
+		engine->props.preempt_timeout_ms = 
+CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
 
 	/* Cap properties according to any system limits */  #define CLAMP_PROP(field) \
--
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
@ 2022-03-08  9:03     ` Mrozek, Michal
  0 siblings, 0 replies; 27+ messages in thread
From: Mrozek, Michal @ 2022-03-08  9:03 UTC (permalink / raw)
  To: Harrison, John C, Intel-GFX; +Cc: DRI-Devel

Acked-by: Michal Mrozek <michal.mrozek@intel.com>

-----Original Message-----
From: Harrison, John C <john.c.harrison@intel.com> 
Sent: Thursday, March 3, 2022 11:38 PM
To: Intel-GFX@Lists.FreeDesktop.Org
Cc: DRI-Devel@Lists.FreeDesktop.Org; Harrison, John C <john.c.harrison@intel.com>; Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>; Mrozek, Michal <michal.mrozek@intel.com>
Subject: [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission

From: John Harrison <John.C.Harrison@Intel.com>

A workaround was added to the driver to allow OpenCL workloads to run 'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbound as the heartbeat will kick in eventually and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode, the pre-emption timeout is how GuC detects hung contexts and triggers a per engine reset. Thus, disabling the timeout means also losing all per engine reset ability. A full GT reset will still occur when the heartbeat finally expires, but that is a much more destructive and undesirable mechanism.

The purpose of the workaround is actually to give OpenCL tasks longer to reach a pre-emption point after a pre-emption request has been issued. This is necessary because Gen12 does not support mid-thread pre-emption and OpenCL can have long running threads.

So, rather than disabling the timeout completely, just set it to a 'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value instead of determining it algorithmically. So make it an extra CONFIG definition. Also, remove the execlist centric comment from the existing pre-emption timeout CONFIG option given that it applies to more than just execlists.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 39328567c200..7cc38d25ee5c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
 	default 640 # milliseconds
 	help
 	  How long to wait (in milliseconds) for a preemption event to occur
-	  when submitting a new context via execlists. If the current context
-	  does not hit an arbitration point and yield to HW before the timer
-	  expires, the HW will be reset to allow the more important context
-	  to execute.
+	  when submitting a new context. If the current context does not hit
+	  an arbitration point and yield to HW before the timer expires, the
+	  HW will be reset to allow the more important context to execute.
+
+	  This is adjustable via
+	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
+
+	  May be 0 to disable the timeout.
+
+	  The compiled in default may get overridden at driver probe time on
+	  certain platforms and certain engines which will be reflected in the
+	  sysfs control.
+
+config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
+	int "Preempt timeout for compute engines (ms, jiffy granularity)"
+	default 7500 # milliseconds
+	help
+	  How long to wait (in milliseconds) for a preemption event to occur
+	  when submitting a new context to a compute capable engine. If the
+	  current context does not hit an arbitration point and yield to HW
+	  before the timer expires, the HW will be reset to allow the more
+	  important context to execute.
 
 	  This is adjustable via
 	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4185c7338581..cc0954ad836a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
 	engine->props.timeslice_duration_ms =
 		CONFIG_DRM_I915_TIMESLICE_DURATION;
 
-	/* Override to uninterruptible for OpenCL workloads. */
+	/*
+	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
+	 * some OpenCL workloads run quite long threads. That means they get
+	 * reset due to not pre-empting in a timely manner. So, bump the
+	 * pre-emption timeout value to be much higher for compute engines.
+	 */
 	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
-		engine->props.preempt_timeout_ms = 0;
+		engine->props.preempt_timeout_ms = 
+CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
 
 	/* Cap properties according to any system limits */  #define CLAMP_PROP(field) \
--
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
  (?)
  (?)
@ 2022-03-08  9:41   ` Tvrtko Ursulin
  2022-03-09 21:16     ` John Harrison
  -1 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2022-03-08  9:41 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: Michal Mrozek, DRI-Devel


On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> A workaround was added to the driver to allow OpenCL workloads to run
> 'forever' by disabling pre-emption on the RCS engine for Gen12.
> It is not totally unbound as the heartbeat will kick in eventually
> and cause a reset of the hung engine.
> 
> However, this does not work well in GuC submission mode. In GuC mode,
> the pre-emption timeout is how GuC detects hung contexts and triggers
> a per engine reset. Thus, disabling the timeout means also losing all
> per engine reset ability. A full GT reset will still occur when the
> heartbeat finally expires, but that is a much more destructive and
> undesirable mechanism.
> 
> The purpose of the workaround is actually to give OpenCL tasks longer
> to reach a pre-emption point after a pre-emption request has been
> issued. This is necessary because Gen12 does not support mid-thread
> pre-emption and OpenCL can have long running threads.
> 
> So, rather than disabling the timeout completely, just set it to a
> 'long' value.
> 
> v2: Review feedback from Tvrtko - must hard code the 'long' value
> instead of determining it algorithmically. So make it an extra CONFIG
> definition. Also, remove the execlist centric comment from the
> existing pre-emption timeout CONFIG option given that it applies to
> more than just execlists.
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
> Acked-by: Michal Mrozek <michal.mrozek@intel.com>
> ---
>   drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>   2 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index 39328567c200..7cc38d25ee5c 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>   	default 640 # milliseconds
>   	help
>   	  How long to wait (in milliseconds) for a preemption event to occur
> -	  when submitting a new context via execlists. If the current context
> -	  does not hit an arbitration point and yield to HW before the timer
> -	  expires, the HW will be reset to allow the more important context
> -	  to execute.
> +	  when submitting a new context. If the current context does not hit
> +	  an arbitration point and yield to HW before the timer expires, the
> +	  HW will be reset to allow the more important context to execute.
> +
> +	  This is adjustable via
> +	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
> +
> +	  May be 0 to disable the timeout.
> +
> +	  The compiled in default may get overridden at driver probe time on
> +	  certain platforms and certain engines which will be reflected in the
> +	  sysfs control.
> +
> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
> +	int "Preempt timeout for compute engines (ms, jiffy granularity)"
> +	default 7500 # milliseconds
> +	help
> +	  How long to wait (in milliseconds) for a preemption event to occur
> +	  when submitting a new context to a compute capable engine. If the
> +	  current context does not hit an arbitration point and yield to HW
> +	  before the timer expires, the HW will be reset to allow the more
> +	  important context to execute.
>   
>   	  This is adjustable via
>   	  /sys/class/drm/card?/engine/*/preempt_timeout_ms
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 4185c7338581..cc0954ad836a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>   	engine->props.timeslice_duration_ms =
>   		CONFIG_DRM_I915_TIMESLICE_DURATION;
>   
> -	/* Override to uninterruptible for OpenCL workloads. */
> +	/*
> +	 * Mid-thread pre-emption is not available in Gen12. Unfortunately,
> +	 * some OpenCL workloads run quite long threads. That means they get
> +	 * reset due to not pre-empting in a timely manner. So, bump the
> +	 * pre-emption timeout value to be much higher for compute engines.
> +	 */
>   	if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE))
> -		engine->props.preempt_timeout_ms = 0;
> +		engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;

I wouldn't go as far as adding a config option since as it is it only 
applies to Gen12 but Kconfig text says nothing about that. And I am not 
saying you should add a Gen12 specific config option, that would be 
weird. So IMO just drop it.

Regards,

Tvrtko

>   
>   	/* Cap properties according to any system limits */
>   #define CLAMP_PROP(field) \

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
  2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
  (?)
@ 2022-03-08  9:43   ` Tvrtko Ursulin
  2022-03-09 21:10     ` John Harrison
  -1 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2022-03-08  9:43 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX; +Cc: DRI-Devel



On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> GuC converts the pre-emption timeout and timeslice quantum values into
> clock ticks internally. That significantly reduces the point of 32bit
> overflow. On current platforms, worst case scenario is approximately
> 110 seconds. Rather than allowing the user to set higher values and
> then get confused by early timeouts, add limits when setting these
> values.
> 
> v2: Add helper functins for clamping (review feedback from Tvrtko).
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index b3a429a92c0d..8208164c25e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2218,13 +2218,24 @@ static inline u32 get_children_join_value(struct intel_context *ce,
  static void guc_context_policy_init(struct intel_engine_cs *engine,
                                     struct guc_lrc_desc *desc)
  {
+       struct drm_device *drm = &engine->i915->drm;
+
         desc->policy_flags = 0;
  
         if (engine->flags & I915_ENGINE_WANT_FORCED_PREEMPTION)
                 desc->policy_flags |= CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE;
  
         /* NB: For both of these, zero means disabled. */
+       if (overflows_type(engine->props.timeslice_duration_ms * 1000,
+                          desc->execution_quantum))
+               drm_warn_once(drm, "GuC interface cannot support %lums timeslice!\n",
+                             engine->props.timeslice_duration_ms);
         desc->execution_quantum = engine->props.timeslice_duration_ms * 1000;
+
+       if (overflows_type(engine->props.preempt_timeout_ms * 1000,
+                          desc->preemption_timeout))
+               drm_warn_once(drm, "GuC interface cannot support %lums preemption timeout!\n",
+                             engine->props.preempt_timeout_ms);
         desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
  }
  

With that:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> ---
>   drivers/gpu/drm/i915/gt/intel_engine.h      |  6 ++
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +++++++++++++++++++++
>   drivers/gpu/drm/i915/gt/sysfs_engines.c     | 25 +++++---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
>   4 files changed, 99 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 1c0ab05c3c40..d7044c4e526e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine)
>   	return engine->hung_ce;
>   }
>   
> +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value);
> +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value);
> +
>   #endif /* _INTEL_RINGBUFFER_H_ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 7447411a5b26..22e70e4e007c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>   		engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
>   	}
>   
> +	/* Cap properties according to any system limits */
> +#define CLAMP_PROP(field) \
> +	do { \
> +		u64 clamp = intel_clamp_##field(engine, engine->props.field); \
> +		if (clamp != engine->props.field) { \
> +			drm_notice(&engine->i915->drm, \
> +				   "Warning, clamping %s to %lld to prevent overflow\n", \
> +				   #field, clamp); \
> +			engine->props.field = clamp; \
> +		} \
> +	} while (0)
> +
> +	CLAMP_PROP(heartbeat_interval_ms);
> +	CLAMP_PROP(max_busywait_duration_ns);
> +	CLAMP_PROP(preempt_timeout_ms);
> +	CLAMP_PROP(stop_timeout_ms);
> +	CLAMP_PROP(timeslice_duration_ms);
> +
> +#undef CLAMP_PROP
> +
>   	engine->defaults = engine->props; /* never to change again */
>   
>   	engine->context_size = intel_engine_context_size(gt, engine->class);
> @@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
>   	return 0;
>   }
>   
> +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value)
> +{
> +	value = min(value, jiffies_to_nsecs(2));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	/*
> +	 * NB: The GuC API only supports 32bit values. However, the limit is further
> +	 * reduced due to internal calculations which would otherwise overflow.
> +	 */
> +	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
> +		value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
> +
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
> +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value)
> +{
> +	/*
> +	 * NB: The GuC API only supports 32bit values. However, the limit is further
> +	 * reduced due to internal calculations which would otherwise overflow.
> +	 */
> +	if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
> +		value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
> +
> +	value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
> +
> +	return value;
> +}
> +
>   static void __setup_engine_capabilities(struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *i915 = engine->i915;
> diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c
> index 967031056202..f2d9858d827c 100644
> --- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
> +++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
> @@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	       const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long duration;
> +	unsigned long long duration, clamped;
>   	int err;
>   
>   	/*
> @@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (duration > jiffies_to_nsecs(2))
> +	clamped = intel_clamp_max_busywait_duration_ns(engine, duration);
> +	if (duration != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
> @@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
>   		const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long duration;
> +	unsigned long long duration, clamped;
>   	int err;
>   
>   	/*
> @@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_timeslice_duration_ms(engine, duration);
> +	if (duration != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
> @@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	   const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long duration;
> +	unsigned long long duration, clamped;
>   	int err;
>   
>   	/*
> @@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_stop_timeout_ms(engine, duration);
> +	if (duration != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.stop_timeout_ms, duration);
> @@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
>   		      const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long timeout;
> +	unsigned long long timeout, clamped;
>   	int err;
>   
>   	/*
> @@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_preempt_timeout_ms(engine, timeout);
> +	if (timeout != clamped)
>   		return -EINVAL;
>   
>   	WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
> @@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
>   		const char *buf, size_t count)
>   {
>   	struct intel_engine_cs *engine = kobj_to_engine(kobj);
> -	unsigned long long delay;
> +	unsigned long long delay, clamped;
>   	int err;
>   
>   	/*
> @@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr,
>   	if (err)
>   		return err;
>   
> -	if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
> +	clamped = intel_clamp_heartbeat_interval_ms(engine, delay);
> +	if (delay != clamped)
>   		return -EINVAL;
>   
>   	err = intel_engine_set_heartbeat(engine, delay);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> index 4b300b6cc0f9..a2d574f2fdd5 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
> @@ -262,6 +262,15 @@ struct guc_lrc_desc {
>   
>   #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
>   
> +/*
> + * GuC converts the timeout to clock ticks internally. Different platforms have
> + * different GuC clocks. Thus, the maximum value before overflow is platform
> + * dependent. Current worst case scenario is about 110s. So, limit to 100s to be
> + * safe.
> + */
> +#define GUC_POLICY_MAX_EXEC_QUANTUM_MS		(100 * 1000)
> +#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS	(100 * 1000)
> +
>   struct guc_policies {
>   	u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
>   	/* In micro seconds. How much time to allow before DPC processing is

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Improve anti-pre-emption w/a for compute workloads (rev6)
  2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
                   ` (8 preceding siblings ...)
  (?)
@ 2022-03-08 22:34 ` Patchwork
  -1 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2022-03-08 22:34 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Improve anti-pre-emption w/a for compute workloads (rev6)
URL   : https://patchwork.freedesktop.org/series/100428/
State : failure

== Summary ==

Applying: drm/i915/guc: Limit scheduling properties to avoid overflow
error: patch failed: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2218
error: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
Patch failed at 0001 drm/i915/guc: Limit scheduling properties to avoid overflow
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow
  2022-03-08  9:43   ` Tvrtko Ursulin
@ 2022-03-09 21:10     ` John Harrison
  0 siblings, 0 replies; 27+ messages in thread
From: John Harrison @ 2022-03-09 21:10 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 3/8/2022 01:43, Tvrtko Ursulin wrote:
> On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> GuC converts the pre-emption timeout and timeslice quantum values into
>> clock ticks internally. That significantly reduces the point of 32bit
>> overflow. On current platforms, worst case scenario is approximately
>> 110 seconds. Rather than allowing the user to set higher values and
>> then get confused by early timeouts, add limits when setting these
>> values.
>>
>> v2: Add helper functins for clamping (review feedback from Tvrtko).
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> 
>> (v1)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index b3a429a92c0d..8208164c25e7 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -2218,13 +2218,24 @@ static inline u32 
> get_children_join_value(struct intel_context *ce,
>  static void guc_context_policy_init(struct intel_engine_cs *engine,
>                                     struct guc_lrc_desc *desc)
>  {
> +       struct drm_device *drm = &engine->i915->drm;
> +
>         desc->policy_flags = 0;
>
>         if (engine->flags & I915_ENGINE_WANT_FORCED_PREEMPTION)
>                 desc->policy_flags |= 
> CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE;
>
>         /* NB: For both of these, zero means disabled. */
> +       if (overflows_type(engine->props.timeslice_duration_ms * 1000,
> +                          desc->execution_quantum))
> +               drm_warn_once(drm, "GuC interface cannot support %lums 
> timeslice!\n",
> + engine->props.timeslice_duration_ms);
>         desc->execution_quantum = engine->props.timeslice_duration_ms 
> * 1000;
> +
> +       if (overflows_type(engine->props.preempt_timeout_ms * 1000,
> +                          desc->preemption_timeout))
> +               drm_warn_once(drm, "GuC interface cannot support %lums 
> preemption timeout!\n",
> + engine->props.preempt_timeout_ms);
>         desc->preemption_timeout = engine->props.preempt_timeout_ms * 
> 1000;
>  }
As previously explained, this is wrong. If the check must be present 
then it should be a BUG_ON as it is indicative of an internal driver 
failure. There is already a top level helper function for ensuring all 
range checks are done and the value is valid. If that is broken then 
that is a bug and should have been caught in pre-merge testing or code 
review. It is not possible for a bad value to get beyond that helper 
function. That is the whole point of the helper. We do not double bag 
every other value check in the driver. Once you have passed input 
validation, the values are assumed to be correct. Otherwise we would 
have every other line of code be a value check! And if somehow a bad 
value did make it through, simply printing a once shot warning is 
pointless. You are still going to get undefined behaviour potentially 
leading to a totally broken system. E.g. your very big timeout has 
overflowed and become extremely small, thus no batch buffer can ever 
complete because they all get reset before they have even finished the 
context switch in. That is a fundamentally broken system.

John.


>
>
> With that:
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
>
> Tvrtko
>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_engine.h      |  6 ++
>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 69 +++++++++++++++++++++
>>   drivers/gpu/drm/i915/gt/sysfs_engines.c     | 25 +++++---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h |  9 +++
>>   4 files changed, 99 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
>> b/drivers/gpu/drm/i915/gt/intel_engine.h
>> index 1c0ab05c3c40..d7044c4e526e 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
>> @@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct 
>> intel_engine_cs *engine)
>>       return engine->hung_ce;
>>   }
>>   +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs 
>> *engine, u64 value);
>> +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs 
>> *engine, u64 value);
>> +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, 
>> u64 value);
>> +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 
>> value);
>> +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs 
>> *engine, u64 value);
>> +
>>   #endif /* _INTEL_RINGBUFFER_H_ */
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> index 7447411a5b26..22e70e4e007c 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> @@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt 
>> *gt, enum intel_engine_id id,
>>           engine->flags |= I915_ENGINE_HAS_EU_PRIORITY;
>>       }
>>   +    /* Cap properties according to any system limits */
>> +#define CLAMP_PROP(field) \
>> +    do { \
>> +        u64 clamp = intel_clamp_##field(engine, engine->props.field); \
>> +        if (clamp != engine->props.field) { \
>> +            drm_notice(&engine->i915->drm, \
>> +                   "Warning, clamping %s to %lld to prevent 
>> overflow\n", \
>> +                   #field, clamp); \
>> +            engine->props.field = clamp; \
>> +        } \
>> +    } while (0)
>> +
>> +    CLAMP_PROP(heartbeat_interval_ms);
>> +    CLAMP_PROP(max_busywait_duration_ns);
>> +    CLAMP_PROP(preempt_timeout_ms);
>> +    CLAMP_PROP(stop_timeout_ms);
>> +    CLAMP_PROP(timeslice_duration_ms);
>> +
>> +#undef CLAMP_PROP
>> +
>>       engine->defaults = engine->props; /* never to change again */
>>         engine->context_size = intel_engine_context_size(gt, 
>> engine->class);
>> @@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt 
>> *gt, enum intel_engine_id id,
>>       return 0;
>>   }
>>   +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs 
>> *engine, u64 value)
>> +{
>> +    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
>> +
>> +    return value;
>> +}
>> +
>> +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs 
>> *engine, u64 value)
>> +{
>> +    value = min(value, jiffies_to_nsecs(2));
>> +
>> +    return value;
>> +}
>> +
>> +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, 
>> u64 value)
>> +{
>> +    /*
>> +     * NB: The GuC API only supports 32bit values. However, the 
>> limit is further
>> +     * reduced due to internal calculations which would otherwise 
>> overflow.
>> +     */
>> +    if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
>> +        value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS);
>> +
>> +    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
>> +
>> +    return value;
>> +}
>> +
>> +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 
>> value)
>> +{
>> +    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
>> +
>> +    return value;
>> +}
>> +
>> +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs 
>> *engine, u64 value)
>> +{
>> +    /*
>> +     * NB: The GuC API only supports 32bit values. However, the 
>> limit is further
>> +     * reduced due to internal calculations which would otherwise 
>> overflow.
>> +     */
>> +    if (intel_guc_submission_is_wanted(&engine->gt->uc.guc))
>> +        value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS);
>> +
>> +    value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT));
>> +
>> +    return value;
>> +}
>> +
>>   static void __setup_engine_capabilities(struct intel_engine_cs 
>> *engine)
>>   {
>>       struct drm_i915_private *i915 = engine->i915;
>> diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c 
>> b/drivers/gpu/drm/i915/gt/sysfs_engines.c
>> index 967031056202..f2d9858d827c 100644
>> --- a/drivers/gpu/drm/i915/gt/sysfs_engines.c
>> +++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c
>> @@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>              const char *buf, size_t count)
>>   {
>>       struct intel_engine_cs *engine = kobj_to_engine(kobj);
>> -    unsigned long long duration;
>> +    unsigned long long duration, clamped;
>>       int err;
>>         /*
>> @@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>       if (err)
>>           return err;
>>   -    if (duration > jiffies_to_nsecs(2))
>> +    clamped = intel_clamp_max_busywait_duration_ns(engine, duration);
>> +    if (duration != clamped)
>>           return -EINVAL;
>>         WRITE_ONCE(engine->props.max_busywait_duration_ns, duration);
>> @@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>           const char *buf, size_t count)
>>   {
>>       struct intel_engine_cs *engine = kobj_to_engine(kobj);
>> -    unsigned long long duration;
>> +    unsigned long long duration, clamped;
>>       int err;
>>         /*
>> @@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>       if (err)
>>           return err;
>>   -    if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
>> +    clamped = intel_clamp_timeslice_duration_ms(engine, duration);
>> +    if (duration != clamped)
>>           return -EINVAL;
>>         WRITE_ONCE(engine->props.timeslice_duration_ms, duration);
>> @@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>          const char *buf, size_t count)
>>   {
>>       struct intel_engine_cs *engine = kobj_to_engine(kobj);
>> -    unsigned long long duration;
>> +    unsigned long long duration, clamped;
>>       int err;
>>         /*
>> @@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>       if (err)
>>           return err;
>>   -    if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
>> +    clamped = intel_clamp_stop_timeout_ms(engine, duration);
>> +    if (duration != clamped)
>>           return -EINVAL;
>>         WRITE_ONCE(engine->props.stop_timeout_ms, duration);
>> @@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, 
>> struct kobj_attribute *attr,
>>                 const char *buf, size_t count)
>>   {
>>       struct intel_engine_cs *engine = kobj_to_engine(kobj);
>> -    unsigned long long timeout;
>> +    unsigned long long timeout, clamped;
>>       int err;
>>         /*
>> @@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, 
>> struct kobj_attribute *attr,
>>       if (err)
>>           return err;
>>   -    if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
>> +    clamped = intel_clamp_preempt_timeout_ms(engine, timeout);
>> +    if (timeout != clamped)
>>           return -EINVAL;
>>         WRITE_ONCE(engine->props.preempt_timeout_ms, timeout);
>> @@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>           const char *buf, size_t count)
>>   {
>>       struct intel_engine_cs *engine = kobj_to_engine(kobj);
>> -    unsigned long long delay;
>> +    unsigned long long delay, clamped;
>>       int err;
>>         /*
>> @@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct 
>> kobj_attribute *attr,
>>       if (err)
>>           return err;
>>   -    if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT))
>> +    clamped = intel_clamp_heartbeat_interval_ms(engine, delay);
>> +    if (delay != clamped)
>>           return -EINVAL;
>>         err = intel_engine_set_heartbeat(engine, delay);
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
>> index 4b300b6cc0f9..a2d574f2fdd5 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
>> @@ -262,6 +262,15 @@ struct guc_lrc_desc {
>>     #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
>>   +/*
>> + * GuC converts the timeout to clock ticks internally. Different 
>> platforms have
>> + * different GuC clocks. Thus, the maximum value before overflow is 
>> platform
>> + * dependent. Current worst case scenario is about 110s. So, limit 
>> to 100s to be
>> + * safe.
>> + */
>> +#define GUC_POLICY_MAX_EXEC_QUANTUM_MS        (100 * 1000)
>> +#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS    (100 * 1000)
>> +
>>   struct guc_policies {
>>       u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
>>       /* In micro seconds. How much time to allow before DPC 
>> processing is


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-08  9:41   ` Tvrtko Ursulin
@ 2022-03-09 21:16     ` John Harrison
  2022-03-10  9:27       ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: John Harrison @ 2022-03-09 21:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: Michal Mrozek, DRI-Devel

On 3/8/2022 01:41, Tvrtko Ursulin wrote:
> On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> A workaround was added to the driver to allow OpenCL workloads to run
>> 'forever' by disabling pre-emption on the RCS engine for Gen12.
>> It is not totally unbound as the heartbeat will kick in eventually
>> and cause a reset of the hung engine.
>>
>> However, this does not work well in GuC submission mode. In GuC mode,
>> the pre-emption timeout is how GuC detects hung contexts and triggers
>> a per engine reset. Thus, disabling the timeout means also losing all
>> per engine reset ability. A full GT reset will still occur when the
>> heartbeat finally expires, but that is a much more destructive and
>> undesirable mechanism.
>>
>> The purpose of the workaround is actually to give OpenCL tasks longer
>> to reach a pre-emption point after a pre-emption request has been
>> issued. This is necessary because Gen12 does not support mid-thread
>> pre-emption and OpenCL can have long running threads.
>>
>> So, rather than disabling the timeout completely, just set it to a
>> 'long' value.
>>
>> v2: Review feedback from Tvrtko - must hard code the 'long' value
>> instead of determining it algorithmically. So make it an extra CONFIG
>> definition. Also, remove the execlist centric comment from the
>> existing pre-emption timeout CONFIG option given that it applies to
>> more than just execlists.
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> 
>> (v1)
>> Acked-by: Michal Mrozek <michal.mrozek@intel.com>
>> ---
>>   drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>>   2 files changed, 29 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
>> b/drivers/gpu/drm/i915/Kconfig.profile
>> index 39328567c200..7cc38d25ee5c 100644
>> --- a/drivers/gpu/drm/i915/Kconfig.profile
>> +++ b/drivers/gpu/drm/i915/Kconfig.profile
>> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>>       default 640 # milliseconds
>>       help
>>         How long to wait (in milliseconds) for a preemption event to 
>> occur
>> -      when submitting a new context via execlists. If the current 
>> context
>> -      does not hit an arbitration point and yield to HW before the 
>> timer
>> -      expires, the HW will be reset to allow the more important context
>> -      to execute.
>> +      when submitting a new context. If the current context does not 
>> hit
>> +      an arbitration point and yield to HW before the timer expires, 
>> the
>> +      HW will be reset to allow the more important context to execute.
>> +
>> +      This is adjustable via
>> +      /sys/class/drm/card?/engine/*/preempt_timeout_ms
>> +
>> +      May be 0 to disable the timeout.
>> +
>> +      The compiled in default may get overridden at driver probe 
>> time on
>> +      certain platforms and certain engines which will be reflected 
>> in the
>> +      sysfs control.
>> +
>> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
>> +    int "Preempt timeout for compute engines (ms, jiffy granularity)"
>> +    default 7500 # milliseconds
>> +    help
>> +      How long to wait (in milliseconds) for a preemption event to 
>> occur
>> +      when submitting a new context to a compute capable engine. If the
>> +      current context does not hit an arbitration point and yield to HW
>> +      before the timer expires, the HW will be reset to allow the more
>> +      important context to execute.
>>           This is adjustable via
>>         /sys/class/drm/card?/engine/*/preempt_timeout_ms
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> index 4185c7338581..cc0954ad836a 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt 
>> *gt, enum intel_engine_id id,
>>       engine->props.timeslice_duration_ms =
>>           CONFIG_DRM_I915_TIMESLICE_DURATION;
>>   -    /* Override to uninterruptible for OpenCL workloads. */
>> +    /*
>> +     * Mid-thread pre-emption is not available in Gen12. Unfortunately,
>> +     * some OpenCL workloads run quite long threads. That means they 
>> get
>> +     * reset due to not pre-empting in a timely manner. So, bump the
>> +     * pre-emption timeout value to be much higher for compute engines.
>> +     */
>>       if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
>> I915_ENGINE_HAS_RCS_REG_STATE))
>> -        engine->props.preempt_timeout_ms = 0;
>> +        engine->props.preempt_timeout_ms = 
>> CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
>
> I wouldn't go as far as adding a config option since as it is it only 
> applies to Gen12 but Kconfig text says nothing about that. And I am 
> not saying you should add a Gen12 specific config option, that would 
> be weird. So IMO just drop it.
>
You were the one arguing that the driver was illegally overriding the 
user's explicitly chosen settings, including the compile time config 
options. Just having a hardcoded magic number in the driver is the 
absolute worst kind of override there is.

And technically, the config option is not Gen12 specific. It is actually 
compute specific, hence the name. It just so happens that only Gen12 
onwards has compute engines. I can add an extra line to the Kconfig 
description if you want "NB: compute engines only exist on Gen12 but do 
include the RCS engine on Gen12".

John.


> Regards,
>
> Tvrtko
>
>>         /* Cap properties according to any system limits */
>>   #define CLAMP_PROP(field) \


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-09 21:16     ` John Harrison
@ 2022-03-10  9:27       ` Tvrtko Ursulin
  2022-03-10 20:24         ` John Harrison
  0 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2022-03-10  9:27 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: Michal Mrozek, DRI-Devel


On 09/03/2022 21:16, John Harrison wrote:
> On 3/8/2022 01:41, Tvrtko Ursulin wrote:
>> On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> A workaround was added to the driver to allow OpenCL workloads to run
>>> 'forever' by disabling pre-emption on the RCS engine for Gen12.
>>> It is not totally unbound as the heartbeat will kick in eventually
>>> and cause a reset of the hung engine.
>>>
>>> However, this does not work well in GuC submission mode. In GuC mode,
>>> the pre-emption timeout is how GuC detects hung contexts and triggers
>>> a per engine reset. Thus, disabling the timeout means also losing all
>>> per engine reset ability. A full GT reset will still occur when the
>>> heartbeat finally expires, but that is a much more destructive and
>>> undesirable mechanism.
>>>
>>> The purpose of the workaround is actually to give OpenCL tasks longer
>>> to reach a pre-emption point after a pre-emption request has been
>>> issued. This is necessary because Gen12 does not support mid-thread
>>> pre-emption and OpenCL can have long running threads.
>>>
>>> So, rather than disabling the timeout completely, just set it to a
>>> 'long' value.
>>>
>>> v2: Review feedback from Tvrtko - must hard code the 'long' value
>>> instead of determining it algorithmically. So make it an extra CONFIG
>>> definition. Also, remove the execlist centric comment from the
>>> existing pre-emption timeout CONFIG option given that it applies to
>>> more than just execlists.
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> 
>>> (v1)
>>> Acked-by: Michal Mrozek <michal.mrozek@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/Kconfig.profile      | 26 +++++++++++++++++++----
>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>>>   2 files changed, 29 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
>>> b/drivers/gpu/drm/i915/Kconfig.profile
>>> index 39328567c200..7cc38d25ee5c 100644
>>> --- a/drivers/gpu/drm/i915/Kconfig.profile
>>> +++ b/drivers/gpu/drm/i915/Kconfig.profile
>>> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>>>       default 640 # milliseconds
>>>       help
>>>         How long to wait (in milliseconds) for a preemption event to 
>>> occur
>>> -      when submitting a new context via execlists. If the current 
>>> context
>>> -      does not hit an arbitration point and yield to HW before the 
>>> timer
>>> -      expires, the HW will be reset to allow the more important context
>>> -      to execute.
>>> +      when submitting a new context. If the current context does not 
>>> hit
>>> +      an arbitration point and yield to HW before the timer expires, 
>>> the
>>> +      HW will be reset to allow the more important context to execute.
>>> +
>>> +      This is adjustable via
>>> +      /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>> +
>>> +      May be 0 to disable the timeout.
>>> +
>>> +      The compiled in default may get overridden at driver probe 
>>> time on
>>> +      certain platforms and certain engines which will be reflected 
>>> in the
>>> +      sysfs control.
>>> +
>>> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
>>> +    int "Preempt timeout for compute engines (ms, jiffy granularity)"
>>> +    default 7500 # milliseconds
>>> +    help
>>> +      How long to wait (in milliseconds) for a preemption event to 
>>> occur
>>> +      when submitting a new context to a compute capable engine. If the
>>> +      current context does not hit an arbitration point and yield to HW
>>> +      before the timer expires, the HW will be reset to allow the more
>>> +      important context to execute.
>>>           This is adjustable via
>>>         /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index 4185c7338581..cc0954ad836a 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt 
>>> *gt, enum intel_engine_id id,
>>>       engine->props.timeslice_duration_ms =
>>>           CONFIG_DRM_I915_TIMESLICE_DURATION;
>>>   -    /* Override to uninterruptible for OpenCL workloads. */
>>> +    /*
>>> +     * Mid-thread pre-emption is not available in Gen12. Unfortunately,
>>> +     * some OpenCL workloads run quite long threads. That means they 
>>> get
>>> +     * reset due to not pre-empting in a timely manner. So, bump the
>>> +     * pre-emption timeout value to be much higher for compute engines.
>>> +     */
>>>       if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
>>> I915_ENGINE_HAS_RCS_REG_STATE))
>>> -        engine->props.preempt_timeout_ms = 0;
>>> +        engine->props.preempt_timeout_ms = 
>>> CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
>>
>> I wouldn't go as far as adding a config option since as it is it only 
>> applies to Gen12 but Kconfig text says nothing about that. And I am 
>> not saying you should add a Gen12 specific config option, that would 
>> be weird. So IMO just drop it.
>>
> You were the one arguing that the driver was illegally overriding the 
> user's explicitly chosen settings, including the compile time config 

This is a bit out of context and illegally don't think used, so 
misrepresents the earlier discussion. And I certainly did not suggest a 
kconfig option.

> options. Just having a hardcoded magic number in the driver is the 
> absolute worst kind of override there is.
 >
> And technically, the config option is not Gen12 specific. It is actually 
> compute specific, hence the name. It just so happens that only Gen12 
> onwards has compute engines. I can add an extra line to the Kconfig 
> description if you want "NB: compute engines only exist on Gen12 but do 
> include the RCS engine on Gen12".

I am not unconditionally against it but it feels it creates more 
problems than gives solutions.

In kconfig help you say "compute *capable* engine". Here you say only 
Gen12 has compute engines. Well before Gen12 render is compute capable, 
but then how implemented it does not apply which is not good.

Given the runtime override has the only purpose of working around broken 
hardware then I'd still say to drop it. But if you can come up with help 
text which won't be misleading and still not overly complicated I am not 
opposing it.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-10  9:27       ` Tvrtko Ursulin
@ 2022-03-10 20:24         ` John Harrison
  2022-03-11 10:07           ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: John Harrison @ 2022-03-10 20:24 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-GFX; +Cc: DRI-Devel

On 3/10/2022 01:27, Tvrtko Ursulin wrote:
> On 09/03/2022 21:16, John Harrison wrote:
>> On 3/8/2022 01:41, Tvrtko Ursulin wrote:
>>> On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>
>>>> A workaround was added to the driver to allow OpenCL workloads to run
>>>> 'forever' by disabling pre-emption on the RCS engine for Gen12.
>>>> It is not totally unbound as the heartbeat will kick in eventually
>>>> and cause a reset of the hung engine.
>>>>
>>>> However, this does not work well in GuC submission mode. In GuC mode,
>>>> the pre-emption timeout is how GuC detects hung contexts and triggers
>>>> a per engine reset. Thus, disabling the timeout means also losing all
>>>> per engine reset ability. A full GT reset will still occur when the
>>>> heartbeat finally expires, but that is a much more destructive and
>>>> undesirable mechanism.
>>>>
>>>> The purpose of the workaround is actually to give OpenCL tasks longer
>>>> to reach a pre-emption point after a pre-emption request has been
>>>> issued. This is necessary because Gen12 does not support mid-thread
>>>> pre-emption and OpenCL can have long running threads.
>>>>
>>>> So, rather than disabling the timeout completely, just set it to a
>>>> 'long' value.
>>>>
>>>> v2: Review feedback from Tvrtko - must hard code the 'long' value
>>>> instead of determining it algorithmically. So make it an extra CONFIG
>>>> definition. Also, remove the execlist centric comment from the
>>>> existing pre-emption timeout CONFIG option given that it applies to
>>>> more than just execlists.
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>> Reviewed-by: Daniele Ceraolo Spurio 
>>>> <daniele.ceraolospurio@intel.com> (v1)
>>>> Acked-by: Michal Mrozek <michal.mrozek@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/Kconfig.profile      | 26 
>>>> +++++++++++++++++++----
>>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>>>>   2 files changed, 29 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
>>>> b/drivers/gpu/drm/i915/Kconfig.profile
>>>> index 39328567c200..7cc38d25ee5c 100644
>>>> --- a/drivers/gpu/drm/i915/Kconfig.profile
>>>> +++ b/drivers/gpu/drm/i915/Kconfig.profile
>>>> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>>>>       default 640 # milliseconds
>>>>       help
>>>>         How long to wait (in milliseconds) for a preemption event 
>>>> to occur
>>>> -      when submitting a new context via execlists. If the current 
>>>> context
>>>> -      does not hit an arbitration point and yield to HW before the 
>>>> timer
>>>> -      expires, the HW will be reset to allow the more important 
>>>> context
>>>> -      to execute.
>>>> +      when submitting a new context. If the current context does 
>>>> not hit
>>>> +      an arbitration point and yield to HW before the timer 
>>>> expires, the
>>>> +      HW will be reset to allow the more important context to 
>>>> execute.
>>>> +
>>>> +      This is adjustable via
>>>> +      /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>>> +
>>>> +      May be 0 to disable the timeout.
>>>> +
>>>> +      The compiled in default may get overridden at driver probe 
>>>> time on
>>>> +      certain platforms and certain engines which will be 
>>>> reflected in the
>>>> +      sysfs control.
>>>> +
>>>> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
>>>> +    int "Preempt timeout for compute engines (ms, jiffy granularity)"
>>>> +    default 7500 # milliseconds
>>>> +    help
>>>> +      How long to wait (in milliseconds) for a preemption event to 
>>>> occur
>>>> +      when submitting a new context to a compute capable engine. 
>>>> If the
>>>> +      current context does not hit an arbitration point and yield 
>>>> to HW
>>>> +      before the timer expires, the HW will be reset to allow the 
>>>> more
>>>> +      important context to execute.
>>>>           This is adjustable via
>>>>         /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>> index 4185c7338581..cc0954ad836a 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>> @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt 
>>>> *gt, enum intel_engine_id id,
>>>>       engine->props.timeslice_duration_ms =
>>>>           CONFIG_DRM_I915_TIMESLICE_DURATION;
>>>>   -    /* Override to uninterruptible for OpenCL workloads. */
>>>> +    /*
>>>> +     * Mid-thread pre-emption is not available in Gen12. 
>>>> Unfortunately,
>>>> +     * some OpenCL workloads run quite long threads. That means 
>>>> they get
>>>> +     * reset due to not pre-empting in a timely manner. So, bump the
>>>> +     * pre-emption timeout value to be much higher for compute 
>>>> engines.
>>>> +     */
>>>>       if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
>>>> I915_ENGINE_HAS_RCS_REG_STATE))
>>>> -        engine->props.preempt_timeout_ms = 0;
>>>> +        engine->props.preempt_timeout_ms = 
>>>> CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
>>>
>>> I wouldn't go as far as adding a config option since as it is it 
>>> only applies to Gen12 but Kconfig text says nothing about that. And 
>>> I am not saying you should add a Gen12 specific config option, that 
>>> would be weird. So IMO just drop it.
>>>
>> You were the one arguing that the driver was illegally overriding the 
>> user's explicitly chosen settings, including the compile time config 
>
> This is a bit out of context and illegally don't think used, so 
> misrepresents the earlier discussion. And I certainly did not suggest 
> a kconfig option.
My recollection is that you clearly stated the i915 driver should not be 
overriding the user's settings. To me, that makes any override an 
illegal operation.

You did not suggest a Kconfig option but the settings in question are 
all coming from existing Kconfig options. Putting an explicit "timeout = 
7500;" in the code is the worst of all worlds. It is an override of a 
user setting and it is an unmodifiable magic number. The first you have 
stated is not allowed and the second is one of the biggest no-no's of 
any code review. Magic number randomly splatted in the code? Nack, do it 
properly.

So in this case, I don't see that there is much choice except to add a 
new Kconfig option for the override.

>
>> options. Just having a hardcoded magic number in the driver is the 
>> absolute worst kind of override there is.
> >
>> And technically, the config option is not Gen12 specific. It is 
>> actually compute specific, hence the name. It just so happens that 
>> only Gen12 onwards has compute engines. I can add an extra line to 
>> the Kconfig description if you want "NB: compute engines only exist 
>> on Gen12 but do include the RCS engine on Gen12".
>
> I am not unconditionally against it but it feels it creates more 
> problems than gives solutions.
>
> In kconfig help you say "compute *capable* engine". Here you say only 
> Gen12 has compute engines. Well before Gen12 render is compute 
> capable, but then how implemented it does not apply which is not good.
Sorry, yes. For some reason I was thinking compute came in with Gen12.

>
> Given the runtime override has the only purpose of working around 
> broken hardware then I'd still say to drop it. But if you can come up 
> with help text which won't be misleading and still not overly 
> complicated I am not opposing it.
So "when submitting a new context to a compute capable engine on Gen12 
and later platforms"? And maybe add a _GEN12 suffix to the config name 
itself?

John.


>
> Regards,
>
> Tvrtko


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-10 20:24         ` John Harrison
@ 2022-03-11 10:07           ` Tvrtko Ursulin
  2022-03-11 10:39             ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2022-03-11 10:07 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 10/03/2022 20:24, John Harrison wrote:
> On 3/10/2022 01:27, Tvrtko Ursulin wrote:
>> On 09/03/2022 21:16, John Harrison wrote:
>>> On 3/8/2022 01:41, Tvrtko Ursulin wrote:
>>>> On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> A workaround was added to the driver to allow OpenCL workloads to run
>>>>> 'forever' by disabling pre-emption on the RCS engine for Gen12.
>>>>> It is not totally unbound as the heartbeat will kick in eventually
>>>>> and cause a reset of the hung engine.
>>>>>
>>>>> However, this does not work well in GuC submission mode. In GuC mode,
>>>>> the pre-emption timeout is how GuC detects hung contexts and triggers
>>>>> a per engine reset. Thus, disabling the timeout means also losing all
>>>>> per engine reset ability. A full GT reset will still occur when the
>>>>> heartbeat finally expires, but that is a much more destructive and
>>>>> undesirable mechanism.
>>>>>
>>>>> The purpose of the workaround is actually to give OpenCL tasks longer
>>>>> to reach a pre-emption point after a pre-emption request has been
>>>>> issued. This is necessary because Gen12 does not support mid-thread
>>>>> pre-emption and OpenCL can have long running threads.
>>>>>
>>>>> So, rather than disabling the timeout completely, just set it to a
>>>>> 'long' value.
>>>>>
>>>>> v2: Review feedback from Tvrtko - must hard code the 'long' value
>>>>> instead of determining it algorithmically. So make it an extra CONFIG
>>>>> definition. Also, remove the execlist centric comment from the
>>>>> existing pre-emption timeout CONFIG option given that it applies to
>>>>> more than just execlists.
>>>>>
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> Reviewed-by: Daniele Ceraolo Spurio 
>>>>> <daniele.ceraolospurio@intel.com> (v1)
>>>>> Acked-by: Michal Mrozek <michal.mrozek@intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/Kconfig.profile      | 26 
>>>>> +++++++++++++++++++----
>>>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>>>>>   2 files changed, 29 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
>>>>> b/drivers/gpu/drm/i915/Kconfig.profile
>>>>> index 39328567c200..7cc38d25ee5c 100644
>>>>> --- a/drivers/gpu/drm/i915/Kconfig.profile
>>>>> +++ b/drivers/gpu/drm/i915/Kconfig.profile
>>>>> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>>>>>       default 640 # milliseconds
>>>>>       help
>>>>>         How long to wait (in milliseconds) for a preemption event 
>>>>> to occur
>>>>> -      when submitting a new context via execlists. If the current 
>>>>> context
>>>>> -      does not hit an arbitration point and yield to HW before the 
>>>>> timer
>>>>> -      expires, the HW will be reset to allow the more important 
>>>>> context
>>>>> -      to execute.
>>>>> +      when submitting a new context. If the current context does 
>>>>> not hit
>>>>> +      an arbitration point and yield to HW before the timer 
>>>>> expires, the
>>>>> +      HW will be reset to allow the more important context to 
>>>>> execute.
>>>>> +
>>>>> +      This is adjustable via
>>>>> +      /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>>>> +
>>>>> +      May be 0 to disable the timeout.
>>>>> +
>>>>> +      The compiled in default may get overridden at driver probe 
>>>>> time on
>>>>> +      certain platforms and certain engines which will be 
>>>>> reflected in the
>>>>> +      sysfs control.
>>>>> +
>>>>> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
>>>>> +    int "Preempt timeout for compute engines (ms, jiffy granularity)"
>>>>> +    default 7500 # milliseconds
>>>>> +    help
>>>>> +      How long to wait (in milliseconds) for a preemption event to 
>>>>> occur
>>>>> +      when submitting a new context to a compute capable engine. 
>>>>> If the
>>>>> +      current context does not hit an arbitration point and yield 
>>>>> to HW
>>>>> +      before the timer expires, the HW will be reset to allow the 
>>>>> more
>>>>> +      important context to execute.
>>>>>           This is adjustable via
>>>>>         /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> index 4185c7338581..cc0954ad836a 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>> @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt 
>>>>> *gt, enum intel_engine_id id,
>>>>>       engine->props.timeslice_duration_ms =
>>>>>           CONFIG_DRM_I915_TIMESLICE_DURATION;
>>>>>   -    /* Override to uninterruptible for OpenCL workloads. */
>>>>> +    /*
>>>>> +     * Mid-thread pre-emption is not available in Gen12. 
>>>>> Unfortunately,
>>>>> +     * some OpenCL workloads run quite long threads. That means 
>>>>> they get
>>>>> +     * reset due to not pre-empting in a timely manner. So, bump the
>>>>> +     * pre-emption timeout value to be much higher for compute 
>>>>> engines.
>>>>> +     */
>>>>>       if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
>>>>> I915_ENGINE_HAS_RCS_REG_STATE))
>>>>> -        engine->props.preempt_timeout_ms = 0;
>>>>> +        engine->props.preempt_timeout_ms = 
>>>>> CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
>>>>
>>>> I wouldn't go as far as adding a config option since as it is it 
>>>> only applies to Gen12 but Kconfig text says nothing about that. And 
>>>> I am not saying you should add a Gen12 specific config option, that 
>>>> would be weird. So IMO just drop it.
>>>>
>>> You were the one arguing that the driver was illegally overriding the 
>>> user's explicitly chosen settings, including the compile time config 
>>
>> This is a bit out of context and illegally don't think used, so 
>> misrepresents the earlier discussion. And I certainly did not suggest 
>> a kconfig option.
> My recollection is that you clearly stated the i915 driver should not be 
> overriding the user's settings. To me, that makes any override an 
> illegal operation.
> 
> You did not suggest a Kconfig option but the settings in question are 
> all coming from existing Kconfig options. Putting an explicit "timeout = 
> 7500;" in the code is the worst of all worlds. It is an override of a 
> user setting and it is an unmodifiable magic number. The first you have 
> stated is not allowed and the second is one of the biggest no-no's of 
> any code review. Magic number randomly splatted in the code? Nack, do it 
> properly.
> 
> So in this case, I don't see that there is much choice except to add a 
> new Kconfig option for the override.

 From memory, I don't think I said override is not allowed. I used the 
override argument in a different context. But honestly I don't feel like 
digging that up at this point since this is just going on for too long.

In principle adding kconfig options should be avoided and in this case 
question is cost vs benefit. What is the benefit? Who will tune it, why, 
and using what knowledge?

I have asked if we can get compute UMD to give us some numbers relating 
to typical desktop workloads but did not get anything.

Currently we override to zero, which is what they wanted. Now we are 
thinking of overriding to 7.5s, which they acked, but it's not very 
transparent what is the thinking behind it.

It simply looks we said 7.5s because that's what gives similar worst 
case before reset compared to existing out of the box setup, while 
allowing GuC engine resets to actually work.

I'd personally go for 2.5s, for the same weak reasons of it being 
similar to existing timeout, and extension of every heartbeat interval. 
But you thought 2.5s was too short, I guess, or preferred to view 
heartbeat as decoupled timeline (barring the last one which *has* to 
couple). Which is fine by me. So we agreed to compromise on that and 
moved on.

So meh. What we end up with is not worse than it was and not having a 
kconfig saves you a complication...

>>> options. Just having a hardcoded magic number in the driver is the 
>>> absolute worst kind of override there is.
>> >
>>> And technically, the config option is not Gen12 specific. It is 
>>> actually compute specific, hence the name. It just so happens that 
>>> only Gen12 onwards has compute engines. I can add an extra line to 
>>> the Kconfig description if you want "NB: compute engines only exist 
>>> on Gen12 but do include the RCS engine on Gen12".
>>
>> I am not unconditionally against it but it feels it creates more 
>> problems than gives solutions.
>>
>> In kconfig help you say "compute *capable* engine". Here you say only 
>> Gen12 has compute engines. Well before Gen12 render is compute 
>> capable, but then how implemented it does not apply which is not good.
> Sorry, yes. For some reason I was thinking compute came in with Gen12.
> 
>>
>> Given the runtime override has the only purpose of working around 
>> broken hardware then I'd still say to drop it. But if you can come up 
>> with help text which won't be misleading and still not overly 
>> complicated I am not opposing it.
> So "when submitting a new context to a compute capable engine on Gen12 
> and later platforms"? And maybe add a _GEN12 suffix to the config name 
> itself?

..."and later" would be wrong, you'd have to change it at some point. 
Not least that the patch as proposed does "== 12", not ">= 12". And we 
can't use the term Gen right? List all the affected platform names? Keep 
patching up the help text as platforms are added? Or do we know the end 
point already?

In summary, I will not oppose it if we can have a kconfig text which is 
accurate, useful and not a maintenance burden.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission
  2022-03-11 10:07           ` Tvrtko Ursulin
@ 2022-03-11 10:39             ` Tvrtko Ursulin
  0 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2022-03-11 10:39 UTC (permalink / raw)
  To: John Harrison, Intel-GFX; +Cc: DRI-Devel


On 11/03/2022 10:07, Tvrtko Ursulin wrote:
> 
> On 10/03/2022 20:24, John Harrison wrote:
>> On 3/10/2022 01:27, Tvrtko Ursulin wrote:
>>> On 09/03/2022 21:16, John Harrison wrote:
>>>> On 3/8/2022 01:41, Tvrtko Ursulin wrote:
>>>>> On 03/03/2022 22:37, John.C.Harrison@Intel.com wrote:
>>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>>
>>>>>> A workaround was added to the driver to allow OpenCL workloads to run
>>>>>> 'forever' by disabling pre-emption on the RCS engine for Gen12.
>>>>>> It is not totally unbound as the heartbeat will kick in eventually
>>>>>> and cause a reset of the hung engine.
>>>>>>
>>>>>> However, this does not work well in GuC submission mode. In GuC mode,
>>>>>> the pre-emption timeout is how GuC detects hung contexts and triggers
>>>>>> a per engine reset. Thus, disabling the timeout means also losing all
>>>>>> per engine reset ability. A full GT reset will still occur when the
>>>>>> heartbeat finally expires, but that is a much more destructive and
>>>>>> undesirable mechanism.
>>>>>>
>>>>>> The purpose of the workaround is actually to give OpenCL tasks longer
>>>>>> to reach a pre-emption point after a pre-emption request has been
>>>>>> issued. This is necessary because Gen12 does not support mid-thread
>>>>>> pre-emption and OpenCL can have long running threads.
>>>>>>
>>>>>> So, rather than disabling the timeout completely, just set it to a
>>>>>> 'long' value.
>>>>>>
>>>>>> v2: Review feedback from Tvrtko - must hard code the 'long' value
>>>>>> instead of determining it algorithmically. So make it an extra CONFIG
>>>>>> definition. Also, remove the execlist centric comment from the
>>>>>> existing pre-emption timeout CONFIG option given that it applies to
>>>>>> more than just execlists.
>>>>>>
>>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>>> Reviewed-by: Daniele Ceraolo Spurio 
>>>>>> <daniele.ceraolospurio@intel.com> (v1)
>>>>>> Acked-by: Michal Mrozek <michal.mrozek@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/Kconfig.profile      | 26 
>>>>>> +++++++++++++++++++----
>>>>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c |  9 ++++++--
>>>>>>   2 files changed, 29 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
>>>>>> b/drivers/gpu/drm/i915/Kconfig.profile
>>>>>> index 39328567c200..7cc38d25ee5c 100644
>>>>>> --- a/drivers/gpu/drm/i915/Kconfig.profile
>>>>>> +++ b/drivers/gpu/drm/i915/Kconfig.profile
>>>>>> @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT
>>>>>>       default 640 # milliseconds
>>>>>>       help
>>>>>>         How long to wait (in milliseconds) for a preemption event 
>>>>>> to occur
>>>>>> -      when submitting a new context via execlists. If the current 
>>>>>> context
>>>>>> -      does not hit an arbitration point and yield to HW before 
>>>>>> the timer
>>>>>> -      expires, the HW will be reset to allow the more important 
>>>>>> context
>>>>>> -      to execute.
>>>>>> +      when submitting a new context. If the current context does 
>>>>>> not hit
>>>>>> +      an arbitration point and yield to HW before the timer 
>>>>>> expires, the
>>>>>> +      HW will be reset to allow the more important context to 
>>>>>> execute.
>>>>>> +
>>>>>> +      This is adjustable via
>>>>>> +      /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>>>>> +
>>>>>> +      May be 0 to disable the timeout.
>>>>>> +
>>>>>> +      The compiled in default may get overridden at driver probe 
>>>>>> time on
>>>>>> +      certain platforms and certain engines which will be 
>>>>>> reflected in the
>>>>>> +      sysfs control.
>>>>>> +
>>>>>> +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE
>>>>>> +    int "Preempt timeout for compute engines (ms, jiffy 
>>>>>> granularity)"
>>>>>> +    default 7500 # milliseconds
>>>>>> +    help
>>>>>> +      How long to wait (in milliseconds) for a preemption event 
>>>>>> to occur
>>>>>> +      when submitting a new context to a compute capable engine. 
>>>>>> If the
>>>>>> +      current context does not hit an arbitration point and yield 
>>>>>> to HW
>>>>>> +      before the timer expires, the HW will be reset to allow the 
>>>>>> more
>>>>>> +      important context to execute.
>>>>>>           This is adjustable via
>>>>>>         /sys/class/drm/card?/engine/*/preempt_timeout_ms
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>>>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> index 4185c7338581..cc0954ad836a 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>>>>> @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt 
>>>>>> *gt, enum intel_engine_id id,
>>>>>>       engine->props.timeslice_duration_ms =
>>>>>>           CONFIG_DRM_I915_TIMESLICE_DURATION;
>>>>>>   -    /* Override to uninterruptible for OpenCL workloads. */
>>>>>> +    /*
>>>>>> +     * Mid-thread pre-emption is not available in Gen12. 
>>>>>> Unfortunately,
>>>>>> +     * some OpenCL workloads run quite long threads. That means 
>>>>>> they get
>>>>>> +     * reset due to not pre-empting in a timely manner. So, bump the
>>>>>> +     * pre-emption timeout value to be much higher for compute 
>>>>>> engines.
>>>>>> +     */
>>>>>>       if (GRAPHICS_VER(i915) == 12 && (engine->flags & 
>>>>>> I915_ENGINE_HAS_RCS_REG_STATE))
>>>>>> -        engine->props.preempt_timeout_ms = 0;
>>>>>> +        engine->props.preempt_timeout_ms = 
>>>>>> CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE;
>>>>>
>>>>> I wouldn't go as far as adding a config option since as it is it 
>>>>> only applies to Gen12 but Kconfig text says nothing about that. And 
>>>>> I am not saying you should add a Gen12 specific config option, that 
>>>>> would be weird. So IMO just drop it.
>>>>>
>>>> You were the one arguing that the driver was illegally overriding 
>>>> the user's explicitly chosen settings, including the compile time 
>>>> config 
>>>
>>> This is a bit out of context and illegally don't think used, so 
>>> misrepresents the earlier discussion. And I certainly did not suggest 
>>> a kconfig option.
>> My recollection is that you clearly stated the i915 driver should not 
>> be overriding the user's settings. To me, that makes any override an 
>> illegal operation.
>>
>> You did not suggest a Kconfig option but the settings in question are 
>> all coming from existing Kconfig options. Putting an explicit "timeout 
>> = 7500;" in the code is the worst of all worlds. It is an override of 
>> a user setting and it is an unmodifiable magic number. The first you 
>> have stated is not allowed and the second is one of the biggest 
>> no-no's of any code review. Magic number randomly splatted in the 
>> code? Nack, do it properly.
>>
>> So in this case, I don't see that there is much choice except to add a 
>> new Kconfig option for the override.
> 
>  From memory, I don't think I said override is not allowed. I used the 
> override argument in a different context. But honestly I don't feel like 
> digging that up at this point since this is just going on for too long.
> 
> In principle adding kconfig options should be avoided and in this case 
> question is cost vs benefit. What is the benefit? Who will tune it, why, 
> and using what knowledge?
> 
> I have asked if we can get compute UMD to give us some numbers relating 
> to typical desktop workloads but did not get anything.
> 
> Currently we override to zero, which is what they wanted. Now we are 
> thinking of overriding to 7.5s, which they acked, but it's not very 
> transparent what is the thinking behind it.
> 
> It simply looks we said 7.5s because that's what gives similar worst 
> case before reset compared to existing out of the box setup, while 
> allowing GuC engine resets to actually work.
> 
> I'd personally go for 2.5s, for the same weak reasons of it being 
> similar to existing timeout, and extension of every heartbeat interval. 
> But you thought 2.5s was too short, I guess, or preferred to view 
> heartbeat as decoupled timeline (barring the last one which *has* to 
> couple). Which is fine by me. So we agreed to compromise on that and 
> moved on.
> 
> So meh. What we end up with is not worse than it was and not having a 
> kconfig saves you a complication...
> 
>>>> options. Just having a hardcoded magic number in the driver is the 
>>>> absolute worst kind of override there is.
>>> >
>>>> And technically, the config option is not Gen12 specific. It is 
>>>> actually compute specific, hence the name. It just so happens that 
>>>> only Gen12 onwards has compute engines. I can add an extra line to 
>>>> the Kconfig description if you want "NB: compute engines only exist 
>>>> on Gen12 but do include the RCS engine on Gen12".
>>>
>>> I am not unconditionally against it but it feels it creates more 
>>> problems than gives solutions.
>>>
>>> In kconfig help you say "compute *capable* engine". Here you say only 
>>> Gen12 has compute engines. Well before Gen12 render is compute 
>>> capable, but then how implemented it does not apply which is not good.
>> Sorry, yes. For some reason I was thinking compute came in with Gen12.
>>
>>>
>>> Given the runtime override has the only purpose of working around 
>>> broken hardware then I'd still say to drop it. But if you can come up 
>>> with help text which won't be misleading and still not overly 
>>> complicated I am not opposing it.
>> So "when submitting a new context to a compute capable engine on Gen12 
>> and later platforms"? And maybe add a _GEN12 suffix to the config name 
>> itself?
> 
> ..."and later" would be wrong, you'd have to change it at some point. 
> Not least that the patch as proposed does "== 12", not ">= 12". And we 
> can't use the term Gen right? List all the affected platform names? Keep 
> patching up the help text as platforms are added? Or do we know the end 
> point already?
> 
> In summary, I will not oppose it if we can have a kconfig text which is 
> accurate, useful and not a maintenance burden.

Maybe avoid listing specifics and provide some guidance:

"""
config DRM_I915_OVERRIDE_PREEMPT_TIMEOUT
         int "Override preempt timeout (ms, jiffy granularity)"
         default 7500 # milliseconds
         help
           On certain platforms and engines where supported preemption
           granularity is reduced due hardware limitations, a longer timeout
           than DRM_I915_PREEMPT_TIMEOUT (see respective help text) is
           required.

           Shorter timeouts will have more chance of terminating legitimate
           workloads, while longer can have detrimental effect on desktop
           interactivity and ability to terminate hanging workloads in
           reasonable time.

           Usage of the override timeout will be logged during driver probe
           for each affected engine.
"""

If approach is acceptable also feel free to reword and improve my English.

Not sure whether to have compute in the name of kconfig. But I do like override in the name, so if you add compute I think also keeping override is good since it works well with not having to mention platform specifics in kconfig and it signifies it is special case. Or maybe "workaround"?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-03-11 10:40 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-03 22:37 [PATCH v3 0/4] Improve anti-pre-emption w/a for compute workloads John.C.Harrison
2022-03-03 22:37 ` [Intel-gfx] " John.C.Harrison
2022-03-03 22:37 ` [PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow John.C.Harrison
2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
2022-03-08  9:43   ` Tvrtko Ursulin
2022-03-09 21:10     ` John Harrison
2022-03-03 22:37 ` [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines John.C.Harrison
2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
2022-03-03 23:16   ` Matt Roper
2022-03-03 23:16     ` [Intel-gfx] " Matt Roper
2022-03-03 22:37 ` [PATCH v3 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts John.C.Harrison
2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
2022-03-03 22:37 ` [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission John.C.Harrison
2022-03-03 22:37   ` [Intel-gfx] " John.C.Harrison
2022-03-08  9:03   ` Mrozek, Michal
2022-03-08  9:03     ` [Intel-gfx] " Mrozek, Michal
2022-03-08  9:41   ` Tvrtko Ursulin
2022-03-09 21:16     ` John Harrison
2022-03-10  9:27       ` Tvrtko Ursulin
2022-03-10 20:24         ` John Harrison
2022-03-11 10:07           ` Tvrtko Ursulin
2022-03-11 10:39             ` Tvrtko Ursulin
2022-03-04  0:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Improve anti-pre-emption w/a for compute workloads (rev4) Patchwork
2022-03-04  0:56 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-03-04  1:28 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-03-04 15:09 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2022-03-08 22:34 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Improve anti-pre-emption w/a for compute workloads (rev6) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.