[PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 15:08 ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 15:08 UTC (permalink / raw)
  To: intel-gfx

We need to wait until the timer object is marked as deactivated before
unloading, so follow up our gentle cancel_delayed_work() with the
synchronous variant to ensure it is flushed off a remote cpu before we
mark the memory as freed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111994
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c          | 1 +
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 6 ++++++
 drivers/gpu/drm/i915/gt/intel_gt_requests.h | 1 +
 3 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index c39b21c8d328..b5a9b87e4ec9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -397,6 +397,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
 void intel_gt_driver_late_release(struct intel_gt *gt)
 {
 	intel_uc_driver_late_release(&gt->uc);
+	intel_gt_fini_requests(gt);
 	intel_gt_fini_reset(gt);
 	intel_gt_fini_timelines(gt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index ccbddddbbd52..a79e6efb31a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -130,3 +130,9 @@ void intel_gt_unpark_requests(struct intel_gt *gt)
 	schedule_delayed_work(&gt->requests.retire_work,
 			      round_jiffies_up_relative(HZ));
 }
+
+void intel_gt_fini_requests(struct intel_gt *gt)
+{
+	/* Wait until the work is marked as finished before unloading! */
+	cancel_delayed_work_sync(&gt->requests.retire_work);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
index bd31cbce47e0..fde546424c63 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
@@ -20,5 +20,6 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
 void intel_gt_init_requests(struct intel_gt *gt);
 void intel_gt_park_requests(struct intel_gt *gt);
 void intel_gt_unpark_requests(struct intel_gt *gt);
+void intel_gt_fini_requests(struct intel_gt *gt);
 
 #endif /* INTEL_GT_REQUESTS_H */
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 15:08 ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 15:08 UTC (permalink / raw)
  To: intel-gfx

We need to wait until the timer object is marked as deactivated before
unloading, so follow up our gentle cancel_delayed_work() with the
synchronous variant to ensure it is flushed off a remote cpu before we
mark the memory as freed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111994
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c          | 1 +
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 6 ++++++
 drivers/gpu/drm/i915/gt/intel_gt_requests.h | 1 +
 3 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index c39b21c8d328..b5a9b87e4ec9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -397,6 +397,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
 void intel_gt_driver_late_release(struct intel_gt *gt)
 {
 	intel_uc_driver_late_release(&gt->uc);
+	intel_gt_fini_requests(gt);
 	intel_gt_fini_reset(gt);
 	intel_gt_fini_timelines(gt);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index ccbddddbbd52..a79e6efb31a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -130,3 +130,9 @@ void intel_gt_unpark_requests(struct intel_gt *gt)
 	schedule_delayed_work(&gt->requests.retire_work,
 			      round_jiffies_up_relative(HZ));
 }
+
+void intel_gt_fini_requests(struct intel_gt *gt)
+{
+	/* Wait until the work is marked as finished before unloading! */
+	cancel_delayed_work_sync(&gt->requests.retire_work);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
index bd31cbce47e0..fde546424c63 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
@@ -20,5 +20,6 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
 void intel_gt_init_requests(struct intel_gt *gt);
 void intel_gt_park_requests(struct intel_gt *gt);
 void intel_gt_unpark_requests(struct intel_gt *gt);
+void intel_gt_fini_requests(struct intel_gt *gt);
 
 #endif /* INTEL_GT_REQUESTS_H */
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] drm/i915/selftests: Disable heartbeat around context barrier tests
@ 2019-11-15 15:08   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 15:08 UTC (permalink / raw)
  To: intel-gfx

As the heartbeat has the effect of flushing context barriers, this
interferes with the context barrier tests that are trying to observe
them directly. Disable the heartbeat so that the barriers are as
predictable as the test demands.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_context.c | 44 +++++++++++++++++++---
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index 14ba6ceb9177..3586af636304 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -5,6 +5,7 @@
  */
 
 #include "i915_selftest.h"
+#include "intel_engine_heartbeat.h"
 #include "intel_engine_pm.h"
 #include "intel_gt.h"
 
@@ -200,6 +201,7 @@ static int live_context_size(void *arg)
 static int __live_active_context(struct intel_engine_cs *engine,
 				 struct i915_gem_context *fixme)
 {
+	unsigned long saved_heartbeat;
 	struct intel_context *ce;
 	int pass;
 	int err;
@@ -227,36 +229,50 @@ static int __live_active_context(struct intel_engine_cs *engine,
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	saved_heartbeat = engine->props.heartbeat_interval_ms;
+	engine->props.heartbeat_interval_ms = 0;
+
 	for (pass = 0; pass <= 2; pass++) {
 		struct i915_request *rq;
 
+		intel_engine_pm_get(engine);
+
 		rq = intel_context_create_request(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto err;
+			goto out_engine;
 		}
 
 		err = request_sync(rq);
 		if (err)
-			goto err;
+			goto out_engine;
 
 		/* Context will be kept active until after an idle-barrier. */
 		if (i915_active_is_idle(&ce->active)) {
 			pr_err("context is not active; expected idle-barrier (%s pass %d)\n",
 			       engine->name, pass);
 			err = -EINVAL;
-			goto err;
+			goto out_engine;
 		}
 
 		if (!intel_engine_pm_is_awake(engine)) {
 			pr_err("%s is asleep before idle-barrier\n",
 			       engine->name);
 			err = -EINVAL;
-			goto err;
+			goto out_engine;
 		}
+
+out_engine:
+		intel_engine_pm_put(engine);
+		if (err)
+			goto err;
 	}
 
 	/* Now make sure our idle-barriers are flushed */
+	err = intel_engine_flush_barriers(engine);
+	if (err)
+		goto err;
+
 	err = context_sync(engine->kernel_context);
 	if (err)
 		goto err;
@@ -270,8 +286,9 @@ static int __live_active_context(struct intel_engine_cs *engine,
 		struct drm_printer p = drm_debug_printer(__func__);
 
 		intel_engine_dump(engine, &p,
-				  "%s is still awake after idle-barriers\n",
-				  engine->name);
+				  "%s is still awake:%d after idle-barriers\n",
+				  engine->name,
+				  atomic_read(&engine->wakeref.count));
 		GEM_TRACE_DUMP();
 
 		err = -EINVAL;
@@ -279,6 +296,7 @@ static int __live_active_context(struct intel_engine_cs *engine,
 	}
 
 err:
+	engine->props.heartbeat_interval_ms = saved_heartbeat;
 	intel_context_put(ce);
 	return err;
 }
@@ -349,6 +367,7 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 				 struct i915_gem_context *fixme)
 {
 	struct intel_context *local, *remote;
+	unsigned long saved_heartbeat;
 	int pass;
 	int err;
 
@@ -360,6 +379,12 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 	 * clobber the idle-barrier.
 	 */
 
+	if (intel_engine_pm_is_awake(engine)) {
+		pr_err("%s is awake before starting %s!\n",
+		       engine->name, __func__);
+		return -EINVAL;
+	}
+
 	remote = intel_context_create(fixme, engine);
 	if (IS_ERR(remote))
 		return PTR_ERR(remote);
@@ -370,6 +395,10 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 		goto err_remote;
 	}
 
+	saved_heartbeat = engine->props.heartbeat_interval_ms;
+	engine->props.heartbeat_interval_ms = 0;
+	intel_engine_pm_get(engine);
+
 	for (pass = 0; pass <= 2; pass++) {
 		err = __remote_sync(local, remote);
 		if (err)
@@ -387,6 +416,9 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 		}
 	}
 
+	intel_engine_pm_put(engine);
+	engine->props.heartbeat_interval_ms = saved_heartbeat;
+
 	intel_context_put(local);
 err_remote:
 	intel_context_put(remote);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 2/3] drm/i915/selftests: Disable heartbeat around context barrier tests
@ 2019-11-15 15:08   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 15:08 UTC (permalink / raw)
  To: intel-gfx

As the heartbeat has the effect of flushing context barriers, this
interferes with the context barrier tests that are trying to observe
them directly. Disable the heartbeat so that the barriers are as
predictable as the test demands.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_context.c | 44 +++++++++++++++++++---
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index 14ba6ceb9177..3586af636304 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -5,6 +5,7 @@
  */
 
 #include "i915_selftest.h"
+#include "intel_engine_heartbeat.h"
 #include "intel_engine_pm.h"
 #include "intel_gt.h"
 
@@ -200,6 +201,7 @@ static int live_context_size(void *arg)
 static int __live_active_context(struct intel_engine_cs *engine,
 				 struct i915_gem_context *fixme)
 {
+	unsigned long saved_heartbeat;
 	struct intel_context *ce;
 	int pass;
 	int err;
@@ -227,36 +229,50 @@ static int __live_active_context(struct intel_engine_cs *engine,
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	saved_heartbeat = engine->props.heartbeat_interval_ms;
+	engine->props.heartbeat_interval_ms = 0;
+
 	for (pass = 0; pass <= 2; pass++) {
 		struct i915_request *rq;
 
+		intel_engine_pm_get(engine);
+
 		rq = intel_context_create_request(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto err;
+			goto out_engine;
 		}
 
 		err = request_sync(rq);
 		if (err)
-			goto err;
+			goto out_engine;
 
 		/* Context will be kept active until after an idle-barrier. */
 		if (i915_active_is_idle(&ce->active)) {
 			pr_err("context is not active; expected idle-barrier (%s pass %d)\n",
 			       engine->name, pass);
 			err = -EINVAL;
-			goto err;
+			goto out_engine;
 		}
 
 		if (!intel_engine_pm_is_awake(engine)) {
 			pr_err("%s is asleep before idle-barrier\n",
 			       engine->name);
 			err = -EINVAL;
-			goto err;
+			goto out_engine;
 		}
+
+out_engine:
+		intel_engine_pm_put(engine);
+		if (err)
+			goto err;
 	}
 
 	/* Now make sure our idle-barriers are flushed */
+	err = intel_engine_flush_barriers(engine);
+	if (err)
+		goto err;
+
 	err = context_sync(engine->kernel_context);
 	if (err)
 		goto err;
@@ -270,8 +286,9 @@ static int __live_active_context(struct intel_engine_cs *engine,
 		struct drm_printer p = drm_debug_printer(__func__);
 
 		intel_engine_dump(engine, &p,
-				  "%s is still awake after idle-barriers\n",
-				  engine->name);
+				  "%s is still awake:%d after idle-barriers\n",
+				  engine->name,
+				  atomic_read(&engine->wakeref.count));
 		GEM_TRACE_DUMP();
 
 		err = -EINVAL;
@@ -279,6 +296,7 @@ static int __live_active_context(struct intel_engine_cs *engine,
 	}
 
 err:
+	engine->props.heartbeat_interval_ms = saved_heartbeat;
 	intel_context_put(ce);
 	return err;
 }
@@ -349,6 +367,7 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 				 struct i915_gem_context *fixme)
 {
 	struct intel_context *local, *remote;
+	unsigned long saved_heartbeat;
 	int pass;
 	int err;
 
@@ -360,6 +379,12 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 	 * clobber the idle-barrier.
 	 */
 
+	if (intel_engine_pm_is_awake(engine)) {
+		pr_err("%s is awake before starting %s!\n",
+		       engine->name, __func__);
+		return -EINVAL;
+	}
+
 	remote = intel_context_create(fixme, engine);
 	if (IS_ERR(remote))
 		return PTR_ERR(remote);
@@ -370,6 +395,10 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 		goto err_remote;
 	}
 
+	saved_heartbeat = engine->props.heartbeat_interval_ms;
+	engine->props.heartbeat_interval_ms = 0;
+	intel_engine_pm_get(engine);
+
 	for (pass = 0; pass <= 2; pass++) {
 		err = __remote_sync(local, remote);
 		if (err)
@@ -387,6 +416,9 @@ static int __live_remote_context(struct intel_engine_cs *engine,
 		}
 	}
 
+	intel_engine_pm_put(engine);
+	engine->props.heartbeat_interval_ms = saved_heartbeat;
+
 	intel_context_put(local);
 err_remote:
 	intel_context_put(remote);
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] drm/i915/gt: Track engine round-trip times
@ 2019-11-15 15:08   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 15:08 UTC (permalink / raw)
  To: intel-gfx

Knowing the round trip time of an engine is useful for tracking the
health of the system as well as providing a metric for the baseline
responsiveness of the engine. We can use the latter metric for
automatically tuning our waits in selftests and when idling so we don't
confuse a slower system with a dead one.

Upon idling the engine, we send one last pulse to switch the context
away from precious user state to the volatile kernel context. We know
the engine is idle at this point, and the pulse is non-preemptable, so
this provides us with a good measurement of the round trip time. A
secondary effect is that by installing an interrupt onto the pulse, we
can flush the engine immediately upon completion, curtailing the
background flush and entering powersaving immediately.

v2: Manage pm wakerefs to avoid too early module unload
flush the barrier while we are not watching.

References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    | 36 +++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 11 ++++++
 drivers/gpu/drm/i915/gt/intel_gt_pm.h        |  5 +++
 drivers/gpu/drm/i915/intel_wakeref.h         | 12 +++++++
 5 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b9613d044393..2d11db13dc89 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -334,6 +334,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 	/* Nothing to do here, execute in order of dependencies */
 	engine->schedule = NULL;
 
+	ewma_delay_init(&engine->delay);
 	seqlock_init(&engine->stats.lock);
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
@@ -1477,6 +1478,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		drm_printf(m, "*** WEDGED ***\n");
 
 	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
+	drm_printf(m, "\tDelay: %luus\n", ewma_delay_read(&engine->delay));
 
 	rcu_read_lock();
 	rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 3c0f490ff2c7..e79b14e4599b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -73,6 +73,24 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
 
+struct duration_cb {
+	struct dma_fence_cb cb;
+	ktime_t emitted;
+};
+
+static void duration_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
+{
+	struct duration_cb *dcb = container_of(cb, typeof(*dcb), cb);
+	struct intel_engine_cs *engine = to_request(fence)->engine;
+
+	ewma_delay_add(&engine->delay,
+		       ktime_us_delta(ktime_get(), dcb->emitted));
+
+	/* Kick retire for quicker powersaving (soft-rc6). */
+	mod_delayed_work(system_wq, &engine->gt->requests.retire_work, 0);
+	intel_gt_pm_put(engine->gt);
+}
+
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq;
@@ -114,7 +132,23 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 
 	/* Install ourselves as a preemption barrier */
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
-	__i915_request_commit(rq);
+	if (likely(!__i915_request_commit(rq))) { /* engine should be idle! */
+		struct duration_cb *dcb;
+
+		BUILD_BUG_ON(sizeof(*dcb) > sizeof(rq->submitq));
+		dcb = (struct duration_cb *)&rq->submitq;
+
+		/*
+		 * Use an interrupt for precise measurement of duration,
+		 * otherwise we rely on someone else retiring all the requests
+		 * which may delay the signaling (i.e. we will likely wait
+		 * until the background request retirement running every
+		 * second or two).
+		 */
+		__intel_gt_pm_get(rq->engine->gt);
+		dma_fence_add_callback(&rq->fence, &dcb->cb, duration_cb);
+		dcb->emitted = ktime_get();
+	}
 
 	/* Release our exclusive hold on the engine */
 	__intel_wakeref_defer_park(&engine->wakeref);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 758f0e8ec672..1de121583482 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -7,6 +7,7 @@
 #ifndef __INTEL_ENGINE_TYPES__
 #define __INTEL_ENGINE_TYPES__
 
+#include <linux/average.h>
 #include <linux/hashtable.h>
 #include <linux/irq_work.h>
 #include <linux/kref.h>
@@ -119,6 +120,9 @@ enum intel_engine_id {
 #define INVALID_ENGINE ((enum intel_engine_id)-1)
 };
 
+/* A simple estimator for the round-trip responsive time of an engine */
+DECLARE_EWMA(delay, 6, 4)
+
 struct st_preempt_hang {
 	struct completion completion;
 	unsigned int count;
@@ -316,6 +320,13 @@ struct intel_engine_cs {
 		struct intel_timeline *timeline;
 	} legacy;
 
+	/*
+	 * We track the average duration of the idle pulse on parking the
+	 * engine to keep an estimate of the how the fast the engine is
+	 * under ideal conditions.
+	 */
+	struct ewma_delay delay;
+
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
 	 * heavyweight seqno dance, we delegate the task (of being the
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index b3e17399be9b..c7271b3acde3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -22,6 +22,11 @@ static inline void intel_gt_pm_get(struct intel_gt *gt)
 	intel_wakeref_get(&gt->wakeref);
 }
 
+static inline void __intel_gt_pm_get(struct intel_gt *gt)
+{
+	__intel_wakeref_get(&gt->wakeref);
+}
+
 static inline bool intel_gt_pm_get_if_awake(struct intel_gt *gt)
 {
 	return intel_wakeref_get_if_active(&gt->wakeref);
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index 5f0c972a80fb..2603a177518a 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -84,6 +84,18 @@ intel_wakeref_get(struct intel_wakeref *wf)
 	return 0;
 }
 
+/**
+ * __intel_wakeref_get: Acquire the wakeref, again
+ * @wf: the wakeref
+ *
+ * Acquire a hold on an already active wakeref.
+ */
+static inline void
+__intel_wakeref_get(struct intel_wakeref *wf)
+{
+	atomic_inc(&wf->count);
+}
+
 /**
  * intel_wakeref_get_if_in_use: Acquire the wakeref
  * @wf: the wakeref
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-gfx] [PATCH 3/3] drm/i915/gt: Track engine round-trip times
@ 2019-11-15 15:08   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 15:08 UTC (permalink / raw)
  To: intel-gfx

Knowing the round trip time of an engine is useful for tracking the
health of the system as well as providing a metric for the baseline
responsiveness of the engine. We can use the latter metric for
automatically tuning our waits in selftests and when idling so we don't
confuse a slower system with a dead one.

Upon idling the engine, we send one last pulse to switch the context
away from precious user state to the volatile kernel context. We know
the engine is idle at this point, and the pulse is non-preemptable, so
this provides us with a good measurement of the round trip time. A
secondary effect is that by installing an interrupt onto the pulse, we
can flush the engine immediately upon completion, curtailing the
background flush and entering powersaving immediately.

v2: Manage pm wakerefs to avoid too early module unload
flush the barrier while we are not watching.

References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    | 36 +++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 11 ++++++
 drivers/gpu/drm/i915/gt/intel_gt_pm.h        |  5 +++
 drivers/gpu/drm/i915/intel_wakeref.h         | 12 +++++++
 5 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b9613d044393..2d11db13dc89 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -334,6 +334,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 	/* Nothing to do here, execute in order of dependencies */
 	engine->schedule = NULL;
 
+	ewma_delay_init(&engine->delay);
 	seqlock_init(&engine->stats.lock);
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
@@ -1477,6 +1478,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		drm_printf(m, "*** WEDGED ***\n");
 
 	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
+	drm_printf(m, "\tDelay: %luus\n", ewma_delay_read(&engine->delay));
 
 	rcu_read_lock();
 	rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 3c0f490ff2c7..e79b14e4599b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -73,6 +73,24 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
 
+struct duration_cb {
+	struct dma_fence_cb cb;
+	ktime_t emitted;
+};
+
+static void duration_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
+{
+	struct duration_cb *dcb = container_of(cb, typeof(*dcb), cb);
+	struct intel_engine_cs *engine = to_request(fence)->engine;
+
+	ewma_delay_add(&engine->delay,
+		       ktime_us_delta(ktime_get(), dcb->emitted));
+
+	/* Kick retire for quicker powersaving (soft-rc6). */
+	mod_delayed_work(system_wq, &engine->gt->requests.retire_work, 0);
+	intel_gt_pm_put(engine->gt);
+}
+
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq;
@@ -114,7 +132,23 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 
 	/* Install ourselves as a preemption barrier */
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
-	__i915_request_commit(rq);
+	if (likely(!__i915_request_commit(rq))) { /* engine should be idle! */
+		struct duration_cb *dcb;
+
+		BUILD_BUG_ON(sizeof(*dcb) > sizeof(rq->submitq));
+		dcb = (struct duration_cb *)&rq->submitq;
+
+		/*
+		 * Use an interrupt for precise measurement of duration,
+		 * otherwise we rely on someone else retiring all the requests
+		 * which may delay the signaling (i.e. we will likely wait
+		 * until the background request retirement running every
+		 * second or two).
+		 */
+		__intel_gt_pm_get(rq->engine->gt);
+		dma_fence_add_callback(&rq->fence, &dcb->cb, duration_cb);
+		dcb->emitted = ktime_get();
+	}
 
 	/* Release our exclusive hold on the engine */
 	__intel_wakeref_defer_park(&engine->wakeref);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 758f0e8ec672..1de121583482 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -7,6 +7,7 @@
 #ifndef __INTEL_ENGINE_TYPES__
 #define __INTEL_ENGINE_TYPES__
 
+#include <linux/average.h>
 #include <linux/hashtable.h>
 #include <linux/irq_work.h>
 #include <linux/kref.h>
@@ -119,6 +120,9 @@ enum intel_engine_id {
 #define INVALID_ENGINE ((enum intel_engine_id)-1)
 };
 
+/* A simple estimator for the round-trip responsive time of an engine */
+DECLARE_EWMA(delay, 6, 4)
+
 struct st_preempt_hang {
 	struct completion completion;
 	unsigned int count;
@@ -316,6 +320,13 @@ struct intel_engine_cs {
 		struct intel_timeline *timeline;
 	} legacy;
 
+	/*
+	 * We track the average duration of the idle pulse on parking the
+	 * engine to keep an estimate of the how the fast the engine is
+	 * under ideal conditions.
+	 */
+	struct ewma_delay delay;
+
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
 	 * heavyweight seqno dance, we delegate the task (of being the
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index b3e17399be9b..c7271b3acde3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -22,6 +22,11 @@ static inline void intel_gt_pm_get(struct intel_gt *gt)
 	intel_wakeref_get(&gt->wakeref);
 }
 
+static inline void __intel_gt_pm_get(struct intel_gt *gt)
+{
+	__intel_wakeref_get(&gt->wakeref);
+}
+
 static inline bool intel_gt_pm_get_if_awake(struct intel_gt *gt)
 {
 	return intel_wakeref_get_if_active(&gt->wakeref);
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index 5f0c972a80fb..2603a177518a 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -84,6 +84,18 @@ intel_wakeref_get(struct intel_wakeref *wf)
 	return 0;
 }
 
+/**
+ * __intel_wakeref_get: Acquire the wakeref, again
+ * @wf: the wakeref
+ *
+ * Acquire a hold on an already active wakeref.
+ */
+static inline void
+__intel_wakeref_get(struct intel_wakeref *wf)
+{
+	atomic_inc(&wf->count);
+}
+
 /**
  * intel_wakeref_get_if_in_use: Acquire the wakeref
  * @wf: the wakeref
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 16:09   ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2019-11-15 16:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/11/2019 15:08, Chris Wilson wrote:
> We need to wait until the timer object is marked as deactivated before
> unloading, so follow up our gentle cancel_delayed_work() with the
> synchronous variant to ensure it is flushed off a remote cpu before we
> mark the memory as freed.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111994
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt.c          | 1 +
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 6 ++++++
>   drivers/gpu/drm/i915/gt/intel_gt_requests.h | 1 +
>   3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index c39b21c8d328..b5a9b87e4ec9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -397,6 +397,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
>   void intel_gt_driver_late_release(struct intel_gt *gt)
>   {
>   	intel_uc_driver_late_release(&gt->uc);
> +	intel_gt_fini_requests(gt);
>   	intel_gt_fini_reset(gt);
>   	intel_gt_fini_timelines(gt);
>   }
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index ccbddddbbd52..a79e6efb31a2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -130,3 +130,9 @@ void intel_gt_unpark_requests(struct intel_gt *gt)
>   	schedule_delayed_work(&gt->requests.retire_work,
>   			      round_jiffies_up_relative(HZ));
>   }
> +
> +void intel_gt_fini_requests(struct intel_gt *gt)
> +{
> +	/* Wait until the work is marked as finished before unloading! */
> +	cancel_delayed_work_sync(&gt->requests.retire_work);
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> index bd31cbce47e0..fde546424c63 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> @@ -20,5 +20,6 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
>   void intel_gt_init_requests(struct intel_gt *gt);
>   void intel_gt_park_requests(struct intel_gt *gt);
>   void intel_gt_unpark_requests(struct intel_gt *gt);
> +void intel_gt_fini_requests(struct intel_gt *gt);
>   
>   #endif /* INTEL_GT_REQUESTS_H */
> 

Sounds plausible. Verified fix or speculative?

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 16:09   ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2019-11-15 16:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/11/2019 15:08, Chris Wilson wrote:
> We need to wait until the timer object is marked as deactivated before
> unloading, so follow up our gentle cancel_delayed_work() with the
> synchronous variant to ensure it is flushed off a remote cpu before we
> mark the memory as freed.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111994
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt.c          | 1 +
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 6 ++++++
>   drivers/gpu/drm/i915/gt/intel_gt_requests.h | 1 +
>   3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index c39b21c8d328..b5a9b87e4ec9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -397,6 +397,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
>   void intel_gt_driver_late_release(struct intel_gt *gt)
>   {
>   	intel_uc_driver_late_release(&gt->uc);
> +	intel_gt_fini_requests(gt);
>   	intel_gt_fini_reset(gt);
>   	intel_gt_fini_timelines(gt);
>   }
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index ccbddddbbd52..a79e6efb31a2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -130,3 +130,9 @@ void intel_gt_unpark_requests(struct intel_gt *gt)
>   	schedule_delayed_work(&gt->requests.retire_work,
>   			      round_jiffies_up_relative(HZ));
>   }
> +
> +void intel_gt_fini_requests(struct intel_gt *gt)
> +{
> +	/* Wait until the work is marked as finished before unloading! */
> +	cancel_delayed_work_sync(&gt->requests.retire_work);
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> index bd31cbce47e0..fde546424c63 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> @@ -20,5 +20,6 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
>   void intel_gt_init_requests(struct intel_gt *gt);
>   void intel_gt_park_requests(struct intel_gt *gt);
>   void intel_gt_unpark_requests(struct intel_gt *gt);
> +void intel_gt_fini_requests(struct intel_gt *gt);
>   
>   #endif /* INTEL_GT_REQUESTS_H */
> 

Sounds plausible. Verified fix or speculative?

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] drm/i915/selftests: Disable heartbeat around context barrier tests
@ 2019-11-15 16:15     ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2019-11-15 16:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/11/2019 15:08, Chris Wilson wrote:
> As the heartbeat has the effect of flushing context barriers, this
> interferes with the context barrier tests that are trying to observe
> them directly. Disable the heartbeat so that the barriers are as
> predictable as the test demands.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/selftest_context.c | 44 +++++++++++++++++++---
>   1 file changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index 14ba6ceb9177..3586af636304 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -5,6 +5,7 @@
>    */
>   
>   #include "i915_selftest.h"
> +#include "intel_engine_heartbeat.h"
>   #include "intel_engine_pm.h"
>   #include "intel_gt.h"
>   
> @@ -200,6 +201,7 @@ static int live_context_size(void *arg)
>   static int __live_active_context(struct intel_engine_cs *engine,
>   				 struct i915_gem_context *fixme)
>   {
> +	unsigned long saved_heartbeat;
>   	struct intel_context *ce;
>   	int pass;
>   	int err;
> @@ -227,36 +229,50 @@ static int __live_active_context(struct intel_engine_cs *engine,
>   	if (IS_ERR(ce))
>   		return PTR_ERR(ce);
>   
> +	saved_heartbeat = engine->props.heartbeat_interval_ms;
> +	engine->props.heartbeat_interval_ms = 0;
> +
>   	for (pass = 0; pass <= 2; pass++) {
>   		struct i915_request *rq;
>   
> +		intel_engine_pm_get(engine);
> +
>   		rq = intel_context_create_request(ce);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
> -			goto err;
> +			goto out_engine;
>   		}
>   
>   		err = request_sync(rq);
>   		if (err)
> -			goto err;
> +			goto out_engine;
>   
>   		/* Context will be kept active until after an idle-barrier. */
>   		if (i915_active_is_idle(&ce->active)) {
>   			pr_err("context is not active; expected idle-barrier (%s pass %d)\n",
>   			       engine->name, pass);
>   			err = -EINVAL;
> -			goto err;
> +			goto out_engine;
>   		}
>   
>   		if (!intel_engine_pm_is_awake(engine)) {
>   			pr_err("%s is asleep before idle-barrier\n",
>   			       engine->name);
>   			err = -EINVAL;
> -			goto err;
> +			goto out_engine;
>   		}
> +
> +out_engine:
> +		intel_engine_pm_put(engine);
> +		if (err)
> +			goto err;
>   	}
>   
>   	/* Now make sure our idle-barriers are flushed */
> +	err = intel_engine_flush_barriers(engine);
> +	if (err)
> +		goto err;
> +
>   	err = context_sync(engine->kernel_context);
>   	if (err)
>   		goto err;
> @@ -270,8 +286,9 @@ static int __live_active_context(struct intel_engine_cs *engine,
>   		struct drm_printer p = drm_debug_printer(__func__);
>   
>   		intel_engine_dump(engine, &p,
> -				  "%s is still awake after idle-barriers\n",
> -				  engine->name);
> +				  "%s is still awake:%d after idle-barriers\n",
> +				  engine->name,
> +				  atomic_read(&engine->wakeref.count));
>   		GEM_TRACE_DUMP();
>   
>   		err = -EINVAL;
> @@ -279,6 +296,7 @@ static int __live_active_context(struct intel_engine_cs *engine,
>   	}
>   
>   err:
> +	engine->props.heartbeat_interval_ms = saved_heartbeat;
>   	intel_context_put(ce);
>   	return err;
>   }
> @@ -349,6 +367,7 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   				 struct i915_gem_context *fixme)
>   {
>   	struct intel_context *local, *remote;
> +	unsigned long saved_heartbeat;
>   	int pass;
>   	int err;
>   
> @@ -360,6 +379,12 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   	 * clobber the idle-barrier.
>   	 */
>   
> +	if (intel_engine_pm_is_awake(engine)) {
> +		pr_err("%s is awake before starting %s!\n",
> +		       engine->name, __func__);
> +		return -EINVAL;
> +	}
> +
>   	remote = intel_context_create(fixme, engine);
>   	if (IS_ERR(remote))
>   		return PTR_ERR(remote);
> @@ -370,6 +395,10 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   		goto err_remote;
>   	}
>   
> +	saved_heartbeat = engine->props.heartbeat_interval_ms;
> +	engine->props.heartbeat_interval_ms = 0;
> +	intel_engine_pm_get(engine);
> +
>   	for (pass = 0; pass <= 2; pass++) {
>   		err = __remote_sync(local, remote);
>   		if (err)
> @@ -387,6 +416,9 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   		}
>   	}
>   
> +	intel_engine_pm_put(engine);
> +	engine->props.heartbeat_interval_ms = saved_heartbeat;
> +
>   	intel_context_put(local);
>   err_remote:
>   	intel_context_put(remote);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915/selftests: Disable heartbeat around context barrier tests
@ 2019-11-15 16:15     ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2019-11-15 16:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 15/11/2019 15:08, Chris Wilson wrote:
> As the heartbeat has the effect of flushing context barriers, this
> interferes with the context barrier tests that are trying to observe
> them directly. Disable the heartbeat so that the barriers are as
> predictable as the test demands.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/selftest_context.c | 44 +++++++++++++++++++---
>   1 file changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index 14ba6ceb9177..3586af636304 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -5,6 +5,7 @@
>    */
>   
>   #include "i915_selftest.h"
> +#include "intel_engine_heartbeat.h"
>   #include "intel_engine_pm.h"
>   #include "intel_gt.h"
>   
> @@ -200,6 +201,7 @@ static int live_context_size(void *arg)
>   static int __live_active_context(struct intel_engine_cs *engine,
>   				 struct i915_gem_context *fixme)
>   {
> +	unsigned long saved_heartbeat;
>   	struct intel_context *ce;
>   	int pass;
>   	int err;
> @@ -227,36 +229,50 @@ static int __live_active_context(struct intel_engine_cs *engine,
>   	if (IS_ERR(ce))
>   		return PTR_ERR(ce);
>   
> +	saved_heartbeat = engine->props.heartbeat_interval_ms;
> +	engine->props.heartbeat_interval_ms = 0;
> +
>   	for (pass = 0; pass <= 2; pass++) {
>   		struct i915_request *rq;
>   
> +		intel_engine_pm_get(engine);
> +
>   		rq = intel_context_create_request(ce);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
> -			goto err;
> +			goto out_engine;
>   		}
>   
>   		err = request_sync(rq);
>   		if (err)
> -			goto err;
> +			goto out_engine;
>   
>   		/* Context will be kept active until after an idle-barrier. */
>   		if (i915_active_is_idle(&ce->active)) {
>   			pr_err("context is not active; expected idle-barrier (%s pass %d)\n",
>   			       engine->name, pass);
>   			err = -EINVAL;
> -			goto err;
> +			goto out_engine;
>   		}
>   
>   		if (!intel_engine_pm_is_awake(engine)) {
>   			pr_err("%s is asleep before idle-barrier\n",
>   			       engine->name);
>   			err = -EINVAL;
> -			goto err;
> +			goto out_engine;
>   		}
> +
> +out_engine:
> +		intel_engine_pm_put(engine);
> +		if (err)
> +			goto err;
>   	}
>   
>   	/* Now make sure our idle-barriers are flushed */
> +	err = intel_engine_flush_barriers(engine);
> +	if (err)
> +		goto err;
> +
>   	err = context_sync(engine->kernel_context);
>   	if (err)
>   		goto err;
> @@ -270,8 +286,9 @@ static int __live_active_context(struct intel_engine_cs *engine,
>   		struct drm_printer p = drm_debug_printer(__func__);
>   
>   		intel_engine_dump(engine, &p,
> -				  "%s is still awake after idle-barriers\n",
> -				  engine->name);
> +				  "%s is still awake:%d after idle-barriers\n",
> +				  engine->name,
> +				  atomic_read(&engine->wakeref.count));
>   		GEM_TRACE_DUMP();
>   
>   		err = -EINVAL;
> @@ -279,6 +296,7 @@ static int __live_active_context(struct intel_engine_cs *engine,
>   	}
>   
>   err:
> +	engine->props.heartbeat_interval_ms = saved_heartbeat;
>   	intel_context_put(ce);
>   	return err;
>   }
> @@ -349,6 +367,7 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   				 struct i915_gem_context *fixme)
>   {
>   	struct intel_context *local, *remote;
> +	unsigned long saved_heartbeat;
>   	int pass;
>   	int err;
>   
> @@ -360,6 +379,12 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   	 * clobber the idle-barrier.
>   	 */
>   
> +	if (intel_engine_pm_is_awake(engine)) {
> +		pr_err("%s is awake before starting %s!\n",
> +		       engine->name, __func__);
> +		return -EINVAL;
> +	}
> +
>   	remote = intel_context_create(fixme, engine);
>   	if (IS_ERR(remote))
>   		return PTR_ERR(remote);
> @@ -370,6 +395,10 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   		goto err_remote;
>   	}
>   
> +	saved_heartbeat = engine->props.heartbeat_interval_ms;
> +	engine->props.heartbeat_interval_ms = 0;
> +	intel_engine_pm_get(engine);
> +
>   	for (pass = 0; pass <= 2; pass++) {
>   		err = __remote_sync(local, remote);
>   		if (err)
> @@ -387,6 +416,9 @@ static int __live_remote_context(struct intel_engine_cs *engine,
>   		}
>   	}
>   
> +	intel_engine_pm_put(engine);
> +	engine->props.heartbeat_interval_ms = saved_heartbeat;
> +
>   	intel_context_put(local);
>   err_remote:
>   	intel_context_put(remote);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 16:18     ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 16:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-15 16:09:00)
> 
> On 15/11/2019 15:08, Chris Wilson wrote:
> > We need to wait until the timer object is marked as deactivated before
> > unloading, so follow up our gentle cancel_delayed_work() with the
> > synchronous variant to ensure it is flushed off a remote cpu before we
> > mark the memory as freed.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111994
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt.c          | 1 +
> >   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 6 ++++++
> >   drivers/gpu/drm/i915/gt/intel_gt_requests.h | 1 +
> >   3 files changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> > index c39b21c8d328..b5a9b87e4ec9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > @@ -397,6 +397,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
> >   void intel_gt_driver_late_release(struct intel_gt *gt)
> >   {
> >       intel_uc_driver_late_release(&gt->uc);
> > +     intel_gt_fini_requests(gt);
> >       intel_gt_fini_reset(gt);
> >       intel_gt_fini_timelines(gt);
> >   }
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > index ccbddddbbd52..a79e6efb31a2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > @@ -130,3 +130,9 @@ void intel_gt_unpark_requests(struct intel_gt *gt)
> >       schedule_delayed_work(&gt->requests.retire_work,
> >                             round_jiffies_up_relative(HZ));
> >   }
> > +
> > +void intel_gt_fini_requests(struct intel_gt *gt)
> > +{
> > +     /* Wait until the work is marked as finished before unloading! */
> > +     cancel_delayed_work_sync(&gt->requests.retire_work);
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> > index bd31cbce47e0..fde546424c63 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> > @@ -20,5 +20,6 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
> >   void intel_gt_init_requests(struct intel_gt *gt);
> >   void intel_gt_park_requests(struct intel_gt *gt);
> >   void intel_gt_unpark_requests(struct intel_gt *gt);
> > +void intel_gt_fini_requests(struct intel_gt *gt);
> >   
> >   #endif /* INTEL_GT_REQUESTS_H */
> > 
> 
> Sounds plausible. Verified fix or speculative?

Only verified in the sense that it came and went away again.
It's timing dependent and all the debugobject says is delayed_work of
which we free a few hundred at module unload. But I suspect this is the
only delayed work that doesn't have a sync on shutdown at present.

It sounded plausible, albeit unlikely, to me that we could just about do
the kfree(i915) while the worker was asleep on another cpu before its
debugobject could be marked as deactivated.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 16:18     ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2019-11-15 16:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-15 16:09:00)
> 
> On 15/11/2019 15:08, Chris Wilson wrote:
> > We need to wait until the timer object is marked as deactivated before
> > unloading, so follow up our gentle cancel_delayed_work() with the
> > synchronous variant to ensure it is flushed off a remote cpu before we
> > mark the memory as freed.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111994
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt.c          | 1 +
> >   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 6 ++++++
> >   drivers/gpu/drm/i915/gt/intel_gt_requests.h | 1 +
> >   3 files changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> > index c39b21c8d328..b5a9b87e4ec9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > @@ -397,6 +397,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
> >   void intel_gt_driver_late_release(struct intel_gt *gt)
> >   {
> >       intel_uc_driver_late_release(&gt->uc);
> > +     intel_gt_fini_requests(gt);
> >       intel_gt_fini_reset(gt);
> >       intel_gt_fini_timelines(gt);
> >   }
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > index ccbddddbbd52..a79e6efb31a2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > @@ -130,3 +130,9 @@ void intel_gt_unpark_requests(struct intel_gt *gt)
> >       schedule_delayed_work(&gt->requests.retire_work,
> >                             round_jiffies_up_relative(HZ));
> >   }
> > +
> > +void intel_gt_fini_requests(struct intel_gt *gt)
> > +{
> > +     /* Wait until the work is marked as finished before unloading! */
> > +     cancel_delayed_work_sync(&gt->requests.retire_work);
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> > index bd31cbce47e0..fde546424c63 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> > @@ -20,5 +20,6 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
> >   void intel_gt_init_requests(struct intel_gt *gt);
> >   void intel_gt_park_requests(struct intel_gt *gt);
> >   void intel_gt_unpark_requests(struct intel_gt *gt);
> > +void intel_gt_fini_requests(struct intel_gt *gt);
> >   
> >   #endif /* INTEL_GT_REQUESTS_H */
> > 
> 
> Sounds plausible. Verified fix or speculative?

Only verified in the sense that it came and went away again.
It's timing dependent and all the debugobject says is delayed_work of
which we free a few hundred at module unload. But I suspect this is the
only delayed work that doesn't have a sync on shutdown at present.

It sounded plausible, albeit unlikely, to me that we could just about do
the kfree(i915) while the worker was asleep on another cpu before its
debugobject could be marked as deactivated.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 17:49   ` Patchwork
  0 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2019-11-15 17:49 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
URL   : https://patchwork.freedesktop.org/series/69536/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
94cbdef4257c drm/i915/selftests: Disable heartbeat around context barrier tests
d547d16311c9 drm/i915/gt: Track engine round-trip times
-:14: WARNING:TYPO_SPELLING: 'preemptable' may be misspelled - perhaps 'preemptible'?
#14: 
the engine is idle at this point, and the pulse is non-preemptable, so

-:23: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")'
#23: 
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")

total: 1 errors, 1 warnings, 0 checks, 120 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 17:49   ` Patchwork
  0 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2019-11-15 17:49 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
URL   : https://patchwork.freedesktop.org/series/69536/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
94cbdef4257c drm/i915/selftests: Disable heartbeat around context barrier tests
d547d16311c9 drm/i915/gt: Track engine round-trip times
-:14: WARNING:TYPO_SPELLING: 'preemptable' may be misspelled - perhaps 'preemptible'?
#14: 
the engine is idle at this point, and the pulse is non-preemptable, so

-:23: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")'
#23: 
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")

total: 1 errors, 1 warnings, 0 checks, 120 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 18:11   ` Patchwork
  0 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2019-11-15 18:11 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
URL   : https://patchwork.freedesktop.org/series/69536/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7353 -> Patchwork_15286
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_15286 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_15286, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_15286:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live_blt:
    - fi-bsw-n3050:       [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-bsw-n3050/igt@i915_selftest@live_blt.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-bsw-n3050/igt@i915_selftest@live_blt.html

  * igt@i915_selftest@live_gt_contexts:
    - fi-pnv-d510:        [PASS][3] -> [DMESG-FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-pnv-d510/igt@i915_selftest@live_gt_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-pnv-d510/igt@i915_selftest@live_gt_contexts.html
    - fi-byt-n2820:       [PASS][5] -> [DMESG-FAIL][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-byt-n2820/igt@i915_selftest@live_gt_contexts.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-byt-n2820/igt@i915_selftest@live_gt_contexts.html
    - fi-cfl-guc:         NOTRUN -> [DMESG-FAIL][7]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-cfl-guc/igt@i915_selftest@live_gt_contexts.html
    - fi-bdw-5557u:       [PASS][8] -> [DMESG-FAIL][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-bdw-5557u/igt@i915_selftest@live_gt_contexts.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-bdw-5557u/igt@i915_selftest@live_gt_contexts.html
    - fi-kbl-guc:         [PASS][10] -> [DMESG-FAIL][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-kbl-guc/igt@i915_selftest@live_gt_contexts.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-kbl-guc/igt@i915_selftest@live_gt_contexts.html

  * igt@runner@aborted:
    - fi-pnv-d510:        NOTRUN -> [FAIL][12]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-pnv-d510/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_15286 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-icl-y:           [PASS][13] -> [INCOMPLETE][14] ([fdo#107713])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-icl-y/igt@i915_module_load@reload-with-fault-injection.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-icl-y/igt@i915_module_load@reload-with-fault-injection.html

  * igt@kms_busy@basic-flip-pipe-b:
    - fi-skl-6770hq:      [PASS][15] -> [DMESG-WARN][16] ([fdo#105541])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][17] -> [FAIL][18] ([fdo#111045] / [fdo#111096])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-skl-6770hq:      [PASS][19] -> [WARN][20] ([fdo#112252])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      [FAIL][21] ([fdo#108511]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live_hangcheck:
    - fi-hsw-4770r:       [DMESG-FAIL][23] ([fdo#111991]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511
  [fdo#111045]: https://bugs.freedesktop.org/show_bug.cgi?id=111045
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#111991]: https://bugs.freedesktop.org/show_bug.cgi?id=111991
  [fdo#112252]: https://bugs.freedesktop.org/show_bug.cgi?id=112252
  [fdo#112260]: https://bugs.freedesktop.org/show_bug.cgi?id=112260
  [fdo#112298]: https://bugs.freedesktop.org/show_bug.cgi?id=112298


Participating hosts (51 -> 45)
------------------------------

  Additional (1): fi-cfl-guc 
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7353 -> Patchwork_15286

  CI-20190529: 20190529
  CI_DRM_7353: 18d4d81004d8407cd4dbbebdafb2ccd77eb52872 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5288: ff4551e36cd8e573ceb1e450d17a12e3298dc04c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15286: d547d16311c94ff70c417109431d69c8eeed3516 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d547d16311c9 drm/i915/gt: Track engine round-trip times
94cbdef4257c drm/i915/selftests: Disable heartbeat around context barrier tests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
@ 2019-11-15 18:11   ` Patchwork
  0 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2019-11-15 18:11 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915/gt: Flush retire.work timer object on unload
URL   : https://patchwork.freedesktop.org/series/69536/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7353 -> Patchwork_15286
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_15286 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_15286, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_15286:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live_blt:
    - fi-bsw-n3050:       [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-bsw-n3050/igt@i915_selftest@live_blt.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-bsw-n3050/igt@i915_selftest@live_blt.html

  * igt@i915_selftest@live_gt_contexts:
    - fi-pnv-d510:        [PASS][3] -> [DMESG-FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-pnv-d510/igt@i915_selftest@live_gt_contexts.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-pnv-d510/igt@i915_selftest@live_gt_contexts.html
    - fi-byt-n2820:       [PASS][5] -> [DMESG-FAIL][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-byt-n2820/igt@i915_selftest@live_gt_contexts.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-byt-n2820/igt@i915_selftest@live_gt_contexts.html
    - fi-cfl-guc:         NOTRUN -> [DMESG-FAIL][7]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-cfl-guc/igt@i915_selftest@live_gt_contexts.html
    - fi-bdw-5557u:       [PASS][8] -> [DMESG-FAIL][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-bdw-5557u/igt@i915_selftest@live_gt_contexts.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-bdw-5557u/igt@i915_selftest@live_gt_contexts.html
    - fi-kbl-guc:         [PASS][10] -> [DMESG-FAIL][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-kbl-guc/igt@i915_selftest@live_gt_contexts.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-kbl-guc/igt@i915_selftest@live_gt_contexts.html

  * igt@runner@aborted:
    - fi-pnv-d510:        NOTRUN -> [FAIL][12]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-pnv-d510/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_15286 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-icl-y:           [PASS][13] -> [INCOMPLETE][14] ([fdo#107713])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-icl-y/igt@i915_module_load@reload-with-fault-injection.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-icl-y/igt@i915_module_load@reload-with-fault-injection.html

  * igt@kms_busy@basic-flip-pipe-b:
    - fi-skl-6770hq:      [PASS][15] -> [DMESG-WARN][16] ([fdo#105541])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][17] -> [FAIL][18] ([fdo#111045] / [fdo#111096])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-skl-6770hq:      [PASS][19] -> [WARN][20] ([fdo#112252])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      [FAIL][21] ([fdo#108511]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live_hangcheck:
    - fi-hsw-4770r:       [DMESG-FAIL][23] ([fdo#111991]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7353/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511
  [fdo#111045]: https://bugs.freedesktop.org/show_bug.cgi?id=111045
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#111991]: https://bugs.freedesktop.org/show_bug.cgi?id=111991
  [fdo#112252]: https://bugs.freedesktop.org/show_bug.cgi?id=112252
  [fdo#112260]: https://bugs.freedesktop.org/show_bug.cgi?id=112260
  [fdo#112298]: https://bugs.freedesktop.org/show_bug.cgi?id=112298


Participating hosts (51 -> 45)
------------------------------

  Additional (1): fi-cfl-guc 
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7353 -> Patchwork_15286

  CI-20190529: 20190529
  CI_DRM_7353: 18d4d81004d8407cd4dbbebdafb2ccd77eb52872 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5288: ff4551e36cd8e573ceb1e450d17a12e3298dc04c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15286: d547d16311c94ff70c417109431d69c8eeed3516 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d547d16311c9 drm/i915/gt: Track engine round-trip times
94cbdef4257c drm/i915/selftests: Disable heartbeat around context barrier tests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15286/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-11-15 18:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-15 15:08 [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload Chris Wilson
2019-11-15 15:08 ` [Intel-gfx] " Chris Wilson
2019-11-15 15:08 ` [PATCH 2/3] drm/i915/selftests: Disable heartbeat around context barrier tests Chris Wilson
2019-11-15 15:08   ` [Intel-gfx] " Chris Wilson
2019-11-15 16:15   ` Tvrtko Ursulin
2019-11-15 16:15     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-15 15:08 ` [PATCH 3/3] drm/i915/gt: Track engine round-trip times Chris Wilson
2019-11-15 15:08   ` [Intel-gfx] " Chris Wilson
2019-11-15 16:09 ` [PATCH 1/3] drm/i915/gt: Flush retire.work timer object on unload Tvrtko Ursulin
2019-11-15 16:09   ` [Intel-gfx] " Tvrtko Ursulin
2019-11-15 16:18   ` Chris Wilson
2019-11-15 16:18     ` [Intel-gfx] " Chris Wilson
2019-11-15 17:49 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] " Patchwork
2019-11-15 17:49   ` [Intel-gfx] " Patchwork
2019-11-15 18:11 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-11-15 18:11   ` [Intel-gfx] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.