* Fast soft-rc6
@ 2019-11-18 23:02 ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

In my very simple testing of scrolling through firefox, this brings us
back into line with HW rc6 energy usage, a substantial improvement over
current -tip.
-Chris


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/19] drm/i915/selftests: Force bonded submission to overlap
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Bonded request submission is designed to allow requests to execute in
parallel as laid out by the user. If the master request is already
finished before its bonded pair is submitted, the pair was not destined
to run in parallel and we lose the information about the master engine
with which to dictate selection of the secondary. If the second request
was required to run on a particular engine in a virtual set, that should
have been specified, rather than left to the whims of a random
unconnected request!

In the selftest, I made the mistake of not ensuring the master would
overlap with its bonded pairs, meaning that it could indeed complete
before we submitted the bonds. Those bonds were then free to select any
available engine in their virtual set, and not the one expected by the
test.
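
As a plain C sketch of the idea (a hypothetical model, not driver code):
the bonded sibling can only be pinned to the master's engine while the
master is still in flight, which is exactly what the old test failed to
guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the selftest bug: the names and the ENGINE_ANY
 * fallback are illustrative, not the i915 API. */
enum { ENGINE_ANY = -1 };

struct request {
	int engine;	/* physical engine the request runs on */
	bool completed;
};

/*
 * Select the engine for a bonded sibling: the master's engine can only
 * dictate the choice while the master is still executing; afterwards
 * any engine in the virtual set is fair game.
 */
static int select_bond_engine(const struct request *master)
{
	return master->completed ? ENGINE_ANY : master->engine;
}
```

Keeping the master spinning (a spin batch instead of an empty request)
until the bonds are submitted makes the selection deterministic.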

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 16ebe4d2308e..f3b0610d1f10 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -3036,15 +3036,21 @@ static int bond_virtual_engine(struct intel_gt *gt,
 	struct i915_gem_context *ctx;
 	struct i915_request *rq[16];
 	enum intel_engine_id id;
+	struct igt_spinner spin;
 	unsigned long n;
 	int err;
 
 	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
 
-	ctx = kernel_context(gt->i915);
-	if (!ctx)
+	if (igt_spinner_init(&spin, gt))
 		return -ENOMEM;
 
+	ctx = kernel_context(gt->i915);
+	if (!ctx) {
+		err = -ENOMEM;
+		goto err_spin;
+	}
+
 	err = 0;
 	rq[0] = ERR_PTR(-ENOMEM);
 	for_each_engine(master, gt, id) {
@@ -3055,7 +3061,7 @@ static int bond_virtual_engine(struct intel_gt *gt,
 
 		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
 
-		rq[0] = igt_request_alloc(ctx, master);
+		rq[0] = spinner_create_request(&spin, ctx, master, MI_NOOP);
 		if (IS_ERR(rq[0])) {
 			err = PTR_ERR(rq[0]);
 			goto out;
@@ -3068,10 +3074,17 @@ static int bond_virtual_engine(struct intel_gt *gt,
 							       &fence,
 							       GFP_KERNEL);
 		}
+
 		i915_request_add(rq[0]);
 		if (err < 0)
 			goto out;
 
+		if (!(flags & BOND_SCHEDULE) &&
+		    !igt_wait_for_spinner(&spin, rq[0])) {
+			err = -EIO;
+			goto out;
+		}
+
 		for (n = 0; n < nsibling; n++) {
 			struct intel_context *ve;
 
@@ -3119,6 +3132,8 @@ static int bond_virtual_engine(struct intel_gt *gt,
 			}
 		}
 		onstack_fence_fini(&fence);
+		intel_engine_flush_submission(master);
+		igt_spinner_end(&spin);
 
 		if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
 			pr_err("Master request did not execute (on %s)!\n",
@@ -3156,6 +3171,8 @@ static int bond_virtual_engine(struct intel_gt *gt,
 		err = -EIO;
 
 	kernel_context_close(ctx);
+err_spin:
+	igt_spinner_fini(&spin);
 	return err;
 }
 
-- 
2.24.0



* [PATCH 02/19] drm/i915/gem: Manually dump the debug trace on GEM_BUG_ON
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Since igt now defaults to not enabling ftrace-on-oops, we need to
manually invoke GEM_TRACE_DUMP() to see the debug log prior to a
GEM_BUG_ON panicking.
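
For illustration, a user-space analogue of the macro change (hypothetical
names: MY_BUG_ON and TRACE_DUMP stand in for GEM_BUG_ON and
GEM_TRACE_DUMP): the assertion macro dumps its own debug trace before
aborting, rather than relying on an external ftrace-on-oops hook.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for GEM_TRACE_DUMP(): flush whatever debug log we keep. */
#define TRACE_DUMP() fprintf(stderr, "dumping debug trace\n")

/* Stand-in for GEM_BUG_ON(): report, dump the trace, then die. */
#define MY_BUG_ON(condition) do { \
		if (condition) { \
			fprintf(stderr, "%s:%d BUG_ON(%s)\n", \
				__func__, __LINE__, #condition); \
			TRACE_DUMP(); \
			abort(); \
		} \
	} while (0)
```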

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index c7985767296a..1753c84d6c0d 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -30,6 +30,8 @@
 
 #include <drm/drm_drv.h>
 
+#include "i915_utils.h"
+
 struct drm_i915_private;
 
 #ifdef CONFIG_DRM_I915_DEBUG_GEM
@@ -39,6 +41,7 @@ struct drm_i915_private;
 #define GEM_BUG_ON(condition) do { if (unlikely((condition))) {	\
 		GEM_TRACE_ERR("%s:%d GEM_BUG_ON(%s)\n", \
 			      __func__, __LINE__, __stringify(condition)); \
+		GEM_TRACE_DUMP(); \
 		BUG(); \
 		} \
 	} while(0)
-- 
2.24.0



* [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matthew Auld

The general concept was that intel_timeline.active_count was locked by
the intel_timeline.mutex. The exception was for power management, where
the engine->kernel_context->timeline could be manipulated under the
global wakeref.mutex.

This was quite solid, as we always manipulated the timeline only while
we held an engine wakeref.

And then we started retiring requests outside of struct_mutex, only
using the timelines.active_list and the timeline->mutex. There we
started manipulating intel_timeline.active_count outside of an engine
wakeref, and so introduced a race between __engine_park() and
intel_gt_retire_requests(), a race that could result in the
engine->kernel_context not being added to the active timelines and so
losing requests, which caused us to keep the system permanently powered
up [and unloadable].

The race would be easy to close if we could take the engine wakeref for
the timeline before we retire -- except timelines are not bound to any
engine and so we would need to keep all active engines awake. The
alternative is to guard intel_timeline_enter/intel_timeline_exit for use
outside of the timeline->mutex.
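
The new rule can be sketched in portable C11 (a hypothetical model:
add_unless() stands in for the kernel's atomic_add_unless(), an
atomic_flag spinlock stands in for timelines->lock, and on_list stands
in for membership of the active_list). The point is that the 0 <-> 1
transitions re-check the count under the lock, so racing callers agree
on who performs the list update.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct timeline {
	atomic_int active_count;
	atomic_flag lock;	/* stands in for timelines->lock */
	bool on_list;		/* stands in for being on active_list */
};

/* atomic_add_unless(v, a, u): add a to *v unless the value is u. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	do {
		if (old == u)
			return false;
	} while (!atomic_compare_exchange_weak(v, &old, old + a));
	return true;
}

static void timeline_enter(struct timeline *tl)
{
	if (add_unless(&tl->active_count, 1, 0))
		return;	/* already active: just took another reference */

	while (atomic_flag_test_and_set(&tl->lock))
		;	/* spin_lock_irqsave(&timelines->lock, flags) */
	if (atomic_fetch_add(&tl->active_count, 1) == 0)
		tl->on_list = true;	/* list_add(&tl->link, ...) */
	atomic_flag_clear(&tl->lock);
}

static void timeline_exit(struct timeline *tl)
{
	if (add_unless(&tl->active_count, -1, 1))
		return;	/* still active after dropping our reference */

	while (atomic_flag_test_and_set(&tl->lock))
		;
	if (atomic_fetch_sub(&tl->active_count, 1) == 1)
		tl->on_list = false;	/* list_del(&tl->link) */
	atomic_flag_clear(&tl->lock);
}
```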

Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
 drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index a79e6efb31a2..7559d6373f49 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 			continue;
 
 		intel_timeline_get(tl);
-		GEM_BUG_ON(!tl->active_count);
-		tl->active_count++; /* pin the list element */
+		GEM_BUG_ON(!atomic_read(&tl->active_count));
+		atomic_inc(&tl->active_count); /* pin the list element */
 		spin_unlock_irqrestore(&timelines->lock, flags);
 
 		if (timeout > 0) {
@@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 
 		/* Resume iteration after dropping lock */
 		list_safe_reset_next(tl, tn, link);
-		if (!--tl->active_count)
+		if (atomic_dec_and_test(&tl->active_count))
 			list_del(&tl->link);
 
 		mutex_unlock(&tl->mutex);
 
 		/* Defer the final release to after the spinlock */
 		if (refcount_dec_and_test(&tl->kref.refcount)) {
-			GEM_BUG_ON(tl->active_count);
+			GEM_BUG_ON(atomic_read(&tl->active_count));
 			list_add(&tl->link, &free);
 		}
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 16a9e88d93de..4f914f0d5eab 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
 	struct intel_gt_timelines *timelines = &tl->gt->timelines;
 	unsigned long flags;
 
+	/*
+	 * Pretend we are serialised by the timeline->mutex.
+	 *
+	 * While generally true, there are a few exceptions to the rule
+	 * for the engine->kernel_context being used to manage power
+	 * transitions. As the engine_park may be called from under any
+	 * timeline, it uses the power mutex as a global serialisation
+	 * lock to prevent any other request entering its timeline.
+	 *
+	 * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
+	 *
+	 * However, intel_gt_retire_requests() does not know which engine
+	 * it is retiring along and so cannot partake in the engine-pm
+	 * barrier, and there we use the tl->active_count as a means to
+	 * pin the timeline in the active_list while the locks are dropped.
+	 * Ergo, as that is outside of the engine-pm barrier, we need to
+	 * use atomic to manipulate tl->active_count.
+	 */
 	lockdep_assert_held(&tl->mutex);
-
 	GEM_BUG_ON(!atomic_read(&tl->pin_count));
-	if (tl->active_count++)
+
+	if (atomic_add_unless(&tl->active_count, 1, 0))
 		return;
-	GEM_BUG_ON(!tl->active_count); /* overflow? */
 
 	spin_lock_irqsave(&timelines->lock, flags);
-	list_add(&tl->link, &timelines->active_list);
+	if (!atomic_fetch_inc(&tl->active_count))
+		list_add(&tl->link, &timelines->active_list);
 	spin_unlock_irqrestore(&timelines->lock, flags);
 }
 
@@ -351,14 +369,16 @@ void intel_timeline_exit(struct intel_timeline *tl)
 	struct intel_gt_timelines *timelines = &tl->gt->timelines;
 	unsigned long flags;
 
+	/* See intel_timeline_enter() */
 	lockdep_assert_held(&tl->mutex);
 
-	GEM_BUG_ON(!tl->active_count);
-	if (--tl->active_count)
+	GEM_BUG_ON(!atomic_read(&tl->active_count));
+	if (atomic_add_unless(&tl->active_count, -1, 1))
 		return;
 
 	spin_lock_irqsave(&timelines->lock, flags);
-	list_del(&tl->link);
+	if (atomic_dec_and_test(&tl->active_count))
+		list_del(&tl->link);
 	spin_unlock_irqrestore(&timelines->lock, flags);
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 98d9ee166379..5244615ed1cb 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -42,7 +42,7 @@ struct intel_timeline {
 	 * from the intel_context caller plus internal atomicity.
 	 */
 	atomic_t pin_count;
-	unsigned int active_count;
+	atomic_t active_count;
 
 	const u32 *hwsp_seqno;
 	struct i915_vma *hwsp_ggtt;
-- 
2.24.0



* [PATCH 04/19] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.

As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finished queuing the
request to the engine.
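
A minimal single-threaded model of the ordering (hypothetical; the names
only echo the driver functions): releasing the exclusive hold is what
lets other submitters in, so the kernel-context switch must already be
on the queue by then.

```c
#include <assert.h>
#include <stdbool.h>

enum { KERNEL_SWITCH = 0, USER_RQ = 1 };

struct engine {
	bool exclusive_hold;	/* the engine-pm "mutex proxy" */
	int queue[8];
	int count;
};

static bool try_submit(struct engine *e, int rq)
{
	if (e->exclusive_hold)
		return false;	/* must wait for parking to finish */
	e->queue[e->count++] = rq;
	return true;
}

/* Fixed ordering: queue the idle switch, *then* release the hold.
 * Dropped the other way around, a waiter could slip its request in
 * ahead of the switch and the kernel context would execute out of
 * sequence. */
static void park(struct engine *e)
{
	e->queue[e->count++] = KERNEL_SWITCH; /* __i915_request_queue() */
	e->exclusive_hold = false; /* __intel_wakeref_defer_park() */
}
```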

v2: Serialise against concurrent intel_gt_retire_requests()

Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 3c0f490ff2c7..722d3eec226e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -75,6 +75,7 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
+	struct intel_context *ce = engine->kernel_context;
 	struct i915_request *rq;
 	unsigned long flags;
 	bool result = true;
@@ -99,15 +100,13 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	 * retiring the last request, thus all rings should be empty and
 	 * all timelines idle.
 	 */
-	flags = __timeline_mark_lock(engine->kernel_context);
+	flags = __timeline_mark_lock(ce);
 
-	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
+	rq = __i915_request_create(ce, GFP_NOWAIT);
 	if (IS_ERR(rq))
 		/* Context switch failed, hope for the best! Maybe reset? */
 		goto out_unlock;
 
-	intel_timeline_enter(i915_request_timeline(rq));
-
 	/* Check again on the next retirement. */
 	engine->wakeref_serial = engine->serial + 1;
 	i915_request_add_active_barriers(rq);
@@ -116,13 +115,17 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 	__i915_request_commit(rq);
 
+	__i915_request_queue(rq, NULL);
+
 	/* Release our exclusive hold on the engine */
 	__intel_wakeref_defer_park(&engine->wakeref);
-	__i915_request_queue(rq, NULL);
+
+	/* And finally expose ourselves to intel_gt_retire_requests() */
+	intel_timeline_enter(ce->timeline);
 
 	result = false;
 out_unlock:
-	__timeline_mark_unlock(engine->kernel_context, flags);
+	__timeline_mark_unlock(ce, flags);
 	return result;
 }
 
-- 
2.24.0



* [PATCH 05/19] drm/i915/gt: Make intel_ring_unpin() safe for concurrent pin
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

In order to avoid some nasty mutex inversions, commit 09c5ab384f6f
("drm/i915: Keep rings pinned while the context is active") allowed the
intel_ring unpinning to be run concurrently with the next context
pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
ordered in a nice onion with intel_ring_pin() so that the lifetimes
overlapped and were always safe.

Sadly, a few steps in intel_ring_unpin() were overlooked, such as
closing the read/write pointers of the ring and discarding the
intel_ring.vaddr, as these steps were not serialised with
intel_ring_pin() and so could leave the ring in disarray.
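
The onion rule can be sketched as follows (a hypothetical C11 model, not
the driver code): teardown steps that are unsafe against a concurrent
pin are either made atomic or moved to the pin side, where exclusivity
is known. Here the ring reset moves into first-pin, and last-unpin only
marks the backing store purgeable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ring {
	atomic_int pin_count;
	unsigned int emit;	/* bytes submitted to hw */
	unsigned int head, tail;
	char *vaddr;
	bool purgeable;
	char backing[4096];
};

static int ring_pin(struct ring *r)
{
	if (atomic_fetch_add(&r->pin_count, 1))
		return 0;	/* already pinned and mapped */

	r->purgeable = false;

	/* Discard any unused bytes beyond those submitted to hw:
	 * done here, on first pin, so it cannot race with teardown. */
	r->head = r->emit;
	r->tail = r->emit;

	r->vaddr = r->backing;	/* "map" last, after setup */
	return 0;
}

static void ring_unpin(struct ring *r)
{
	if (atomic_fetch_sub(&r->pin_count, 1) != 1)
		return;

	/* Last unpin: deliberately no head/tail reset and no
	 * vaddr = NULL here -- a concurrent ring_pin() may already be
	 * relying on them.  Only mark the backing store purgeable. */
	r->purgeable = true;
}
```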

Fixes: 09c5ab384f6f ("drm/i915: Keep rings pinned while the context is active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ring.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index ece20504d240..374b28f13ca0 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -57,9 +57,10 @@ int intel_ring_pin(struct intel_ring *ring)
 
 	i915_vma_make_unshrinkable(vma);
 
-	GEM_BUG_ON(ring->vaddr);
-	ring->vaddr = addr;
+	/* Discard any unused bytes beyond that submitted to hw. */
+	intel_ring_reset(ring, ring->emit);
 
+	ring->vaddr = addr;
 	return 0;
 
 err_ring:
@@ -85,20 +86,14 @@ void intel_ring_unpin(struct intel_ring *ring)
 	if (!atomic_dec_and_test(&ring->pin_count))
 		return;
 
-	/* Discard any unused bytes beyond that submitted to hw. */
-	intel_ring_reset(ring, ring->emit);
-
 	i915_vma_unset_ggtt_write(vma);
 	if (i915_vma_is_map_and_fenceable(vma))
 		i915_vma_unpin_iomap(vma);
 	else
 		i915_gem_object_unpin_map(vma->obj);
 
-	GEM_BUG_ON(!ring->vaddr);
-	ring->vaddr = NULL;
-
-	i915_vma_unpin(vma);
 	i915_vma_make_purgeable(vma);
+	i915_vma_unpin(vma);
 }
 
 static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as the last context is
completed and the GPU goes idle, we can use our submission tasklet to
notice when the GPU is idle and kick the retire worker. Thus during
light workloads, we will do much more work to idle the GPU faster...
Hopefully with commensurate power saving!

References: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.h |  9 ++++++++-
 drivers/gpu/drm/i915/gt/intel_lrc.c         | 13 +++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
index fde546424c63..94b8758a45d9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
@@ -7,7 +7,9 @@
 #ifndef INTEL_GT_REQUESTS_H
 #define INTEL_GT_REQUESTS_H
 
-struct intel_gt;
+#include <linux/workqueue.h>
+
+#include "intel_gt_types.h"
 
 long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout);
 static inline void intel_gt_retire_requests(struct intel_gt *gt)
@@ -15,6 +17,11 @@ static inline void intel_gt_retire_requests(struct intel_gt *gt)
 	intel_gt_retire_requests_timeout(gt, 0);
 }
 
+static inline void intel_gt_schedule_retire_requests(struct intel_gt *gt)
+{
+	mod_delayed_work(system_wq, &gt->requests.retire_work, 0);
+}
+
 int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
 
 void intel_gt_init_requests(struct intel_gt *gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 33ce258d484f..f7c8fec436a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -142,6 +142,7 @@
 #include "intel_engine_pm.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
+#include "intel_gt_requests.h"
 #include "intel_lrc_reg.h"
 #include "intel_mocs.h"
 #include "intel_reset.h"
@@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
 		if (timeout && preempt_timeout(engine))
 			preempt_reset(engine);
 	}
+
+	/*
+	 * If the GPU is currently idle, retire the outstanding completed
+	 * requests. This will allow us to enter soft-rc6 as soon as possible,
+	 * albeit at the cost of running the retire worker much more frequently
+	 * (over the entire GT not just this engine) and emitting more idle
+	 * barriers (i.e. kernel context switches unpinning all that went
+	 * before) which may add some extra latency.
+	 */
+	if (intel_engine_pm_is_awake(engine) &&
+	    !execlists_active(&engine->execlists))
+		intel_gt_schedule_retire_requests(engine->gt);
 }
 
 static void __execlists_kick(struct intel_engine_execlists *execlists)
-- 
2.24.0

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 07/19] drm/i915: Mark up the calling context for intel_wakeref_put()
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we are normally in process context,
we can typically use a plain mutex, and only defer to a worker when we
know we are called from an interrupt path.

Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.h    | 10 +++++
 drivers/gpu/drm/i915/gt/intel_gt_pm.c        |  1 -
 drivers/gpu/drm/i915/gt/intel_gt_pm.h        |  5 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  2 +-
 drivers/gpu/drm/i915/gt/intel_reset.c        |  2 +-
 drivers/gpu/drm/i915/gt/selftest_engine_pm.c |  7 ++--
 drivers/gpu/drm/i915/i915_active.c           |  5 ++-
 drivers/gpu/drm/i915/i915_pmu.c              |  6 +--
 drivers/gpu/drm/i915/intel_wakeref.c         |  8 ++--
 drivers/gpu/drm/i915/intel_wakeref.h         | 44 ++++++++++++++++----
 11 files changed, 68 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 722d3eec226e..4878d16176d5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -180,7 +180,8 @@ static int __engine_park(struct intel_wakeref *wf)
 
 	engine->execlists.no_priolist = false;
 
-	intel_gt_pm_put(engine->gt);
+	/* While we call i915_vma_parked, we have to break the lock cycle */
+	intel_gt_pm_put_async(engine->gt);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
index 739c50fefcef..467475fce9c6 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
@@ -31,6 +31,16 @@ static inline void intel_engine_pm_put(struct intel_engine_cs *engine)
 	intel_wakeref_put(&engine->wakeref);
 }
 
+static inline void intel_engine_pm_put_async(struct intel_engine_cs *engine)
+{
+	intel_wakeref_put_async(&engine->wakeref);
+}
+
+static inline void intel_engine_pm_unlock_wait(struct intel_engine_cs *engine)
+{
+	intel_wakeref_unlock_wait(&engine->wakeref);
+}
+
 void intel_engine_init__pm(struct intel_engine_cs *engine);
 
 #endif /* INTEL_ENGINE_PM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index e61f752a3cd5..7a9044ac4b75 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -105,7 +105,6 @@ static int __gt_park(struct intel_wakeref *wf)
 static const struct intel_wakeref_ops wf_ops = {
 	.get = __gt_unpark,
 	.put = __gt_park,
-	.flags = INTEL_WAKEREF_PUT_ASYNC,
 };
 
 void intel_gt_pm_init_early(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index b3e17399be9b..990efc27a4e4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -32,6 +32,11 @@ static inline void intel_gt_pm_put(struct intel_gt *gt)
 	intel_wakeref_put(&gt->wakeref);
 }
 
+static inline void intel_gt_pm_put_async(struct intel_gt *gt)
+{
+	intel_wakeref_put_async(&gt->wakeref);
+}
+
 static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
 {
 	return intel_wakeref_wait_for_idle(&gt->wakeref);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f7c8fec436a9..fec46afb9264 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1173,7 +1173,7 @@ __execlists_schedule_out(struct i915_request *rq,
 
 	intel_engine_context_out(engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
-	intel_gt_pm_put(engine->gt);
+	intel_gt_pm_put_async(engine->gt);
 
 	/*
 	 * If this is part of a virtual engine, its next request may
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b7007cd78c6f..0388f9375366 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1125,7 +1125,7 @@ int intel_engine_reset(struct intel_engine_cs *engine, const char *msg)
 out:
 	intel_engine_cancel_stop_cs(engine);
 	reset_finish_engine(engine);
-	intel_engine_pm_put(engine);
+	intel_engine_pm_put_async(engine);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
index 20b9c83f43ad..851a6c4f65c6 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
@@ -51,11 +51,12 @@ static int live_engine_pm(void *arg)
 				pr_err("intel_engine_pm_get_if_awake(%s) failed under %s\n",
 				       engine->name, p->name);
 			else
-				intel_engine_pm_put(engine);
-			intel_engine_pm_put(engine);
+				intel_engine_pm_put_async(engine);
+			intel_engine_pm_put_async(engine);
 			p->critical_section_end();
 
-			/* engine wakeref is sync (instant) */
+			intel_engine_pm_unlock_wait(engine);
+
 			if (intel_engine_pm_is_awake(engine)) {
 				pr_err("%s is still awake after flushing pm\n",
 				       engine->name);
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 5448f37c8102..dca15ace88f6 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -672,12 +672,13 @@ void i915_active_acquire_barrier(struct i915_active *ref)
 	 * populated by i915_request_add_active_barriers() to point to the
 	 * request that will eventually release them.
 	 */
-	spin_lock_irqsave_nested(&ref->tree_lock, flags, SINGLE_DEPTH_NESTING);
 	llist_for_each_safe(pos, next, take_preallocated_barriers(ref)) {
 		struct active_node *node = barrier_from_ll(pos);
 		struct intel_engine_cs *engine = barrier_to_engine(node);
 		struct rb_node **p, *parent;
 
+		spin_lock_irqsave_nested(&ref->tree_lock, flags,
+					 SINGLE_DEPTH_NESTING);
 		parent = NULL;
 		p = &ref->tree.rb_node;
 		while (*p) {
@@ -693,12 +694,12 @@ void i915_active_acquire_barrier(struct i915_active *ref)
 		}
 		rb_link_node(&node->node, parent, p);
 		rb_insert_color(&node->node, &ref->tree);
+		spin_unlock_irqrestore(&ref->tree_lock, flags);
 
 		GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
 		llist_add(barrier_to_ll(node), &engine->barrier_tasks);
 		intel_engine_pm_put(engine);
 	}
-	spin_unlock_irqrestore(&ref->tree_lock, flags);
 }
 
 void i915_request_add_active_barriers(struct i915_request *rq)
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 9b02be0ad4e6..95e824a78d4d 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -190,7 +190,7 @@ static u64 get_rc6(struct intel_gt *gt)
 	val = 0;
 	if (intel_gt_pm_get_if_awake(gt)) {
 		val = __get_rc6(gt);
-		intel_gt_pm_put(gt);
+		intel_gt_pm_put_async(gt);
 	}
 
 	spin_lock_irqsave(&pmu->lock, flags);
@@ -360,7 +360,7 @@ engines_sample(struct intel_gt *gt, unsigned int period_ns)
 skip:
 		if (unlikely(mmio_lock))
 			spin_unlock_irqrestore(mmio_lock, flags);
-		intel_engine_pm_put(engine);
+		intel_engine_pm_put_async(engine);
 	}
 }
 
@@ -398,7 +398,7 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns)
 			if (stat)
 				val = intel_get_cagf(rps, stat);
 
-			intel_gt_pm_put(gt);
+			intel_gt_pm_put_async(gt);
 		}
 
 		add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index 868cc78048d0..9b29176cc1ca 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -54,7 +54,8 @@ int __intel_wakeref_get_first(struct intel_wakeref *wf)
 
 static void ____intel_wakeref_put_last(struct intel_wakeref *wf)
 {
-	if (!atomic_dec_and_test(&wf->count))
+	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count) <= 0);
+	if (unlikely(!atomic_dec_and_test(&wf->count)))
 		goto unlock;
 
 	/* ops->put() must reschedule its own release on error/deferral */
@@ -67,13 +68,12 @@ static void ____intel_wakeref_put_last(struct intel_wakeref *wf)
 	mutex_unlock(&wf->mutex);
 }
 
-void __intel_wakeref_put_last(struct intel_wakeref *wf)
+void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
 {
 	INTEL_WAKEREF_BUG_ON(work_pending(&wf->work));
 
 	/* Assume we are not in process context and so cannot sleep. */
-	if (wf->ops->flags & INTEL_WAKEREF_PUT_ASYNC ||
-	    !mutex_trylock(&wf->mutex)) {
+	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
 		schedule_work(&wf->work);
 		return;
 	}
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index 5f0c972a80fb..688b9b536a69 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -9,6 +9,7 @@
 
 #include <linux/atomic.h>
 #include <linux/bits.h>
+#include <linux/lockdep.h>
 #include <linux/mutex.h>
 #include <linux/refcount.h>
 #include <linux/stackdepot.h>
@@ -29,9 +30,6 @@ typedef depot_stack_handle_t intel_wakeref_t;
 struct intel_wakeref_ops {
 	int (*get)(struct intel_wakeref *wf);
 	int (*put)(struct intel_wakeref *wf);
-
-	unsigned long flags;
-#define INTEL_WAKEREF_PUT_ASYNC BIT(0)
 };
 
 struct intel_wakeref {
@@ -57,7 +55,7 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
 } while (0)
 
 int __intel_wakeref_get_first(struct intel_wakeref *wf);
-void __intel_wakeref_put_last(struct intel_wakeref *wf);
+void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags);
 
 /**
  * intel_wakeref_get: Acquire the wakeref
@@ -100,10 +98,9 @@ intel_wakeref_get_if_active(struct intel_wakeref *wf)
 }
 
 /**
- * intel_wakeref_put: Release the wakeref
- * @i915: the drm_i915_private device
+ * __intel_wakeref_put: Release the wakeref
  * @wf: the wakeref
- * @fn: callback for releasing the wakeref, called only on final release.
+ * @flags: control flags
  *
  * Release our hold on the wakeref. When there are no more users,
  * the runtime pm wakeref will be released after the @fn callback is called
@@ -116,11 +113,25 @@ intel_wakeref_get_if_active(struct intel_wakeref *wf)
  * code otherwise.
  */
 static inline void
-intel_wakeref_put(struct intel_wakeref *wf)
+__intel_wakeref_put(struct intel_wakeref *wf, unsigned long flags)
+#define INTEL_WAKEREF_PUT_ASYNC BIT(0)
 {
 	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count) <= 0);
 	if (unlikely(!atomic_add_unless(&wf->count, -1, 1)))
-		__intel_wakeref_put_last(wf);
+		__intel_wakeref_put_last(wf, flags);
+}
+
+static inline void
+intel_wakeref_put(struct intel_wakeref *wf)
+{
+	might_sleep();
+	__intel_wakeref_put(wf, 0);
+}
+
+static inline void
+intel_wakeref_put_async(struct intel_wakeref *wf)
+{
+	__intel_wakeref_put(wf, INTEL_WAKEREF_PUT_ASYNC);
 }
 
 /**
@@ -151,6 +162,20 @@ intel_wakeref_unlock(struct intel_wakeref *wf)
 	mutex_unlock(&wf->mutex);
 }
 
+/**
+ * intel_wakeref_unlock_wait: Wait until the active callback is complete
+ * @wf: the wakeref
+ *
+ * Waits until the active callback (under @wf->mutex) is complete.
+ */
+static inline void
+intel_wakeref_unlock_wait(struct intel_wakeref *wf)
+{
+	mutex_lock(&wf->mutex);
+	mutex_unlock(&wf->mutex);
+	flush_work(&wf->work);
+}
+
 /**
  * intel_wakeref_is_active: Query whether the wakeref is currently held
  * @wf: the wakeref
@@ -170,6 +195,7 @@ intel_wakeref_is_active(const struct intel_wakeref *wf)
 static inline void
 __intel_wakeref_defer_park(struct intel_wakeref *wf)
 {
+	lockdep_assert_held(&wf->mutex);
 	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count));
 	atomic_set_release(&wf->count, 1);
 }
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 90+ messages in thread

  * the runtime pm wakeref will be released after the @fn callback is called
@@ -116,11 +113,25 @@ intel_wakeref_get_if_active(struct intel_wakeref *wf)
  * code otherwise.
  */
 static inline void
-intel_wakeref_put(struct intel_wakeref *wf)
+__intel_wakeref_put(struct intel_wakeref *wf, unsigned long flags)
+#define INTEL_WAKEREF_PUT_ASYNC BIT(0)
 {
 	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count) <= 0);
 	if (unlikely(!atomic_add_unless(&wf->count, -1, 1)))
-		__intel_wakeref_put_last(wf);
+		__intel_wakeref_put_last(wf, flags);
+}
+
+static inline void
+intel_wakeref_put(struct intel_wakeref *wf)
+{
+	might_sleep();
+	__intel_wakeref_put(wf, 0);
+}
+
+static inline void
+intel_wakeref_put_async(struct intel_wakeref *wf)
+{
+	__intel_wakeref_put(wf, INTEL_WAKEREF_PUT_ASYNC);
 }
 
 /**
@@ -151,6 +162,20 @@ intel_wakeref_unlock(struct intel_wakeref *wf)
 	mutex_unlock(&wf->mutex);
 }
 
+/**
+ * intel_wakeref_unlock_wait: Wait until the active callback is complete
+ * @wf: the wakeref
+ *
+ * Waits until the active callback (under @wf->mutex) is complete.
+ */
+static inline void
+intel_wakeref_unlock_wait(struct intel_wakeref *wf)
+{
+	mutex_lock(&wf->mutex);
+	mutex_unlock(&wf->mutex);
+	flush_work(&wf->work);
+}
+
 /**
  * intel_wakeref_is_active: Query whether the wakeref is currently held
  * @wf: the wakeref
@@ -170,6 +195,7 @@ intel_wakeref_is_active(const struct intel_wakeref *wf)
 static inline void
 __intel_wakeref_defer_park(struct intel_wakeref *wf)
 {
+	lockdep_assert_held(&wf->mutex);
 	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count));
 	atomic_set_release(&wf->count, 1);
 }
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 08/19] drm/i915/gem: Merge GGTT vma flush into a single loop
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

We only need one loop to find the dirty vma, flush their writes and
then flush the chipset.
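
The shape of the change, as a stand-alone userspace sketch (the struct
and helpers here are illustrative stand-ins, not the driver's): flush
the chipset only for the vma whose GGTT-write flag we actually clear,
in a single pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for i915_vma and the GGTT flush. */
struct vma {
	bool ggtt_write;	/* set when the CPU wrote via a GGTT mapping */
	int  flushes;		/* counts chipset flushes issued for this vma */
};

/* Mimics i915_vma_unset_ggtt_write(): clear the flag and report whether
 * it was set, so the caller only flushes vma that were actually dirty. */
static bool unset_ggtt_write(struct vma *v)
{
	bool was_set = v->ggtt_write;

	v->ggtt_write = false;
	return was_set;
}

/* The merged single loop: flush conditionally, instead of flushing
 * unconditionally in a first pass and clearing flags in a second. */
static int flush_dirty(struct vma *vmas, size_t n)
{
	int flushed = 0;

	for (size_t i = 0; i < n; i++) {
		if (unset_ggtt_write(&vmas[i])) {
			vmas[i].flushes++;
			flushed++;
		}
	}
	return flushed;
}
```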

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index db103d3c8760..63bd3ff84f5e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -279,18 +279,12 @@ i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
 
 	switch (obj->write_domain) {
 	case I915_GEM_DOMAIN_GTT:
-		for_each_ggtt_vma(vma, obj)
-			intel_gt_flush_ggtt_writes(vma->vm->gt);
-
-		intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU);
-
 		for_each_ggtt_vma(vma, obj) {
-			if (vma->iomap)
-				continue;
-
-			i915_vma_unset_ggtt_write(vma);
+			if (i915_vma_unset_ggtt_write(vma))
+				intel_gt_flush_ggtt_writes(vma->vm->gt);
 		}
 
+		intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU);
 		break;
 
 	case I915_GEM_DOMAIN_WC:
-- 
2.24.0


* [PATCH 09/19] drm/i915/gt: Only wait for register chipset flush if active
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Only serialise with the chipset using an mmio if the chipset is
currently active. We expect that any writes into the chipset range will
simply be forgotten until it wakes up.
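
The underlying "acquire only if already in use" idiom can be sketched
with C11 atomics (illustrative names and layout, not the i915
implementation): take a reference only while the count is non-zero, so
an idle device is never woken just to be flushed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wakeref with a plain atomic reference count. */
struct wakeref {
	atomic_int count;
};

/* Increment the count unless it is 0, i.e. unless the device is asleep.
 * Returns true if a reference was taken. */
static bool get_if_in_use(struct wakeref *wf)
{
	int old = atomic_load(&wf->count);

	while (old != 0) {
		/* on failure, 'old' is reloaded and the loop retries */
		if (atomic_compare_exchange_weak(&wf->count, &old, old + 1))
			return true;
	}
	return false;
}

static void put_ref(struct wakeref *wf)
{
	atomic_fetch_sub(&wf->count, 1);
}
```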

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index b5a9b87e4ec9..c4fd8d65b8a3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -304,7 +304,7 @@ void intel_gt_flush_ggtt_writes(struct intel_gt *gt)
 
 	intel_gt_chipset_flush(gt);
 
-	with_intel_runtime_pm(uncore->rpm, wakeref) {
+	with_intel_runtime_pm_if_in_use(uncore->rpm, wakeref) {
 		unsigned long flags;
 
 		spin_lock_irqsave(&uncore->lock, flags);
-- 
2.24.0


* [PATCH 10/19] drm/i915: Protect the obj->vma.list during iteration
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Take the obj->vma.lock to prevent modifications to the list as we
iterate, and so avoid the dreaded NULL pointer dereference.

<1>[  347.820823] BUG: kernel NULL pointer dereference, address: 0000000000000150
<1>[  347.820856] #PF: supervisor read access in kernel mode
<1>[  347.820874] #PF: error_code(0x0000) - not-present page
<6>[  347.820892] PGD 0 P4D 0
<4>[  347.820908] Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[  347.820926] CPU: 3 PID: 1303 Comm: gem_persistent_ Tainted: G     U            5.4.0-rc7-CI-CI_DRM_7352+ #1
<4>[  347.820956] Hardware name:  /NUC6CAYB, BIOS AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
<4>[  347.821132] RIP: 0010:i915_gem_object_flush_write_domain+0xd9/0x1d0 [i915]
<4>[  347.821157] Code: 0f 84 e9 00 00 00 48 8b 80 e0 fd ff ff f6 c4 40 75 11 e9 ed 00 00 00 48 8b 80 e0 fd ff ff f6 c4 40 74 26 48 8b 83 b0 00 00 00 <48> 8b b8 50 01 00 00 e8 fb 20 fb ff 48 8b 83 30 03 00 00 49 39 c4
<4>[  347.821210] RSP: 0018:ffffc90000a1f8f8 EFLAGS: 00010202
<4>[  347.821229] RAX: 0000000000000000 RBX: ffffc900008479a0 RCX: 0000000000000018
<4>[  347.821252] RDX: 0000000000000000 RSI: 000000000000000d RDI: ffff888275a090b0
<4>[  347.821274] RBP: ffff8882673c8040 R08: ffff88825991b8d0 R09: 0000000000000000
<4>[  347.821297] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8882673c8280
<4>[  347.821319] R13: ffff8882673c8368 R14: 0000000000000000 R15: ffff888266a54000
<4>[  347.821343] FS:  00007f75865f4240(0000) GS:ffff888277b80000(0000) knlGS:0000000000000000
<4>[  347.821368] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  347.821389] CR2: 0000000000000150 CR3: 000000025aee0000 CR4: 00000000003406e0
<4>[  347.821411] Call Trace:
<4>[  347.821555]  i915_gem_object_prepare_read+0xea/0x2a0 [i915]
<4>[  347.821706]  intel_engine_cmd_parser+0x5ce/0xe90 [i915]
<4>[  347.821834]  ? __i915_sw_fence_complete+0x1a0/0x250 [i915]
<4>[  347.821990]  i915_gem_do_execbuffer+0xb4c/0x2550 [i915]
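
The fix follows the usual pattern of holding the list lock across the
whole walk, so a concurrent unbind cannot unlink nodes mid-iteration.
A minimal userspace sketch, with a pthread mutex standing in for
obj->vma.lock and illustrative node names:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A trivially simplified vma list entry. */
struct node {
	struct node *next;
	int dirty;
};

struct obj {
	pthread_mutex_t lock;	/* plays the role of obj->vma.lock */
	struct node *head;
};

/* Walk the list under the lock; flush (here: clear) any dirty entries.
 * Holding the lock for the full walk keeps 'next' pointers stable. */
static int flush_all_locked(struct obj *o)
{
	int flushed = 0;

	pthread_mutex_lock(&o->lock);
	for (struct node *n = o->head; n; n = n->next) {
		if (n->dirty) {
			n->dirty = 0;
			flushed++;
		}
	}
	pthread_mutex_unlock(&o->lock);
	return flushed;
}
```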

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 63bd3ff84f5e..458945e1823e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -279,10 +279,12 @@ i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
 
 	switch (obj->write_domain) {
 	case I915_GEM_DOMAIN_GTT:
+		spin_lock(&obj->vma.lock);
 		for_each_ggtt_vma(vma, obj) {
 			if (i915_vma_unset_ggtt_write(vma))
 				intel_gt_flush_ggtt_writes(vma->vm->gt);
 		}
+		spin_unlock(&obj->vma.lock);
 
 		intel_frontbuffer_flush(obj->frontbuffer, ORIGIN_CPU);
 		break;
-- 
2.24.0


* [PATCH 11/19] drm/i915: Wait until the intel_wakeref idle callback is complete
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

When waiting for idle, serialise with any ongoing callback so that it
will have completed before completing the wait.
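
The serialisation relies on the classic lock/unlock barrier: taking and
immediately dropping the mutex cannot succeed until any callback running
under that mutex has finished. A hedged userspace sketch of the idiom
(illustrative names, not the driver code):

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int callback_done;

/* Stand-in for the parking callback, which runs under the mutex. */
static void *park_callback(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	usleep(1000);		/* simulate the parking work */
	callback_done = 1;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Equivalent of intel_wakeref_unlock_wait(): a barrier, not a hold.
 * On return, no callback can still be inside the critical section. */
static void unlock_wait(void)
{
	pthread_mutex_lock(&lock);
	pthread_mutex_unlock(&lock);
}
```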

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_wakeref.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index 9b29176cc1ca..91feb53b2942 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -109,8 +109,15 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
 
 int intel_wakeref_wait_for_idle(struct intel_wakeref *wf)
 {
-	return wait_var_event_killable(&wf->wakeref,
-				       !intel_wakeref_is_active(wf));
+	int err;
+
+	err = wait_var_event_killable(&wf->wakeref,
+				      !intel_wakeref_is_active(wf));
+	if (err)
+		return err;
+
+	intel_wakeref_unlock_wait(wf);
+	return 0;
 }
 
 static void wakeref_auto_timeout(struct timer_list *t)
-- 
2.24.0


* [PATCH 12/19] drm/i915/gt: Declare timeline.lock to be irq-free
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Now that we never allow the intel_wakeref callbacks to be invoked from
interrupt context, we do not need the irqsafe spinlock for the timeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c |  9 ++++-----
 drivers/gpu/drm/i915/gt/intel_reset.c       |  9 ++++-----
 drivers/gpu/drm/i915/gt/intel_timeline.c    | 10 ++++------
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 7559d6373f49..74356db43325 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -33,7 +33,6 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 {
 	struct intel_gt_timelines *timelines = &gt->timelines;
 	struct intel_timeline *tl, *tn;
-	unsigned long flags;
 	bool interruptible;
 	LIST_HEAD(free);
 
@@ -43,7 +42,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 
 	flush_submission(gt); /* kick the ksoftirqd tasklets */
 
-	spin_lock_irqsave(&timelines->lock, flags);
+	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
 		if (!mutex_trylock(&tl->mutex))
 			continue;
@@ -51,7 +50,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 		intel_timeline_get(tl);
 		GEM_BUG_ON(!atomic_read(&tl->active_count));
 		atomic_inc(&tl->active_count); /* pin the list element */
-		spin_unlock_irqrestore(&timelines->lock, flags);
+		spin_unlock(&timelines->lock);
 
 		if (timeout > 0) {
 			struct dma_fence *fence;
@@ -67,7 +66,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 
 		retire_requests(tl);
 
-		spin_lock_irqsave(&timelines->lock, flags);
+		spin_lock(&timelines->lock);
 
 		/* Resume iteration after dropping lock */
 		list_safe_reset_next(tl, tn, link);
@@ -82,7 +81,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 			list_add(&tl->link, &free);
 		}
 	}
-	spin_unlock_irqrestore(&timelines->lock, flags);
+	spin_unlock(&timelines->lock);
 
 	list_for_each_entry_safe(tl, tn, &free, link)
 		__intel_timeline_free(&tl->kref);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 0388f9375366..36189238e13c 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -831,7 +831,6 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 {
 	struct intel_gt_timelines *timelines = &gt->timelines;
 	struct intel_timeline *tl;
-	unsigned long flags;
 	bool ok;
 
 	if (!test_bit(I915_WEDGED, &gt->reset.flags))
@@ -853,7 +852,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 	 *
 	 * No more can be submitted until we reset the wedged bit.
 	 */
-	spin_lock_irqsave(&timelines->lock, flags);
+	spin_lock(&timelines->lock);
 	list_for_each_entry(tl, &timelines->active_list, link) {
 		struct dma_fence *fence;
 
@@ -861,7 +860,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 		if (!fence)
 			continue;
 
-		spin_unlock_irqrestore(&timelines->lock, flags);
+		spin_unlock(&timelines->lock);
 
 		/*
 		 * All internal dependencies (i915_requests) will have
@@ -874,10 +873,10 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 		dma_fence_put(fence);
 
 		/* Restart iteration after dropping lock */
-		spin_lock_irqsave(&timelines->lock, flags);
+		spin_lock(&timelines->lock);
 		tl = list_entry(&timelines->active_list, typeof(*tl), link);
 	}
-	spin_unlock_irqrestore(&timelines->lock, flags);
+	spin_unlock(&timelines->lock);
 
 	/* We must reset pending GPU events before restoring our submission */
 	ok = !HAS_EXECLISTS(gt->i915); /* XXX better agnosticism desired */
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 4f914f0d5eab..bd973d950064 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -332,7 +332,6 @@ int intel_timeline_pin(struct intel_timeline *tl)
 void intel_timeline_enter(struct intel_timeline *tl)
 {
 	struct intel_gt_timelines *timelines = &tl->gt->timelines;
-	unsigned long flags;
 
 	/*
 	 * Pretend we are serialised by the timeline->mutex.
@@ -358,16 +357,15 @@ void intel_timeline_enter(struct intel_timeline *tl)
 	if (atomic_add_unless(&tl->active_count, 1, 0))
 		return;
 
-	spin_lock_irqsave(&timelines->lock, flags);
+	spin_lock(&timelines->lock);
 	if (!atomic_fetch_inc(&tl->active_count))
 		list_add(&tl->link, &timelines->active_list);
-	spin_unlock_irqrestore(&timelines->lock, flags);
+	spin_unlock(&timelines->lock);
 }
 
 void intel_timeline_exit(struct intel_timeline *tl)
 {
 	struct intel_gt_timelines *timelines = &tl->gt->timelines;
-	unsigned long flags;
 
 	/* See intel_timeline_enter() */
 	lockdep_assert_held(&tl->mutex);
@@ -376,10 +374,10 @@ void intel_timeline_exit(struct intel_timeline *tl)
 	if (atomic_add_unless(&tl->active_count, -1, 1))
 		return;
 
-	spin_lock_irqsave(&timelines->lock, flags);
+	spin_lock(&timelines->lock);
 	if (atomic_dec_and_test(&tl->active_count))
 		list_del(&tl->link);
-	spin_unlock_irqrestore(&timelines->lock, flags);
+	spin_unlock(&timelines->lock);
 
 	/*
 	 * Since this timeline is idle, all barriers upon which we were waiting
-- 
2.24.0


* [PATCH 13/19] drm/i915/gt: Move new timelines to the end of active_list
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

When adding a new active timeline, place it at the end of the list. This
allows intel_gt_retire_requests() to pick up the newcomer more quickly
and hopefully complete the retirement sooner.
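
The difference between the two insertion helpers can be shown with a
minimal list_head sketch in the style of include/linux/list.h
(simplified, not the kernel's implementation): list_add() inserts just
after the head (LIFO order when walking forwards), while
list_add_tail() inserts just before the head, i.e. at the end (FIFO).

```c
#include <stddef.h>

/* Circular doubly-linked list, as in the kernel. */
struct list_head {
	struct list_head *prev, *next;
};

static void list_init(struct list_head *h)
{
	h->prev = h->next = h;
}

static void __list_add(struct list_head *n,
		       struct list_head *prev, struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Insert just after the head: a forward walk sees the newest first. */
static void list_add(struct list_head *n, struct list_head *h)
{
	__list_add(n, h, h->next);
}

/* Insert just before the head (the tail): a forward walk sees entries
 * in the order they were added. */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	__list_add(n, h->prev, h);
}
```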

References: 7936a22dd466 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index bd973d950064..b190a5d9ab02 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -359,7 +359,7 @@ void intel_timeline_enter(struct intel_timeline *tl)
 
 	spin_lock(&timelines->lock);
 	if (!atomic_fetch_inc(&tl->active_count))
-		list_add(&tl->link, &timelines->active_list);
+		list_add_tail(&tl->link, &timelines->active_list);
 	spin_unlock(&timelines->lock);
 }
 
-- 
2.24.0


* [PATCH 14/19] drm/i915/gt: Schedule next retirement worker first
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

As we may park the gt during request retirement, we may cancel the
retirement worker only to then program the delayed worker once more.

If we schedule the next delayed retirement worker first, then should we
park the gt, the work remains cancelled [until we unpark].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 74356db43325..4dc3cbeb1b36 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -109,9 +109,9 @@ static void retire_work_handler(struct work_struct *work)
 	struct intel_gt *gt =
 		container_of(work, typeof(*gt), requests.retire_work.work);
 
-	intel_gt_retire_requests(gt);
 	schedule_delayed_work(&gt->requests.retire_work,
 			      round_jiffies_up_relative(HZ));
+	intel_gt_retire_requests(gt);
 }
 
 void intel_gt_init_requests(struct intel_gt *gt)
-- 
2.24.0

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 15/19] drm/i915/gt: Flush the requests after wedging on suspend
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Retire all requests if we resort to wedging the driver on suspend. They
will now be idle, so we might as well free them before shutting down.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_gt_pm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 7a9044ac4b75..f6b5169d623f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -256,6 +256,7 @@ static void wait_for_suspend(struct intel_gt *gt)
 		 * the gpu quiet.
 		 */
 		intel_gt_set_wedged(gt);
+		intel_gt_retire_requests(gt);
 	}
 
 	intel_gt_pm_wait_for_idle(gt);
-- 
2.24.0

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 16/19] drm/i915/selftests: Flush the active callbacks
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Before checking the current i915_active state for the asynchronous work
we submitted, flush any ongoing callback. This ensures that our sampling
is robust and does not sporadically fail due to bad timing as the work
is running on another cpu.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/selftest_context.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index 3586af636304..d1203b4c1409 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -48,20 +48,25 @@ static int context_sync(struct intel_context *ce)
 
 	mutex_lock(&tl->mutex);
 	do {
-		struct dma_fence *fence;
+		struct i915_request *rq;
 		long timeout;
 
-		fence = i915_active_fence_get(&tl->last_request);
-		if (!fence)
+		if (list_empty(&tl->requests))
 			break;
 
-		timeout = dma_fence_wait_timeout(fence, false, HZ / 10);
+		rq = list_last_entry(&tl->requests, typeof(*rq), link);
+		i915_request_get(rq);
+
+		timeout = i915_request_wait(rq, 0, HZ / 10);
 		if (timeout < 0)
 			err = timeout;
 		else
-			i915_request_retire_upto(to_request(fence));
+			i915_request_retire_upto(rq);
 
-		dma_fence_put(fence);
+		spin_lock_irq(&rq->lock);
+		spin_unlock_irq(&rq->lock);
+
+		i915_request_put(rq);
 	} while (!err);
 	mutex_unlock(&tl->mutex);
 
@@ -282,6 +287,8 @@ static int __live_active_context(struct intel_engine_cs *engine,
 		err = -EINVAL;
 	}
 
+	intel_engine_pm_unlock_wait(engine);
+
 	if (intel_engine_pm_is_awake(engine)) {
 		struct drm_printer p = drm_debug_printer(__func__);
 
-- 
2.24.0

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 17/19] drm/i915/selftests: Be explicit in ERR_PTR handling
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dan Carpenter

When setting up a full GGTT, we expect the next insert to fail with
-ENOSPC. Simplify the use of ERR_PTR to not confuse either the reader or
smatch.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
References: f40a7b7558ef ("drm/i915: Initial selftests for exercising eviction")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 5f133d177212..06ef88510209 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -198,8 +198,8 @@ static int igt_overcommit(void *arg)
 	quirk_add(obj, &objects);
 
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, 0);
-	if (!IS_ERR(vma) || PTR_ERR(vma) != -ENOSPC) {
-		pr_err("Failed to evict+insert, i915_gem_object_ggtt_pin returned err=%d\n", (int)PTR_ERR(vma));
+	if (vma != ERR_PTR(-ENOSPC)) {
+		pr_err("Failed to evict+insert, i915_gem_object_ggtt_pin returned err=%d\n", (int)PTR_ERR_OR_ZERO(vma));
 		err = -EINVAL;
 		goto cleanup;
 	}
-- 
2.24.0

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 18/19] drm/i915/selftests: Exercise rc6 handling
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Reading from CTX_INFO upsets rc6, requiring us to detect and prevent
possible rc6 context corruption. Poke at the bear!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_rc6.c           |   4 +
 drivers/gpu/drm/i915/gt/selftest_gt_pm.c      |  13 ++
 drivers/gpu/drm/i915/gt/selftest_rc6.c        | 146 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/selftest_rc6.h        |  12 ++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 5 files changed, 176 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_rc6.c
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_rc6.h

diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c
index 977a617a196d..7a0bc6dde009 100644
--- a/drivers/gpu/drm/i915/gt/intel_rc6.c
+++ b/drivers/gpu/drm/i915/gt/intel_rc6.c
@@ -783,3 +783,7 @@ u64 intel_rc6_residency_us(struct intel_rc6 *rc6, i915_reg_t reg)
 {
 	return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(rc6, reg), 1000);
 }
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_rc6.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
index d1752f15702a..1d5bf93d1258 100644
--- a/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_gt_pm.c
@@ -6,6 +6,7 @@
  */
 
 #include "selftest_llc.h"
+#include "selftest_rc6.h"
 
 static int live_gt_resume(void *arg)
 {
@@ -58,3 +59,15 @@ int intel_gt_pm_live_selftests(struct drm_i915_private *i915)
 
 	return intel_gt_live_subtests(tests, &i915->gt);
 }
+
+int intel_gt_pm_late_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(live_rc6_ctx),
+	};
+
+	if (intel_gt_is_wedged(&i915->gt))
+		return 0;
+
+	return intel_gt_live_subtests(tests, &i915->gt);
+}
diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.c b/drivers/gpu/drm/i915/gt/selftest_rc6.c
new file mode 100644
index 000000000000..c8b729f7e93e
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.c
@@ -0,0 +1,146 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "intel_context.h"
+#include "intel_engine_pm.h"
+#include "intel_gt_requests.h"
+#include "intel_ring.h"
+#include "selftest_rc6.h"
+
+#include "selftests/i915_random.h"
+
+static const u32 *__live_rc6_ctx(struct intel_context *ce)
+{
+	struct i915_request *rq;
+	u32 const *result;
+	u32 cmd;
+	u32 *cs;
+
+	rq = intel_context_create_request(ce);
+	if (IS_ERR(rq))
+		return ERR_CAST(rq);
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs)) {
+		i915_request_add(rq);
+		return cs;
+	}
+
+	cmd = MI_STORE_REGISTER_MEM | MI_USE_GGTT;
+	if (INTEL_GEN(rq->i915) >= 8)
+		cmd++;
+
+	*cs++ = cmd;
+	*cs++ = i915_mmio_reg_offset(GEN8_RC6_CTX_INFO);
+	*cs++ = ce->timeline->hwsp_offset + 8;
+	*cs++ = 0;
+	intel_ring_advance(rq, cs);
+
+	result = rq->hwsp_seqno + 2;
+	i915_request_add(rq);
+
+	return result;
+}
+
+static struct intel_engine_cs **
+randomised_engines(struct intel_gt *gt,
+		   struct rnd_state *prng,
+		   unsigned int *count)
+{
+	struct intel_engine_cs *engine, **engines;
+	enum intel_engine_id id;
+	int n;
+
+	n = 0;
+	for_each_engine(engine, gt, id)
+		n++;
+	if (!n)
+		return NULL;
+
+	engines = kmalloc_array(n, sizeof(*engines), GFP_KERNEL);
+	if (!engines)
+		return NULL;
+
+	n = 0;
+	for_each_engine(engine, gt, id)
+		engines[n++] = engine;
+
+	i915_prandom_shuffle(engines, sizeof(*engines), n, prng);
+
+	*count = n;
+	return engines;
+}
+
+int live_rc6_ctx(void *arg)
+{
+	struct intel_gt *gt = arg;
+	struct intel_engine_cs **engines;
+	unsigned int n, count;
+	I915_RND_STATE(prng);
+	int err = 0;
+
+	/* A read of CTX_INFO upsets rc6. Poke the bear! */
+	if (INTEL_GEN(gt->i915) < 8)
+		return 0;
+
+	engines = randomised_engines(gt, &prng, &count);
+	if (!engines)
+		return 0;
+
+	for (n = 0; n < count; n++) {
+		struct intel_engine_cs *engine = engines[n];
+		int pass;
+
+		for (pass = 0; pass < 2; pass++) {
+			struct intel_context *ce;
+			unsigned int resets =
+				i915_reset_engine_count(&gt->i915->gpu_error,
+							engine);
+			const u32 *res;
+
+			/* Use a sacrificial context */
+			ce = intel_context_create(engine->kernel_context->gem_context,
+						  engine);
+			if (IS_ERR(ce)) {
+				err = PTR_ERR(ce);
+				goto out;
+			}
+
+			intel_engine_pm_get(engine);
+			res = __live_rc6_ctx(ce);
+			intel_engine_pm_put(engine);
+			intel_context_put(ce);
+			if (IS_ERR(res)) {
+				err = PTR_ERR(res);
+				goto out;
+			}
+
+			if (intel_gt_wait_for_idle(gt, HZ / 5) == -ETIME) {
+				intel_gt_set_wedged(gt);
+				err = -ETIME;
+				goto out;
+			}
+
+			intel_gt_pm_wait_for_idle(gt);
+			pr_debug("%s: CTX_INFO=%0x\n",
+				 engine->name, READ_ONCE(*res));
+
+			if (resets !=
+			    i915_reset_engine_count(&gt->i915->gpu_error,
+						    engine)) {
+				pr_err("%s: GPU reset required\n",
+				       engine->name);
+				add_taint_for_CI(TAINT_WARN);
+				err = -EIO;
+				goto out;
+			}
+		}
+	}
+
+out:
+	kfree(engines);
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/gt/selftest_rc6.h b/drivers/gpu/drm/i915/gt/selftest_rc6.h
new file mode 100644
index 000000000000..230c6b4c7dc0
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_rc6.h
@@ -0,0 +1,12 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef SELFTEST_RC6_H
+#define SELFTEST_RC6_H
+
+int live_rc6_ctx(void *arg);
+
+#endif /* SELFTEST_RC6_H */
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index 11b40bc58e6d..beff59ee9f6f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -39,3 +39,4 @@ selftest(hangcheck, intel_hangcheck_live_selftests)
 selftest(execlists, intel_execlists_live_selftests)
 selftest(guc, intel_guc_live_selftest)
 selftest(perf, i915_perf_live_selftests)
+selftest(late_gt_pm, intel_gt_pm_late_selftests)
-- 
2.24.0

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH 19/19] drm/i915/gt: Track engine round-trip times
@ 2019-11-18 23:02   ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-18 23:02 UTC (permalink / raw)
  To: intel-gfx

Knowing the round trip time of an engine is useful for tracking the
health of the system as well as providing a metric for the baseline
responsiveness of the engine. We can use the latter metric for
automatically tuning our waits in selftests and when idling so we don't
confuse a slower system with a dead one.

Upon idling the engine, we send one last pulse to switch the context
away from precious user state to the volatile kernel context. We know
the engine is idle at this point, and the pulse is non-preemptable, so
this provides us with a good measurement of the round trip time. A
secondary effect is that by installing an interrupt onto the pulse, we
can flush the engine immediately upon completion, curtailing the
background flush and entering powersaving immediately.

v2: Manage pm wakerefs to avoid too early module unload

References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    | 37 +++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 11 ++++++
 drivers/gpu/drm/i915/gt/intel_gt_pm.h        |  5 +++
 drivers/gpu/drm/i915/intel_wakeref.h         | 13 +++++++
 5 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b9613d044393..2d11db13dc89 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -334,6 +334,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 	/* Nothing to do here, execute in order of dependencies */
 	engine->schedule = NULL;
 
+	ewma_delay_init(&engine->delay);
 	seqlock_init(&engine->stats.lock);
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
@@ -1477,6 +1478,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		drm_printf(m, "*** WEDGED ***\n");
 
 	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
+	drm_printf(m, "\tDelay: %luus\n", ewma_delay_read(&engine->delay));
 
 	rcu_read_lock();
 	rq = READ_ONCE(engine->heartbeat.systole);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 4878d16176d5..27d75990b4ac 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -12,6 +12,7 @@
 #include "intel_engine_pool.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
+#include "intel_gt_requests.h"
 #include "intel_rc6.h"
 #include "intel_ring.h"
 
@@ -73,6 +74,24 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
 
+struct duration_cb {
+	struct dma_fence_cb cb;
+	ktime_t emitted;
+};
+
+static void duration_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
+{
+	struct duration_cb *dcb = container_of(cb, typeof(*dcb), cb);
+	struct intel_engine_cs *engine = to_request(fence)->engine;
+
+	ewma_delay_add(&engine->delay,
+		       ktime_us_delta(ktime_get(), dcb->emitted));
+
+	/* Kick retire for quicker powersaving (soft-rc6). */
+	intel_gt_schedule_retire_requests(engine->gt);
+	intel_gt_pm_put_async(engine->gt);
+}
+
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
 	struct intel_context *ce = engine->kernel_context;
@@ -113,7 +132,23 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 
 	/* Install ourselves as a preemption barrier */
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
-	__i915_request_commit(rq);
+	if (likely(!__i915_request_commit(rq))) { /* engine should be idle! */
+		struct duration_cb *dcb;
+
+		BUILD_BUG_ON(sizeof(*dcb) > sizeof(rq->submitq));
+		dcb = (struct duration_cb *)&rq->submitq;
+
+		/*
+		 * Use an interrupt for precise measurement of duration,
+		 * otherwise we rely on someone else retiring all the requests
+		 * which may delay the signaling (i.e. we will likely wait
+	 * until the background request retirement worker runs every
+		 * second or two).
+		 */
+		__intel_gt_pm_get(rq->engine->gt);
+		dma_fence_add_callback(&rq->fence, &dcb->cb, duration_cb);
+		dcb->emitted = ktime_get();
+	}
 
 	__i915_request_queue(rq, NULL);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 758f0e8ec672..1de121583482 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -7,6 +7,7 @@
 #ifndef __INTEL_ENGINE_TYPES__
 #define __INTEL_ENGINE_TYPES__
 
+#include <linux/average.h>
 #include <linux/hashtable.h>
 #include <linux/irq_work.h>
 #include <linux/kref.h>
@@ -119,6 +120,9 @@ enum intel_engine_id {
 #define INVALID_ENGINE ((enum intel_engine_id)-1)
 };
 
+/* A simple estimator for the round-trip response time of an engine */
+DECLARE_EWMA(delay, 6, 4)
+
 struct st_preempt_hang {
 	struct completion completion;
 	unsigned int count;
@@ -316,6 +320,13 @@ struct intel_engine_cs {
 		struct intel_timeline *timeline;
 	} legacy;
 
+	/*
+	 * We track the average duration of the idle pulse on parking the
+	 * engine to keep an estimate of how fast the engine is
+	 * under ideal conditions.
+	 */
+	struct ewma_delay delay;
+
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
 	 * heavyweight seqno dance, we delegate the task (of being the
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index 990efc27a4e4..4a9e48c12bd4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -22,6 +22,11 @@ static inline void intel_gt_pm_get(struct intel_gt *gt)
 	intel_wakeref_get(&gt->wakeref);
 }
 
+static inline void __intel_gt_pm_get(struct intel_gt *gt)
+{
+	__intel_wakeref_get(&gt->wakeref);
+}
+
 static inline bool intel_gt_pm_get_if_awake(struct intel_gt *gt)
 {
 	return intel_wakeref_get_if_active(&gt->wakeref);
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
index 688b9b536a69..e19b0f8d1794 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -82,6 +82,19 @@ intel_wakeref_get(struct intel_wakeref *wf)
 	return 0;
 }
 
+/**
+ * __intel_wakeref_get: Acquire the wakeref, again
+ * @wf: the wakeref
+ *
+ * Acquire a hold on an already active wakeref.
+ */
+static inline void
+__intel_wakeref_get(struct intel_wakeref *wf)
+{
+	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count) <= 0);
+	atomic_inc(&wf->count);
+}
+
 /**
  * intel_wakeref_get_if_in_use: Acquire the wakeref
  * @wf: the wakeref
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
@ 2019-11-18 23:21   ` Patchwork
  0 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2019-11-18 23:21 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
URL   : https://patchwork.freedesktop.org/series/69647/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
aeceb9b58767 drm/i915/selftests: Force bonded submission to overlap
ec2e43c08c5d drm/i915/gem: Manually dump the debug trace on GEM_BUG_ON
1f9cca09dfa7 drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
0c568bdcf398 drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
7c7145bc1d72 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
4919c27e24e1 drm/i915/gt: Schedule request retirement when submission idles
-:25: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")'
#25: 
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")

total: 1 errors, 0 warnings, 0 checks, 46 lines checked
2771144ad598 drm/i915: Mark up the calling context for intel_wakeref_put()
-:14: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#14: 
References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")

total: 0 errors, 1 warnings, 0 checks, 239 lines checked
95bcf958b4d7 drm/i915/gem: Merge GGTT vma flush into a single loop
cc4cf48b6b93 drm/i915/gt: Only wait for register chipset flush if active
fd291c5b4a09 drm/i915: Protect the obj->vma.list during iteration
-:9: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#9: 
<1>[  347.820823] BUG: kernel NULL pointer dereference, address: 0000000000000150

total: 0 errors, 1 warnings, 0 checks, 12 lines checked
45c38cff15ad drm/i915: Wait until the intel_wakeref idle callback is complete
285c5c0d27d1 drm/i915/gt: Declare timeline.lock to be irq-free
20b312931cfe drm/i915/gt: Move new timelines to the end of active_list
-:10: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#10: 
References: 7936a22dd466 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")

-:10: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7936a22dd466 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")'
#10: 
References: 7936a22dd466 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")

total: 1 errors, 1 warnings, 0 checks, 8 lines checked
8e96143a50b4 drm/i915/gt: Schedule next retirement worker first
5e243d88f528 drm/i915/gt: Flush the requests after wedging on suspend
0c9185cf37bb drm/i915/selftests: Flush the active callbacks
d9e27f4dc70b drm/i915/selftests: Be explicit in ERR_PTR handling
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#11: 
References: f40a7b7558ef ("drm/i915: Initial selftests for exercising eviction")

-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit f40a7b7558ef ("drm/i915: Initial selftests for exercising eviction")'
#11: 
References: f40a7b7558ef ("drm/i915: Initial selftests for exercising eviction")

-:25: WARNING:LONG_LINE: line over 100 characters
#25: FILE: drivers/gpu/drm/i915/selftests/i915_gem_evict.c:202:
+		pr_err("Failed to evict+insert, i915_gem_object_ggtt_pin returned err=%d\n", (int)PTR_ERR_OR_ZERO(vma));

total: 1 errors, 2 warnings, 0 checks, 10 lines checked
bc72aa6adc15 drm/i915/selftests: Exercise rc6 handling
-:54: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#54: 
new file mode 100644

-:59: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#59: FILE: drivers/gpu/drm/i915/gt/selftest_rc6.c:1:
+/*

-:60: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#60: FILE: drivers/gpu/drm/i915/gt/selftest_rc6.c:2:
+ * SPDX-License-Identifier: MIT

-:140: WARNING:LINE_SPACING: Missing a blank line after declarations
#140: FILE: drivers/gpu/drm/i915/gt/selftest_rc6.c:82:
+	unsigned int n, count;
+	I915_RND_STATE(prng);

-:211: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#211: FILE: drivers/gpu/drm/i915/gt/selftest_rc6.h:1:
+/*

-:212: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#212: FILE: drivers/gpu/drm/i915/gt/selftest_rc6.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 6 warnings, 0 checks, 191 lines checked
052a17b70d34 drm/i915/gt: Track engine round-trip times
-:14: WARNING:TYPO_SPELLING: 'preemptable' may be misspelled - perhaps 'preemptible'?
#14: 
the engine is idle at this point, and the pulse is non-preemptable, so

-:22: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")'
#22: 
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")

total: 1 errors, 1 warnings, 0 checks, 128 lines checked

^ permalink raw reply	[flat|nested] 90+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
@ 2019-11-19  0:04   ` Patchwork
  0 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2019-11-19  0:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
URL   : https://patchwork.freedesktop.org/series/69647/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7367 -> Patchwork_15322
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_15322:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live_gem_contexts:
    - {fi-kbl-7560u}:     [PASS][1] -> [TIMEOUT][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html

  * igt@runner@aborted:
    - {fi-kbl-7560u}:     NOTRUN -> [FAIL][3]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-kbl-7560u/igt@runner@aborted.html

  
New tests
---------

  New tests have been introduced between CI_DRM_7367 and Patchwork_15322:

### New IGT tests (1) ###

  * igt@i915_selftest@live_late_gt_pm:
    - Statuses : 35 pass(s)
    - Exec time: [0.39, 1.43] s

  

Known issues
------------

  Here are the changes found in Patchwork_15322 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-icl-dsi:         [PASS][4] -> [FAIL][5] ([fdo#109779] / [fdo#111550])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-icl-dsi/igt@gem_exec_suspend@basic-s4-devices.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-icl-dsi/igt@gem_exec_suspend@basic-s4-devices.html

  * igt@i915_selftest@live_execlists:
    - fi-whl-u:           [PASS][6] -> [INCOMPLETE][7] ([fdo#112066])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-whl-u/igt@i915_selftest@live_execlists.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-whl-u/igt@i915_selftest@live_execlists.html
    - fi-icl-dsi:         [PASS][8] -> [INCOMPLETE][9] ([fdo#107713])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-icl-dsi/igt@i915_selftest@live_execlists.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-icl-dsi/igt@i915_selftest@live_execlists.html

  * igt@i915_selftest@live_sanitycheck:
    - fi-skl-lmem:        [PASS][10] -> [DMESG-WARN][11] ([fdo#112261])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-skl-lmem/igt@i915_selftest@live_sanitycheck.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-skl-lmem/igt@i915_selftest@live_sanitycheck.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][12] -> [FAIL][13] ([fdo#111407])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_blt:
    - fi-bsw-n3050:       [DMESG-FAIL][14] ([fdo#112176]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-bsw-n3050/igt@i915_selftest@live_blt.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-bsw-n3050/igt@i915_selftest@live_blt.html
    - fi-hsw-peppy:       [DMESG-FAIL][16] ([fdo#112225]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-hsw-peppy/igt@i915_selftest@live_blt.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-hsw-peppy/igt@i915_selftest@live_blt.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#109779]: https://bugs.freedesktop.org/show_bug.cgi?id=109779
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [fdo#111550]: https://bugs.freedesktop.org/show_bug.cgi?id=111550
  [fdo#112066]: https://bugs.freedesktop.org/show_bug.cgi?id=112066
  [fdo#112176]: https://bugs.freedesktop.org/show_bug.cgi?id=112176
  [fdo#112225]: https://bugs.freedesktop.org/show_bug.cgi?id=112225
  [fdo#112261]: https://bugs.freedesktop.org/show_bug.cgi?id=112261


Participating hosts (49 -> 43)
------------------------------

  Missing    (6): fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-blb-e6850 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7367 -> Patchwork_15322

  CI-20190529: 20190529
  CI_DRM_7367: db6c0570abdebad7d56bf88eb15ad3b51dad4699 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5293: 4bb46f08f7cb6485642c4351cecdad93072d27a0 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15322: 052a17b70d341927ec36f3dd2fe3b5ae7706fb6d @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

052a17b70d34 drm/i915/gt: Track engine round-trip times
bc72aa6adc15 drm/i915/selftests: Exercise rc6 handling
d9e27f4dc70b drm/i915/selftests: Be explicit in ERR_PTR handling
0c9185cf37bb drm/i915/selftests: Flush the active callbacks
5e243d88f528 drm/i915/gt: Flush the requests after wedging on suspend
8e96143a50b4 drm/i915/gt: Schedule next retirement worker first
20b312931cfe drm/i915/gt: Move new timelines to the end of active_list
285c5c0d27d1 drm/i915/gt: Declare timeline.lock to be irq-free
45c38cff15ad drm/i915: Wait until the intel_wakeref idle callback is complete
fd291c5b4a09 drm/i915: Protect the obj->vma.list during iteration
cc4cf48b6b93 drm/i915/gt: Only wait for register chipset flush if active
95bcf958b4d7 drm/i915/gem: Merge GGTT vma flush into a single loop
2771144ad598 drm/i915: Mark up the calling context for intel_wakeref_put()
4919c27e24e1 drm/i915/gt: Schedule request retirement when submission idles
7c7145bc1d72 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
0c568bdcf398 drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
1f9cca09dfa7 drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
ec2e43c08c5d drm/i915/gem: Manually dump the debug trace on GEM_BUG_ON
aeceb9b58767 drm/i915/selftests: Force bonded submission to overlap

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/index.html

^ permalink raw reply	[flat|nested] 90+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
@ 2019-11-19  0:04   ` Patchwork
  0 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2019-11-19  0:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
URL   : https://patchwork.freedesktop.org/series/69647/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7367 -> Patchwork_15322
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_15322:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live_gem_contexts:
    - {fi-kbl-7560u}:     [PASS][1] -> [TIMEOUT][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-kbl-7560u/igt@i915_selftest@live_gem_contexts.html

  * igt@runner@aborted:
    - {fi-kbl-7560u}:     NOTRUN -> [FAIL][3]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-kbl-7560u/igt@runner@aborted.html

  
New tests
---------

  New tests have been introduced between CI_DRM_7367 and Patchwork_15322:

### New IGT tests (1) ###

  * igt@i915_selftest@live_late_gt_pm:
    - Statuses : 35 pass(s)
    - Exec time: [0.39, 1.43] s

  

Known issues
------------

  Here are the changes found in Patchwork_15322 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-icl-dsi:         [PASS][4] -> [FAIL][5] ([fdo#109779] / [fdo#111550])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-icl-dsi/igt@gem_exec_suspend@basic-s4-devices.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-icl-dsi/igt@gem_exec_suspend@basic-s4-devices.html

  * igt@i915_selftest@live_execlists:
    - fi-whl-u:           [PASS][6] -> [INCOMPLETE][7] ([fdo#112066])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-whl-u/igt@i915_selftest@live_execlists.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-whl-u/igt@i915_selftest@live_execlists.html
    - fi-icl-dsi:         [PASS][8] -> [INCOMPLETE][9] ([fdo#107713])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-icl-dsi/igt@i915_selftest@live_execlists.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-icl-dsi/igt@i915_selftest@live_execlists.html

  * igt@i915_selftest@live_sanitycheck:
    - fi-skl-lmem:        [PASS][10] -> [DMESG-WARN][11] ([fdo#112261])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-skl-lmem/igt@i915_selftest@live_sanitycheck.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-skl-lmem/igt@i915_selftest@live_sanitycheck.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][12] -> [FAIL][13] ([fdo#111407])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_blt:
    - fi-bsw-n3050:       [DMESG-FAIL][14] ([fdo#112176]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-bsw-n3050/igt@i915_selftest@live_blt.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-bsw-n3050/igt@i915_selftest@live_blt.html
    - fi-hsw-peppy:       [DMESG-FAIL][16] ([fdo#112225]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/fi-hsw-peppy/igt@i915_selftest@live_blt.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/fi-hsw-peppy/igt@i915_selftest@live_blt.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).
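
  As a purely illustrative sketch (this is not the actual CI tooling; the
  function and data layout here are hypothetical), the suppression rule above
  amounts to computing the overall status of the difference only from
  non-suppressed elements:

```python
# Hypothetical model of the suppression rule described above: suppressed
# elements are ignored when the overall status of the difference
# (SUCCESS, WARNING, or FAILURE) is computed.

SEVERITY = {"SUCCESS": 0, "WARNING": 1, "FAILURE": 2}

def overall_status(elements):
    """elements: iterable of (status, suppressed) pairs."""
    worst = "SUCCESS"
    for status, suppressed in elements:
        if suppressed:
            continue  # suppressed results do not affect the outcome
        if SEVERITY[status] > SEVERITY[worst]:
            worst = status
    return worst
```

  Under this model, a suppressed FAILURE alongside a non-suppressed WARNING
  would yield an overall WARNING.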

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#109779]: https://bugs.freedesktop.org/show_bug.cgi?id=109779
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [fdo#111550]: https://bugs.freedesktop.org/show_bug.cgi?id=111550
  [fdo#112066]: https://bugs.freedesktop.org/show_bug.cgi?id=112066
  [fdo#112176]: https://bugs.freedesktop.org/show_bug.cgi?id=112176
  [fdo#112225]: https://bugs.freedesktop.org/show_bug.cgi?id=112225
  [fdo#112261]: https://bugs.freedesktop.org/show_bug.cgi?id=112261


Participating hosts (49 -> 43)
------------------------------

  Missing    (6): fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-blb-e6850 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7367 -> Patchwork_15322

  CI-20190529: 20190529
  CI_DRM_7367: db6c0570abdebad7d56bf88eb15ad3b51dad4699 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5293: 4bb46f08f7cb6485642c4351cecdad93072d27a0 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15322: 052a17b70d341927ec36f3dd2fe3b5ae7706fb6d @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

052a17b70d34 drm/i915/gt: Track engine round-trip times
bc72aa6adc15 drm/i915/selftests: Exercise rc6 handling
d9e27f4dc70b drm/i915/selftests: Be explicit in ERR_PTR handling
0c9185cf37bb drm/i915/selftests: Flush the active callbacks
5e243d88f528 drm/i915/gt: Flush the requests after wedging on suspend
8e96143a50b4 drm/i915/gt: Schedule next retirement worker first
20b312931cfe drm/i915/gt: Move new timelines to the end of active_list
285c5c0d27d1 drm/i915/gt: Declare timeline.lock to be irq-free
45c38cff15ad drm/i915: Wait until the intel_wakeref idle callback is complete
fd291c5b4a09 drm/i915: Protect the obj->vma.list during iteration
cc4cf48b6b93 drm/i915/gt: Only wait for register chipset flush if active
95bcf958b4d7 drm/i915/gem: Merge GGTT vma flush into a single loop
2771144ad598 drm/i915: Mark up the calling context for intel_wakeref_put()
4919c27e24e1 drm/i915/gt: Schedule request retirement when submission idles
7c7145bc1d72 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
0c568bdcf398 drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
1f9cca09dfa7 drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
ec2e43c08c5d drm/i915/gem: Manually dump the debug trace on GEM_BUG_ON
aeceb9b58767 drm/i915/selftests: Force bonded submission to overlap

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
@ 2019-11-19  9:08   ` Patchwork
  0 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2019-11-19  9:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap
URL   : https://patchwork.freedesktop.org/series/69647/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7367_full -> Patchwork_15322_full
==================================================================

Summary
-------

  **FAILURE**

  Serious unknown changes introduced in Patchwork_15322_full must be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_15322_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_15322_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_draw_crc@draw-method-xrgb2101010-render-ytiled:
    - shard-skl:          [PASS][1] -> [INCOMPLETE][2] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl6/igt@kms_draw_crc@draw-method-xrgb2101010-render-ytiled.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl10/igt@kms_draw_crc@draw-method-xrgb2101010-render-ytiled.html

  * igt@perf@oa-exponents:
    - shard-kbl:          [PASS][3] -> [TIMEOUT][4] +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl7/igt@perf@oa-exponents.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl1/igt@perf@oa-exponents.html
    - shard-iclb:         [PASS][5] -> [TIMEOUT][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb3/igt@perf@oa-exponents.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb6/igt@perf@oa-exponents.html

  * igt@runner@aborted:
    - shard-kbl:          NOTRUN -> ([FAIL][7], [FAIL][8])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl1/igt@runner@aborted.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl6/igt@runner@aborted.html

  
New tests
---------

  New tests have been introduced between CI_DRM_7367_full and Patchwork_15322_full:

### New IGT tests (1) ###

  * igt@i915_selftest@live_late_gt_pm:
    - Statuses : 4 pass(s)
    - Exec time: [0.31, 2.47] s

  

Known issues
------------

  Here are the changes found in Patchwork_15322_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
    - shard-iclb:         [PASS][9] -> [SKIP][10] ([fdo#110841])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb5/igt@gem_ctx_shared@exec-single-timeline-bsd.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb1/igt@gem_ctx_shared@exec-single-timeline-bsd.html

  * igt@gem_eio@reset-stress:
    - shard-snb:          [PASS][11] -> [FAIL][12] ([fdo#109661])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-snb4/igt@gem_eio@reset-stress.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-snb5/igt@gem_eio@reset-stress.html

  * igt@gem_exec_create@madvise:
    - shard-tglb:         [PASS][13] -> [INCOMPLETE][14] ([fdo#111747]) +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb8/igt@gem_exec_create@madvise.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb6/igt@gem_exec_create@madvise.html

  * igt@gem_exec_reloc@basic-write-read-active:
    - shard-skl:          [PASS][15] -> [DMESG-WARN][16] ([fdo#106107])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl6/igt@gem_exec_reloc@basic-write-read-active.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl10/igt@gem_exec_reloc@basic-write-read-active.html

  * igt@gem_exec_schedule@fifo-bsd:
    - shard-iclb:         [PASS][17] -> [SKIP][18] ([fdo#112146]) +2 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb5/igt@gem_exec_schedule@fifo-bsd.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb1/igt@gem_exec_schedule@fifo-bsd.html

  * igt@gem_exec_suspend@basic-s0:
    - shard-tglb:         [PASS][19] -> [INCOMPLETE][20] ([fdo#111832])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb2/igt@gem_exec_suspend@basic-s0.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb5/igt@gem_exec_suspend@basic-s0.html

  * igt@gem_persistent_relocs@forked-interruptible-thrash-inactive:
    - shard-snb:          [PASS][21] -> [TIMEOUT][22] ([fdo#112068])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-snb4/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-snb7/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html

  * igt@gem_softpin@noreloc-s3:
    - shard-apl:          [PASS][23] -> [DMESG-WARN][24] ([fdo#108566])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl3/igt@gem_softpin@noreloc-s3.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl1/igt@gem_softpin@noreloc-s3.html
    - shard-skl:          [PASS][25] -> [INCOMPLETE][26] ([fdo#104108] / [fdo#107773])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl7/igt@gem_softpin@noreloc-s3.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl2/igt@gem_softpin@noreloc-s3.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-hsw:          [PASS][27] -> [DMESG-WARN][28] ([fdo#111870])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw5/igt@gem_userptr_blits@dmabuf-unsync.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw8/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@i915_pm_dc@dc5-dpms:
    - shard-skl:          [PASS][29] -> [INCOMPLETE][30] ([fdo#108972]) +1 similar issue
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl3/igt@i915_pm_dc@dc5-dpms.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl9/igt@i915_pm_dc@dc5-dpms.html

  * igt@i915_pm_rc6_residency@rc6-accuracy:
    - shard-kbl:          [PASS][31] -> [SKIP][32] ([fdo#109271])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl6/igt@i915_pm_rc6_residency@rc6-accuracy.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl1/igt@i915_pm_rc6_residency@rc6-accuracy.html

  * igt@i915_pm_rpm@system-suspend-execbuf:
    - shard-tglb:         [PASS][33] -> [INCOMPLETE][34] ([fdo#111832] / [fdo#111850])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@i915_pm_rpm@system-suspend-execbuf.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb3/igt@i915_pm_rpm@system-suspend-execbuf.html

  * igt@i915_selftest@live_hangcheck:
    - shard-hsw:          [PASS][35] -> [DMESG-FAIL][36] ([fdo#111991])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw6/igt@i915_selftest@live_hangcheck.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw6/igt@i915_selftest@live_hangcheck.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x128-offscreen:
    - shard-skl:          [PASS][37] -> [FAIL][38] ([fdo#103232])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl3/igt@kms_cursor_crc@pipe-b-cursor-128x128-offscreen.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl9/igt@kms_cursor_crc@pipe-b-cursor-128x128-offscreen.html

  * igt@kms_flip@2x-flip-vs-suspend:
    - shard-hsw:          [PASS][39] -> [INCOMPLETE][40] ([fdo#103540])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw1/igt@kms_flip@2x-flip-vs-suspend.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw8/igt@kms_flip@2x-flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-1p-indfb-fliptrack:
    - shard-tglb:         [PASS][41] -> [FAIL][42] ([fdo#103167]) +2 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb7/igt@kms_frontbuffer_tracking@fbc-1p-indfb-fliptrack.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb4/igt@kms_frontbuffer_tracking@fbc-1p-indfb-fliptrack.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-cpu:
    - shard-iclb:         [PASS][43] -> [INCOMPLETE][44] ([fdo#107713])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb1/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-cpu.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-cpu.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite:
    - shard-iclb:         [PASS][45] -> [FAIL][46] ([fdo#103167]) +3 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-onoff:
    - shard-tglb:         [PASS][47] -> [INCOMPLETE][48] ([fdo#111884])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-onoff.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-onoff.html

  * igt@kms_plane@pixel-format-pipe-b-planes-source-clamping:
    - shard-kbl:          [PASS][49] -> [INCOMPLETE][50] ([fdo#103665])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl1/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl2/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-d-planes:
    - shard-tglb:         [PASS][51] -> [INCOMPLETE][52] ([fdo#111850])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb9/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-d-planes.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb7/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-d-planes.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][53] -> [FAIL][54] ([fdo#108145] / [fdo#110403])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl2/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-iclb:         [PASS][55] -> [FAIL][56] ([fdo#103166])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb2/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb4/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_psr@psr2_basic:
    - shard-iclb:         [PASS][57] -> [SKIP][58] ([fdo#109441])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb2/igt@kms_psr@psr2_basic.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb4/igt@kms_psr@psr2_basic.html

  * igt@kms_setmode@basic:
    - shard-apl:          [PASS][59] -> [FAIL][60] ([fdo#99912])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl2/igt@kms_setmode@basic.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl8/igt@kms_setmode@basic.html
    - shard-glk:          [PASS][61] -> [FAIL][62] ([fdo#99912])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-glk2/igt@kms_setmode@basic.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-glk8/igt@kms_setmode@basic.html
    - shard-kbl:          [PASS][63] -> [FAIL][64] ([fdo#99912])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl7/igt@kms_setmode@basic.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl6/igt@kms_setmode@basic.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
    - shard-skl:          [PASS][65] -> [INCOMPLETE][66] ([fdo#111747])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl1/igt@perf@gen8-unprivileged-single-ctx-counters.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl5/igt@perf@gen8-unprivileged-single-ctx-counters.html

  * igt@perf_pmu@busy-no-semaphores-vcs1:
    - shard-iclb:         [PASS][67] -> [SKIP][68] ([fdo#112080]) +2 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb4/igt@perf_pmu@busy-no-semaphores-vcs1.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb7/igt@perf_pmu@busy-no-semaphores-vcs1.html

  * igt@prime_busy@hang-bsd2:
    - shard-iclb:         [PASS][69] -> [SKIP][70] ([fdo#109276]) +8 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb1/igt@prime_busy@hang-bsd2.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb8/igt@prime_busy@hang-bsd2.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs1-clean:
    - shard-iclb:         [SKIP][71] ([fdo#109276] / [fdo#112080]) -> [PASS][72] +1 similar issue
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb5/igt@gem_ctx_isolation@vcs1-clean.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb1/igt@gem_ctx_isolation@vcs1-clean.html

  * igt@gem_exec_create@forked:
    - shard-tglb:         [INCOMPLETE][73] ([fdo#108838] / [fdo#111747]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@gem_exec_create@forked.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb5/igt@gem_exec_create@forked.html

  * igt@gem_exec_reuse@single:
    - shard-tglb:         [INCOMPLETE][75] ([fdo#111747]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@gem_exec_reuse@single.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb2/igt@gem_exec_reuse@single.html

  * igt@gem_exec_schedule@fifo-bsd1:
    - shard-iclb:         [SKIP][77] ([fdo#109276]) -> [PASS][78] +8 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb3/igt@gem_exec_schedule@fifo-bsd1.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb2/igt@gem_exec_schedule@fifo-bsd1.html

  * igt@gem_exec_schedule@in-order-bsd:
    - shard-iclb:         [SKIP][79] ([fdo#112146]) -> [PASS][80] +2 similar issues
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb1/igt@gem_exec_schedule@in-order-bsd.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb8/igt@gem_exec_schedule@in-order-bsd.html

  * igt@gem_exec_schedule@preempt-queue-bsd2:
    - shard-tglb:         [INCOMPLETE][81] ([fdo#111606] / [fdo#111677]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@gem_exec_schedule@preempt-queue-bsd2.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb2/igt@gem_exec_schedule@preempt-queue-bsd2.html

  * igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive:
    - shard-apl:          [DMESG-FAIL][83] ([fdo#112309]) -> [PASS][84]
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl2/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl8/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html

  * igt@gem_persistent_relocs@forked-interruptible-thrash-inactive:
    - shard-kbl:          [TIMEOUT][85] ([fdo#112068]) -> [PASS][86]
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl1/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl4/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-snb:          [DMESG-WARN][87] ([fdo#111870]) -> [PASS][88] +1 similar issue
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-snb1/igt@gem_userptr_blits@dmabuf-sync.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-snb5/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@map-fixed-invalidate-busy-gup:
    - shard-hsw:          [DMESG-WARN][89] ([fdo#111870]) -> [PASS][90] +2 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw8/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw6/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html

  * igt@i915_pm_rpm@system-suspend:
    - shard-tglb:         [INCOMPLETE][91] ([fdo#111747] / [fdo#111850]) -> [PASS][92]
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb3/igt@i915_pm_rpm@system-suspend.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb3/igt@i915_pm_rpm@system-suspend.html

  * igt@i915_selftest@live_execlists:
    - shard-glk:          [TIMEOUT][93] -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-glk2/igt@i915_selftest@live_execlists.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-glk8/igt@i915_selftest@live_execlists.html

  * igt@i915_suspend@sysfs-reader:
    - shard-tglb:         [INCOMPLETE][95] ([fdo#111832] / [fdo#111850]) -> [PASS][96] +4 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb3/igt@i915_suspend@sysfs-reader.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb6/igt@i915_suspend@sysfs-reader.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-kbl:          [DMESG-WARN][97] ([fdo#108566]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl2/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl2/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x42-random:
    - shard-skl:          [FAIL][99] ([fdo#103232]) -> [PASS][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl6/igt@kms_cursor_crc@pipe-b-cursor-128x42-random.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl1/igt@kms_cursor_crc@pipe-b-cursor-128x42-random.html

  * igt@kms_cursor_crc@pipe-d-cursor-suspend:
    - shard-tglb:         [INCOMPLETE][101] ([fdo#111850]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb4/igt@kms_cursor_crc@pipe-d-cursor-suspend.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb9/igt@kms_cursor_crc@pipe-d-cursor-suspend.html

  * igt@kms_flip@2x-flip-vs-suspend-interruptible:
    - shard-hsw:          [INCOMPLETE][103] ([fdo#103540]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw5/igt@kms_flip@2x-flip-vs-suspend-interruptible.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw1/igt@kms_flip@2x-flip-vs-suspend-interruptible.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-apl:          [DMESG-WARN][105] ([fdo#108566]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl1/igt@kms_flip@flip-vs-suspend-interruptible.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl2/igt@kms_flip@flip-vs-suspend-interrupt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread


  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite:
    - shard-iclb:         [PASS][45] -> [FAIL][46] ([fdo#103167]) +3 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb2/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-onoff:
    - shard-tglb:         [PASS][47] -> [INCOMPLETE][48] ([fdo#111884])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-onoff.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-onoff.html

  * igt@kms_plane@pixel-format-pipe-b-planes-source-clamping:
    - shard-kbl:          [PASS][49] -> [INCOMPLETE][50] ([fdo#103665])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl1/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl2/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-d-planes:
    - shard-tglb:         [PASS][51] -> [INCOMPLETE][52] ([fdo#111850])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb9/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-d-planes.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb7/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-d-planes.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][53] -> [FAIL][54] ([fdo#108145] / [fdo#110403])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl2/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-iclb:         [PASS][55] -> [FAIL][56] ([fdo#103166])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb2/igt@kms_plane_lowres@pipe-a-tiling-y.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb4/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_psr@psr2_basic:
    - shard-iclb:         [PASS][57] -> [SKIP][58] ([fdo#109441])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb2/igt@kms_psr@psr2_basic.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb4/igt@kms_psr@psr2_basic.html

  * igt@kms_setmode@basic:
    - shard-apl:          [PASS][59] -> [FAIL][60] ([fdo#99912])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl2/igt@kms_setmode@basic.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl8/igt@kms_setmode@basic.html
    - shard-glk:          [PASS][61] -> [FAIL][62] ([fdo#99912])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-glk2/igt@kms_setmode@basic.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-glk8/igt@kms_setmode@basic.html
    - shard-kbl:          [PASS][63] -> [FAIL][64] ([fdo#99912])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl7/igt@kms_setmode@basic.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl6/igt@kms_setmode@basic.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
    - shard-skl:          [PASS][65] -> [INCOMPLETE][66] ([fdo#111747])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl1/igt@perf@gen8-unprivileged-single-ctx-counters.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl5/igt@perf@gen8-unprivileged-single-ctx-counters.html

  * igt@perf_pmu@busy-no-semaphores-vcs1:
    - shard-iclb:         [PASS][67] -> [SKIP][68] ([fdo#112080]) +2 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb4/igt@perf_pmu@busy-no-semaphores-vcs1.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb7/igt@perf_pmu@busy-no-semaphores-vcs1.html

  * igt@prime_busy@hang-bsd2:
    - shard-iclb:         [PASS][69] -> [SKIP][70] ([fdo#109276]) +8 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb1/igt@prime_busy@hang-bsd2.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb8/igt@prime_busy@hang-bsd2.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs1-clean:
    - shard-iclb:         [SKIP][71] ([fdo#109276] / [fdo#112080]) -> [PASS][72] +1 similar issue
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb5/igt@gem_ctx_isolation@vcs1-clean.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb1/igt@gem_ctx_isolation@vcs1-clean.html

  * igt@gem_exec_create@forked:
    - shard-tglb:         [INCOMPLETE][73] ([fdo#108838] / [fdo#111747]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@gem_exec_create@forked.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb5/igt@gem_exec_create@forked.html

  * igt@gem_exec_reuse@single:
    - shard-tglb:         [INCOMPLETE][75] ([fdo#111747]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@gem_exec_reuse@single.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb2/igt@gem_exec_reuse@single.html

  * igt@gem_exec_schedule@fifo-bsd1:
    - shard-iclb:         [SKIP][77] ([fdo#109276]) -> [PASS][78] +8 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb3/igt@gem_exec_schedule@fifo-bsd1.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb2/igt@gem_exec_schedule@fifo-bsd1.html

  * igt@gem_exec_schedule@in-order-bsd:
    - shard-iclb:         [SKIP][79] ([fdo#112146]) -> [PASS][80] +2 similar issues
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-iclb1/igt@gem_exec_schedule@in-order-bsd.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-iclb8/igt@gem_exec_schedule@in-order-bsd.html

  * igt@gem_exec_schedule@preempt-queue-bsd2:
    - shard-tglb:         [INCOMPLETE][81] ([fdo#111606] / [fdo#111677]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb6/igt@gem_exec_schedule@preempt-queue-bsd2.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb2/igt@gem_exec_schedule@preempt-queue-bsd2.html

  * igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive:
    - shard-apl:          [DMESG-FAIL][83] ([fdo#112309]) -> [PASS][84]
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl2/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl8/igt@gem_persistent_relocs@forked-interruptible-faulting-reloc-thrash-inactive.html

  * igt@gem_persistent_relocs@forked-interruptible-thrash-inactive:
    - shard-kbl:          [TIMEOUT][85] ([fdo#112068]) -> [PASS][86]
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl1/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl4/igt@gem_persistent_relocs@forked-interruptible-thrash-inactive.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-snb:          [DMESG-WARN][87] ([fdo#111870]) -> [PASS][88] +1 similar issue
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-snb1/igt@gem_userptr_blits@dmabuf-sync.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-snb5/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@map-fixed-invalidate-busy-gup:
    - shard-hsw:          [DMESG-WARN][89] ([fdo#111870]) -> [PASS][90] +2 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw8/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw6/igt@gem_userptr_blits@map-fixed-invalidate-busy-gup.html

  * igt@i915_pm_rpm@system-suspend:
    - shard-tglb:         [INCOMPLETE][91] ([fdo#111747] / [fdo#111850]) -> [PASS][92]
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb3/igt@i915_pm_rpm@system-suspend.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb3/igt@i915_pm_rpm@system-suspend.html

  * igt@i915_selftest@live_execlists:
    - shard-glk:          [TIMEOUT][93] -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-glk2/igt@i915_selftest@live_execlists.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-glk8/igt@i915_selftest@live_execlists.html

  * igt@i915_suspend@sysfs-reader:
    - shard-tglb:         [INCOMPLETE][95] ([fdo#111832] / [fdo#111850]) -> [PASS][96] +4 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb3/igt@i915_suspend@sysfs-reader.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb6/igt@i915_suspend@sysfs-reader.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-kbl:          [DMESG-WARN][97] ([fdo#108566]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-kbl2/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-kbl2/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x42-random:
    - shard-skl:          [FAIL][99] ([fdo#103232]) -> [PASS][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-skl6/igt@kms_cursor_crc@pipe-b-cursor-128x42-random.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-skl1/igt@kms_cursor_crc@pipe-b-cursor-128x42-random.html

  * igt@kms_cursor_crc@pipe-d-cursor-suspend:
    - shard-tglb:         [INCOMPLETE][101] ([fdo#111850]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-tglb4/igt@kms_cursor_crc@pipe-d-cursor-suspend.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-tglb9/igt@kms_cursor_crc@pipe-d-cursor-suspend.html

  * igt@kms_flip@2x-flip-vs-suspend-interruptible:
    - shard-hsw:          [INCOMPLETE][103] ([fdo#103540]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-hsw5/igt@kms_flip@2x-flip-vs-suspend-interruptible.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-hsw1/igt@kms_flip@2x-flip-vs-suspend-interruptible.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-apl:          [DMESG-WARN][105] ([fdo#108566]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7367/shard-apl1/igt@kms_flip@flip-vs-suspend-interruptible.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/shard-apl2/igt@kms_flip@flip-vs-suspend-interrupt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15322/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
@ 2019-11-19 14:15     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 14:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Matthew Auld


On 18/11/2019 23:02, Chris Wilson wrote:
> The general concept was that intel_timeline.active_count was locked by
> the intel_timeline.mutex. The exception was for power management, where
> the engine->kernel_context->timeline could be manipulated under the
> global wakeref.mutex.
> 
> This was quite solid, as we always manipulated the timeline only while
> we held an engine wakeref.
> 
> And then we started retiring requests outside of struct_mutex, only
> using the timelines.active_list and the timeline->mutex. There we
> started manipulating intel_timeline.active_count outside of an engine
> wakeref, and so introduced a race between __engine_park() and
> intel_gt_retire_requests(), a race that could result in the
> engine->kernel_context not being added to the active timelines and so
> losing requests, which caused us to keep the system permanently powered
> up [and unloadable].
> 
> The race would be easy to close if we could take the engine wakeref for
> the timeline before we retire -- except timelines are not bound to any
> engine and so we would need to keep all active engines awake. The
> alternative is to guard intel_timeline_enter/intel_timeline_exit for use
> outside of the timeline->mutex.
> 
> Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
>   drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
>   .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
>   3 files changed, 32 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index a79e6efb31a2..7559d6373f49 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   			continue;
>   
>   		intel_timeline_get(tl);
> -		GEM_BUG_ON(!tl->active_count);
> -		tl->active_count++; /* pin the list element */
> +		GEM_BUG_ON(!atomic_read(&tl->active_count));
> +		atomic_inc(&tl->active_count); /* pin the list element */
>   		spin_unlock_irqrestore(&timelines->lock, flags);
>   
>   		if (timeout > 0) {
> @@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   
>   		/* Resume iteration after dropping lock */
>   		list_safe_reset_next(tl, tn, link);
> -		if (!--tl->active_count)
> +		if (atomic_dec_and_test(&tl->active_count))
>   			list_del(&tl->link);
>   
>   		mutex_unlock(&tl->mutex);
>   
>   		/* Defer the final release to after the spinlock */
>   		if (refcount_dec_and_test(&tl->kref.refcount)) {
> -			GEM_BUG_ON(tl->active_count);
> +			GEM_BUG_ON(atomic_read(&tl->active_count));
>   			list_add(&tl->link, &free);
>   		}
>   	}
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> index 16a9e88d93de..4f914f0d5eab 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> @@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
>   	struct intel_gt_timelines *timelines = &tl->gt->timelines;
>   	unsigned long flags;
>   
> +	/*
> +	 * Pretend we are serialised by the timeline->mutex.
> +	 *
> +	 * While generally true, there are a few exceptions to the rule
> +	 * for the engine->kernel_context being used to manage power
> +	 * transitions. As the engine_park may be called from under any
> +	 * timeline, it uses the power mutex as a global serialisation
> +	 * lock to prevent any other request entering its timeline.
> +	 *
> +	 * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
> +	 *
> +	 * However, intel_gt_retire_requests() does not know which engine
> +	 * it is retiring along and so cannot partake in the engine-pm
> +	 * barrier, and there we use the tl->active_count as a means to
> +	 * pin the timeline in the active_list while the locks are dropped.
> +	 * Ergo, as that is outside of the engine-pm barrier, we need to
> +	 * use atomic to manipulate tl->active_count.
> +	 */
>   	lockdep_assert_held(&tl->mutex);
> -
>   	GEM_BUG_ON(!atomic_read(&tl->pin_count));
> -	if (tl->active_count++)
> +
> +	if (atomic_add_unless(&tl->active_count, 1, 0))
>   		return;
> -	GEM_BUG_ON(!tl->active_count); /* overflow? */
>   
>   	spin_lock_irqsave(&timelines->lock, flags);
> -	list_add(&tl->link, &timelines->active_list);
> +	if (!atomic_fetch_inc(&tl->active_count))
> +		list_add(&tl->link, &timelines->active_list);

So retirement raced with this and has elevated the active_count? But 
retirement does not add the timeline to the list, so we exit here 
without it on the active_list.

>   	spin_unlock_irqrestore(&timelines->lock, flags);
>   }
>   
> @@ -351,14 +369,16 @@ void intel_timeline_exit(struct intel_timeline *tl)
>   	struct intel_gt_timelines *timelines = &tl->gt->timelines;
>   	unsigned long flags;
>   
> +	/* See intel_timeline_enter() */
>   	lockdep_assert_held(&tl->mutex);
>   
> -	GEM_BUG_ON(!tl->active_count);
> -	if (--tl->active_count)
> +	GEM_BUG_ON(!atomic_read(&tl->active_count));
> +	if (atomic_add_unless(&tl->active_count, -1, 1))
>   		return;
>   
>   	spin_lock_irqsave(&timelines->lock, flags);
> -	list_del(&tl->link);
> +	if (atomic_dec_and_test(&tl->active_count))
> +		list_del(&tl->link);

This one I can understand because retirement would have unlinked it.

>   	spin_unlock_irqrestore(&timelines->lock, flags);
>   
>   	/*
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> index 98d9ee166379..5244615ed1cb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
> @@ -42,7 +42,7 @@ struct intel_timeline {
>   	 * from the intel_context caller plus internal atomicity.
>   	 */
>   	atomic_t pin_count;
> -	unsigned int active_count;
> +	atomic_t active_count;
>   
>   	const u32 *hwsp_seqno;
>   	struct i915_vma *hwsp_ggtt;
> 

Regards,

Tvrtko


* Re: [PATCH 04/19] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
@ 2019-11-19 14:35     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 14:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
> the backend"), I erroneously concluded that we last modify the engine
> inside __i915_request_commit() meaning that we could enable concurrent
> submission for userspace as we enqueued this request. However, this
> falls into a trap with other users of the engine->kernel_context waking
> up and submitting their request before the idle-switch is queued, with
> the result that the kernel_context is executed out-of-sequence most
> likely upsetting the GPU and certainly ourselves when we try to retire
> the out-of-sequence requests.
> 
> As such we need to hold onto the effective engine->kernel_context mutex
> lock (via the engine pm mutex proxy) until we have finished queuing the
> request to the engine.
> 
> v2: Serialise against concurrent intel_gt_retire_requests()
> 
> Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 15 +++++++++------
>   1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 3c0f490ff2c7..722d3eec226e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -75,6 +75,7 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
>   
>   static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   {
> +	struct intel_context *ce = engine->kernel_context;
>   	struct i915_request *rq;
>   	unsigned long flags;
>   	bool result = true;
> @@ -99,15 +100,13 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   	 * retiring the last request, thus all rings should be empty and
>   	 * all timelines idle.
>   	 */
> -	flags = __timeline_mark_lock(engine->kernel_context);
> +	flags = __timeline_mark_lock(ce);
>   
> -	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
> +	rq = __i915_request_create(ce, GFP_NOWAIT);
>   	if (IS_ERR(rq))
>   		/* Context switch failed, hope for the best! Maybe reset? */
>   		goto out_unlock;
>   
> -	intel_timeline_enter(i915_request_timeline(rq));
> -
>   	/* Check again on the next retirement. */
>   	engine->wakeref_serial = engine->serial + 1;
>   	i915_request_add_active_barriers(rq);
> @@ -116,13 +115,17 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
>   	__i915_request_commit(rq);
>   
> +	__i915_request_queue(rq, NULL);
> +
>   	/* Release our exclusive hold on the engine */
>   	__intel_wakeref_defer_park(&engine->wakeref);
> -	__i915_request_queue(rq, NULL);
> +
> +	/* And finally expose our selfselves to intel_gt_retire_requests() 

ourselves

*/
> +	intel_timeline_enter(ce->timeline);

I haven't really managed to follow this.

What are these other clients which can queue requests and what in this 
block prevents them doing so?

The change seems to be moving the queuing to before 
__intel_wakeref_defer_park and intel_timeline_enter to last. Wakeref 
defer extends the engine lifetime until the submitted rq is retired. But 
how is that considered "unlocking"?

Regards,

Tvrtko

>   
>   	result = false;
>   out_unlock:
> -	__timeline_mark_unlock(engine->kernel_context, flags);
> +	__timeline_mark_unlock(ce, flags);
>   	return result;
>   }
>   
> 


* Re: [Intel-gfx] [PATCH 04/19] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
@ 2019-11-19 14:35     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 14:35 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
> the backend"), I erroneously concluded that we last modify the engine
> inside __i915_request_commit() meaning that we could enable concurrent
> submission for userspace as we enqueued this request. However, this
> falls into a trap with other users of the engine->kernel_context waking
> up and submitting their request before the idle-switch is queued, with
> the result that the kernel_context is executed out-of-sequence most
> likely upsetting the GPU and certainly ourselves when we try to retire
> the out-of-sequence requests.
> 
> As such we need to hold onto the effective engine->kernel_context mutex
> lock (via the engine pm mutex proxy) until we have finished queuing the
> request to the engine.
> 
> v2: Serialise against concurrent intel_gt_retire_requests()
> 
> Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 15 +++++++++------
>   1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 3c0f490ff2c7..722d3eec226e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -75,6 +75,7 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
>   
>   static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   {
> +	struct intel_context *ce = engine->kernel_context;
>   	struct i915_request *rq;
>   	unsigned long flags;
>   	bool result = true;
> @@ -99,15 +100,13 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   	 * retiring the last request, thus all rings should be empty and
>   	 * all timelines idle.
>   	 */
> -	flags = __timeline_mark_lock(engine->kernel_context);
> +	flags = __timeline_mark_lock(ce);
>   
> -	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
> +	rq = __i915_request_create(ce, GFP_NOWAIT);
>   	if (IS_ERR(rq))
>   		/* Context switch failed, hope for the best! Maybe reset? */
>   		goto out_unlock;
>   
> -	intel_timeline_enter(i915_request_timeline(rq));
> -
>   	/* Check again on the next retirement. */
>   	engine->wakeref_serial = engine->serial + 1;
>   	i915_request_add_active_barriers(rq);
> @@ -116,13 +115,17 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
>   	__i915_request_commit(rq);
>   
> +	__i915_request_queue(rq, NULL);
> +
>   	/* Release our exclusive hold on the engine */
>   	__intel_wakeref_defer_park(&engine->wakeref);
> -	__i915_request_queue(rq, NULL);
> +
> +	/* And finally expose our selfselves to intel_gt_retire_requests() 

ourselves

*/
> +	intel_timeline_enter(ce->timeline);

I haven't really managed to follow this.

What are these other clients which can queue requests and what in this 
block prevents them doing so?

The change seems to be moving the queuing to before 
__intel_wakeref_defer_park and intel_timeline_enter to last. Wakeref 
defer extends the engine lifetime until the submitted rq is retired. But 
how is that considered "unlocking"?

Regards,

Tvrtko

>   
>   	result = false;
>   out_unlock:
> -	__timeline_mark_unlock(engine->kernel_context, flags);
> +	__timeline_mark_unlock(ce, flags);
>   	return result;
>   }
>   
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
@ 2019-11-19 14:41       ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 14:41 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Matthew Auld

Quoting Tvrtko Ursulin (2019-11-19 14:15:49)
> 
> On 18/11/2019 23:02, Chris Wilson wrote:
> > The general concept was that intel_timeline.active_count was locked by
> > the intel_timeline.mutex. The exception was for power management, where
> > the engine->kernel_context->timeline could be manipulated under the
> > global wakeref.mutex.
> > 
> > This was quite solid, as we always manipulated the timeline only while
> > we held an engine wakeref.
> > 
> > And then we started retiring requests outside of struct_mutex, only
> > using the timelines.active_list and the timeline->mutex. There we
> > started manipulating intel_timeline.active_count outside of an engine
> > wakeref, and so introduced a race between __engine_park() and
> > intel_gt_retire_requests(), a race that could result in the
> > engine->kernel_context not being added to the active timelines and so
> > losing requests, which caused us to keep the system permanently powered
> > up [and unloadable].
> > 
> > The race would be easy to close if we could take the engine wakeref for
> > the timeline before we retire -- except timelines are not bound to any
> > engine and so we would need to keep all active engines awake. The
> > alternative is to guard intel_timeline_enter/intel_timeline_exit for use
> > outside of the timeline->mutex.
> > 
> > Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
> >   drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
> >   .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
> >   3 files changed, 32 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > index a79e6efb31a2..7559d6373f49 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > @@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
> >                       continue;
> >   
> >               intel_timeline_get(tl);
> > -             GEM_BUG_ON(!tl->active_count);
> > -             tl->active_count++; /* pin the list element */
> > +             GEM_BUG_ON(!atomic_read(&tl->active_count));
> > +             atomic_inc(&tl->active_count); /* pin the list element */
> >               spin_unlock_irqrestore(&timelines->lock, flags);
> >   
> >               if (timeout > 0) {
> > @@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
> >   
> >               /* Resume iteration after dropping lock */
> >               list_safe_reset_next(tl, tn, link);
> > -             if (!--tl->active_count)
> > +             if (atomic_dec_and_test(&tl->active_count))
> >                       list_del(&tl->link);
> >   
> >               mutex_unlock(&tl->mutex);
> >   
> >               /* Defer the final release to after the spinlock */
> >               if (refcount_dec_and_test(&tl->kref.refcount)) {
> > -                     GEM_BUG_ON(tl->active_count);
> > +                     GEM_BUG_ON(atomic_read(&tl->active_count));
> >                       list_add(&tl->link, &free);
> >               }
> >       }
> > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > index 16a9e88d93de..4f914f0d5eab 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > @@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
> >       struct intel_gt_timelines *timelines = &tl->gt->timelines;
> >       unsigned long flags;
> >   
> > +     /*
> > +      * Pretend we are serialised by the timeline->mutex.
> > +      *
> > +      * While generally true, there are a few exceptions to the rule
> > +      * for the engine->kernel_context being used to manage power
> > +      * transitions. As the engine_park may be called from under any
> > +      * timeline, it uses the power mutex as a global serialisation
> > +      * lock to prevent any other request entering its timeline.
> > +      *
> > +      * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
> > +      *
> > +      * However, intel_gt_retire_request() does not know which engine
> > +      * it is retiring along and so cannot partake in the engine-pm
> > +      * barrier, and there we use the tl->active_count as a means to
> > +      * pin the timeline in the active_list while the locks are dropped.
> > +      * Ergo, as that is outside of the engine-pm barrier, we need to
> > +      * use atomic to manipulate tl->active_count.
> > +      */
> >       lockdep_assert_held(&tl->mutex);
> > -
> >       GEM_BUG_ON(!atomic_read(&tl->pin_count));
> > -     if (tl->active_count++)
> > +
> > +     if (atomic_add_unless(&tl->active_count, 1, 0))
> >               return;
> > -     GEM_BUG_ON(!tl->active_count); /* overflow? */
> >   
> >       spin_lock_irqsave(&timelines->lock, flags);
> > -     list_add(&tl->link, &timelines->active_list);
> > +     if (!atomic_fetch_inc(&tl->active_count))
> > +             list_add(&tl->link, &timelines->active_list);
> 
> So retirement raced with this and has elevated the active_count? But 
> retirement does not add the timeline to the list, so we exit here 
> without it on the active_list.

Retirement only sees an element on the active_list. What we observed in
practice was the inc/dec on tl->active_count racing, causing
indeterminate results, with the result that we removed the element from
the active_list while it had a raised tl->active_count (due to the
inflight posting from the other CPU).

Thus we kept requests inflight and the engine awake with no way to clear
them. This most obvious triggered GEM_BUG_ON(gt->awake) during suspend,
and is also responsible for the timeouts on gem_quiescent_gpu() or
igt_drop_caches_set(DROP_IDLE).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 04/19] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
@ 2019-11-19 14:50       ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 14:50 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-19 14:35:04)
> 
> On 18/11/2019 23:02, Chris Wilson wrote:
> > In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
> > the backend"), I erroneously concluded that we last modify the engine
> > inside __i915_request_commit() meaning that we could enable concurrent
> > submission for userspace as we enqueued this request. However, this
> > falls into a trap with other users of the engine->kernel_context waking
> > up and submitting their request before the idle-switch is queued, with
> > the result that the kernel_context is executed out-of-sequence most
> > likely upsetting the GPU and certainly ourselves when we try to retire
> > the out-of-sequence requests.
> > 
> > As such we need to hold onto the effective engine->kernel_context mutex
> > lock (via the engine pm mutex proxy) until we have finished queuing the
> > request to the engine.
> > 
> > v2: Serialise against concurrent intel_gt_retire_requests()
> > 
> > Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 15 +++++++++------
> >   1 file changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 3c0f490ff2c7..722d3eec226e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -75,6 +75,7 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
> >   
> >   static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >   {
> > +     struct intel_context *ce = engine->kernel_context;
> >       struct i915_request *rq;
> >       unsigned long flags;
> >       bool result = true;
> > @@ -99,15 +100,13 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >        * retiring the last request, thus all rings should be empty and
> >        * all timelines idle.
> >        */
> > -     flags = __timeline_mark_lock(engine->kernel_context);
> > +     flags = __timeline_mark_lock(ce);
> >   
> > -     rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
> > +     rq = __i915_request_create(ce, GFP_NOWAIT);
> >       if (IS_ERR(rq))
> >               /* Context switch failed, hope for the best! Maybe reset? */
> >               goto out_unlock;
> >   
> > -     intel_timeline_enter(i915_request_timeline(rq));
> > -
> >       /* Check again on the next retirement. */
> >       engine->wakeref_serial = engine->serial + 1;
> >       i915_request_add_active_barriers(rq);
> > @@ -116,13 +115,17 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >       rq->sched.attr.priority = I915_PRIORITY_BARRIER;
> >       __i915_request_commit(rq);
> >   
> > +     __i915_request_queue(rq, NULL);
> > +
> >       /* Release our exclusive hold on the engine */
> >       __intel_wakeref_defer_park(&engine->wakeref);
> > -     __i915_request_queue(rq, NULL);
> > +
> > +     /* And finally expose our selfselves to intel_gt_retire_requests() 
> 
> ourselves
> 
> */
> > +     intel_timeline_enter(ce->timeline);
> 
> I haven't really managed to follow this.
> 
> What are these other clients which can queue requests and what in this 
> block prevents them doing so?

There are 2 other parties and the GPU who have a stake in this.

A new gpu user will be waiting on the engine-pm to start their
engine_unpark. New waiters are predicated on engine->wakeref.count and
so intel_wakeref_defer_park() acts like a mutex_unlock of the
engine->wakeref.

The other party is intel_gt_retire_requests(), which is walking the list
of active timelines looking for completions.  Meanwhile as soon as we
call __i915_request_queue(), the GPU may complete our request. Ergo,
if we put ourselves on the timelines.active_list (intel_timeline_enter)
before we raise the wakeref.count, we may see the request completion and
retire it causing an underflow of the engine->wakeref.

> The change seems to be moving the queuing to before 
> __intel_wakeref_defer_park and intel_timeline_enter to last. Wakeref 
> defer extends the engine lifetime until the submitted rq is retired. But 
> how is that considered "unlocking"?

static inline int
intel_wakeref_get(struct intel_wakeref *wf)
{
        if (unlikely(!atomic_inc_not_zero(&wf->count)))
                return __intel_wakeref_get_first(wf);

        return 0;
}

As we build the switch_to_kernel_context(), wf->count is 0, and so all
new users will enter the slow path and take the mutex_lock(&wf->mutex).

As soon as we call intel_wakeref_defer_park(), we call
atomic_set_release(&wf->count, 1) and so all new users will take the
fast path and skip the mutex_lock. Hence why I connote it with being a
"mutex_unlock"
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 05/19] drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
@ 2019-11-19 14:54     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 14:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> In order to avoid some nasty mutex inversions, commit 09c5ab384f6f
> ("drm/i915: Keep rings pinned while the context is active") allowed the
> intel_ring unpinning to be run concurrently with the next context
> pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
> ordered in a nice onion with intel_ring_pin() so that the lifetimes
> overlapped and were always safe.
> 
> Sadly, a few steps in intel_ring_unpin() were overlooked, such as
> closing the read/write pointers of the ring and discarding the
> intel_ring.vaddr, as these steps were not serialised with
> intel_ring_pin() and so could leave the ring in disarray.
> 
> Fixes: 09c5ab384f6f ("drm/i915: Keep rings pinned while the context is active")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_ring.c | 13 ++++---------
>   1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> index ece20504d240..374b28f13ca0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> @@ -57,9 +57,10 @@ int intel_ring_pin(struct intel_ring *ring)
>   
>   	i915_vma_make_unshrinkable(vma);
>   
> -	GEM_BUG_ON(ring->vaddr);
> -	ring->vaddr = addr;
> +	/* Discard any unused bytes beyond that submitted to hw. */
> +	intel_ring_reset(ring, ring->emit);
>   
> +	ring->vaddr = addr;
>   	return 0;
>   
>   err_ring:
> @@ -85,20 +86,14 @@ void intel_ring_unpin(struct intel_ring *ring)
>   	if (!atomic_dec_and_test(&ring->pin_count))
>   		return;
>   
> -	/* Discard any unused bytes beyond that submitted to hw. */
> -	intel_ring_reset(ring, ring->emit);
> -
>   	i915_vma_unset_ggtt_write(vma);
>   	if (i915_vma_is_map_and_fenceable(vma))
>   		i915_vma_unpin_iomap(vma);
>   	else
>   		i915_gem_object_unpin_map(vma->obj);
>   
> -	GEM_BUG_ON(!ring->vaddr);
> -	ring->vaddr = NULL;
> -
> -	i915_vma_unpin(vma);
>   	i915_vma_make_purgeable(vma);
> +	i915_vma_unpin(vma);
>   }
>   
>   static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [Intel-gfx] [PATCH 05/19] drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
@ 2019-11-19 14:54     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 14:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> In order to avoid some nasty mutex inversions, commit 09c5ab384f6f
> ("drm/i915: Keep rings pinned while the context is active") allowed the
> intel_ring unpinning to be run concurrently with the next context
> pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
> ordered in a nice onion with intel_ring_pin() so that the lifetimes
> overlapped and were always safe.
> 
> Sadly, a few steps in intel_ring_unpin() were overlooked, such as
> closing the read/write pointers of the ring and discarding the
> intel_ring.vaddr, as these steps were not serialised with
> intel_ring_pin() and so could leave the ring in disarray.
> 
> Fixes: 09c5ab384f6f ("drm/i915: Keep rings pinned while the context is active")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_ring.c | 13 ++++---------
>   1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> index ece20504d240..374b28f13ca0 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> @@ -57,9 +57,10 @@ int intel_ring_pin(struct intel_ring *ring)
>   
>   	i915_vma_make_unshrinkable(vma);
>   
> -	GEM_BUG_ON(ring->vaddr);
> -	ring->vaddr = addr;
> +	/* Discard any unused bytes beyond that submitted to hw. */
> +	intel_ring_reset(ring, ring->emit);
>   
> +	ring->vaddr = addr;
>   	return 0;
>   
>   err_ring:
> @@ -85,20 +86,14 @@ void intel_ring_unpin(struct intel_ring *ring)
>   	if (!atomic_dec_and_test(&ring->pin_count))
>   		return;
>   
> -	/* Discard any unused bytes beyond that submitted to hw. */
> -	intel_ring_reset(ring, ring->emit);
> -
>   	i915_vma_unset_ggtt_write(vma);
>   	if (i915_vma_is_map_and_fenceable(vma))
>   		i915_vma_unpin_iomap(vma);
>   	else
>   		i915_gem_object_unpin_map(vma->obj);
>   
> -	GEM_BUG_ON(!ring->vaddr);
> -	ring->vaddr = NULL;
> -
> -	i915_vma_unpin(vma);
>   	i915_vma_make_purgeable(vma);
> +	i915_vma_unpin(vma);
>   }
>   
>   static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
@ 2019-11-19 15:03     ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 15:03 UTC (permalink / raw)
  To: intel-gfx

In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.

As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finished queuing the
request to the engine.

v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.

Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 31 ++++++++++++++++++-----
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 3c0f490ff2c7..a7240e2dd873 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -75,6 +75,7 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
+	struct intel_context *ce = engine->kernel_context;
 	struct i915_request *rq;
 	unsigned long flags;
 	bool result = true;
@@ -98,16 +99,30 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	 * This should hold true as we can only park the engine after
 	 * retiring the last request, thus all rings should be empty and
 	 * all timelines idle.
+	 *
+	 * For unlocking, there are 2 other parties and the GPU who have a
+	 * stake here.
+	 *
+	 * A new gpu user will be waiting on the engine-pm to start their
+	 * engine_unpark. New waiters are predicated on engine->wakeref.count
+	 * and so intel_wakeref_defer_park() acts like a mutex_unlock of the
+	 * engine->wakeref.
+	 *
+	 * The other party is intel_gt_retire_requests(), which is walking the
+	 * list of active timelines looking for completions. Meanwhile as soon
+	 * as we call __i915_request_queue(), the GPU may complete our request.
+	 * Ergo, if we put ourselves on the timelines.active_list
+	 * (se intel_timeline_enter()) before we increment the
+	 * engine->wakeref.count, we may see the request completion and retire
+	 * it causing an undeflow of the engine->wakeref.
 	 */
-	flags = __timeline_mark_lock(engine->kernel_context);
+	flags = __timeline_mark_lock(ce);
 
-	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
+	rq = __i915_request_create(ce, GFP_NOWAIT);
 	if (IS_ERR(rq))
 		/* Context switch failed, hope for the best! Maybe reset? */
 		goto out_unlock;
 
-	intel_timeline_enter(i915_request_timeline(rq));
-
 	/* Check again on the next retirement. */
 	engine->wakeref_serial = engine->serial + 1;
 	i915_request_add_active_barriers(rq);
@@ -116,13 +131,17 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 	__i915_request_commit(rq);
 
+	__i915_request_queue(rq, NULL);
+
 	/* Release our exclusive hold on the engine */
 	__intel_wakeref_defer_park(&engine->wakeref);
-	__i915_request_queue(rq, NULL);
+
+	/* And finally expose ourselves to intel_gt_retire_requests() */
+	intel_timeline_enter(ce->timeline);
 
 	result = false;
 out_unlock:
-	__timeline_mark_unlock(engine->kernel_context, flags);
+	__timeline_mark_unlock(ce, flags);
 	return result;
 }
 
-- 
2.24.0



* Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-19 15:04     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 15:04 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
> corruption WA") is that it disables RC6 while Skylake (and friends) is
> active, and we do not consider the GPU idle until all outstanding
> requests have been retired and the engine switched over to the kernel
> context. If userspace is idle, this task falls onto our background idle
> worker, which only runs roughly once a second, meaning that userspace has
> to have been idle for a couple of seconds before we enable RC6 again.
> Naturally, this causes us to consume considerably more energy than
> before as powersaving is effectively disabled while a display server
> (here's looking at you Xorg) is running.
> 
> As execlists will get a completion event as the last context is
> completed and the GPU goes idle, we can use our submission tasklet to
> notice when the GPU is idle and kick the retire worker. Thus during
> light workloads, we will do much more work to idle the GPU faster...
> Hopefully with commensurate power saving!
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=112315
> References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.h |  9 ++++++++-
>   drivers/gpu/drm/i915/gt/intel_lrc.c         | 13 +++++++++++++
>   2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> index fde546424c63..94b8758a45d9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
> @@ -7,7 +7,9 @@
>   #ifndef INTEL_GT_REQUESTS_H
>   #define INTEL_GT_REQUESTS_H
>   
> -struct intel_gt;
> +#include <linux/workqueue.h>
> +
> +#include "intel_gt_types.h"
>   
>   long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout);
>   static inline void intel_gt_retire_requests(struct intel_gt *gt)
> @@ -15,6 +17,11 @@ static inline void intel_gt_retire_requests(struct intel_gt *gt)
>   	intel_gt_retire_requests_timeout(gt, 0);
>   }
>   
> +static inline void intel_gt_schedule_retire_requests(struct intel_gt *gt)
> +{
> +	mod_delayed_work(system_wq, &gt->requests.retire_work, 0);
> +}
> +
>   int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
>   
>   void intel_gt_init_requests(struct intel_gt *gt);
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 33ce258d484f..f7c8fec436a9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -142,6 +142,7 @@
>   #include "intel_engine_pm.h"
>   #include "intel_gt.h"
>   #include "intel_gt_pm.h"
> +#include "intel_gt_requests.h"
>   #include "intel_lrc_reg.h"
>   #include "intel_mocs.h"
>   #include "intel_reset.h"
> @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
>   		if (timeout && preempt_timeout(engine))
>   			preempt_reset(engine);
>   	}
> +
> +	/*
> +	 * If the GPU is currently idle, retire the outstanding completed
> +	 * requests. This will allow us to enter soft-rc6 as soon as possible,
> +	 * albeit at the cost of running the retire worker much more frequently
> +	 * (over the entire GT not just this engine) and emitting more idle
> +	 * barriers (i.e. kernel context switches unpinning all that went
> +	 * before) which may add some extra latency.
> +	 */
> +	if (intel_engine_pm_is_awake(engine) &&
> +	    !execlists_active(&engine->execlists))
> +		intel_gt_schedule_retire_requests(engine->gt);

I am still not a fan of doing this for all platforms.

It's not just the cost of retirement; there is also 
intel_engine_flush_submission() on all engines in there, which we 
cannot avoid triggering from this path.

Would it be worth experimenting with additional per-engine retire 
workers? Most of the code could be shared, just a little bit of 
specialization to filter on engine.

Regards,

Tvrtko

>   }
>   
>   static void __execlists_kick(struct intel_engine_execlists *execlists)
> 


* Re: [PATCH 07/19] drm/i915: Mark up the calling context for intel_wakeref_put()
@ 2019-11-19 15:57     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 15:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> Previously, we assumed we could use mutex_trylock() within an atomic
> context, falling back to a worker if contended. However, such trickery
> is illegal inside interrupt context, and so we need to always use a
> worker under such circumstances. As we normally are in process context,
> we can typically use a plain mutex, and only defer to a worker when we
> know we are called from an interrupt path.
> 
> Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
> References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
> References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  3 +-
>   drivers/gpu/drm/i915/gt/intel_engine_pm.h    | 10 +++++
>   drivers/gpu/drm/i915/gt/intel_gt_pm.c        |  1 -
>   drivers/gpu/drm/i915/gt/intel_gt_pm.h        |  5 +++
>   drivers/gpu/drm/i915/gt/intel_lrc.c          |  2 +-
>   drivers/gpu/drm/i915/gt/intel_reset.c        |  2 +-
>   drivers/gpu/drm/i915/gt/selftest_engine_pm.c |  7 ++--
>   drivers/gpu/drm/i915/i915_active.c           |  5 ++-
>   drivers/gpu/drm/i915/i915_pmu.c              |  6 +--
>   drivers/gpu/drm/i915/intel_wakeref.c         |  8 ++--
>   drivers/gpu/drm/i915/intel_wakeref.h         | 44 ++++++++++++++++----
>   11 files changed, 68 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 722d3eec226e..4878d16176d5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -180,7 +180,8 @@ static int __engine_park(struct intel_wakeref *wf)
>   
>   	engine->execlists.no_priolist = false;
>   
> -	intel_gt_pm_put(engine->gt);
> +	/* While we call i915_vma_parked, we have to break the lock cycle */
> +	intel_gt_pm_put_async(engine->gt);
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
> index 739c50fefcef..467475fce9c6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
> @@ -31,6 +31,16 @@ static inline void intel_engine_pm_put(struct intel_engine_cs *engine)
>   	intel_wakeref_put(&engine->wakeref);
>   }
>   
> +static inline void intel_engine_pm_put_async(struct intel_engine_cs *engine)
> +{
> +	intel_wakeref_put_async(&engine->wakeref);
> +}
> +
> +static inline void intel_engine_pm_unlock_wait(struct intel_engine_cs *engine)
> +{
> +	intel_wakeref_unlock_wait(&engine->wakeref);
> +}
> +
>   void intel_engine_init__pm(struct intel_engine_cs *engine);
>   
>   #endif /* INTEL_ENGINE_PM_H */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> index e61f752a3cd5..7a9044ac4b75 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> @@ -105,7 +105,6 @@ static int __gt_park(struct intel_wakeref *wf)
>   static const struct intel_wakeref_ops wf_ops = {
>   	.get = __gt_unpark,
>   	.put = __gt_park,
> -	.flags = INTEL_WAKEREF_PUT_ASYNC,
>   };
>   
>   void intel_gt_pm_init_early(struct intel_gt *gt)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
> index b3e17399be9b..990efc27a4e4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
> @@ -32,6 +32,11 @@ static inline void intel_gt_pm_put(struct intel_gt *gt)
>   	intel_wakeref_put(&gt->wakeref);
>   }
>   
> +static inline void intel_gt_pm_put_async(struct intel_gt *gt)
> +{
> +	intel_wakeref_put_async(&gt->wakeref);
> +}
> +
>   static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
>   {
>   	return intel_wakeref_wait_for_idle(&gt->wakeref);
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index f7c8fec436a9..fec46afb9264 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1173,7 +1173,7 @@ __execlists_schedule_out(struct i915_request *rq,
>   
>   	intel_engine_context_out(engine);
>   	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> -	intel_gt_pm_put(engine->gt);
> +	intel_gt_pm_put_async(engine->gt);

Wait a minute.. wasn't the wakeref hierarchy supposed to be engine -> gt 
so this place should be on the engine level?

>   
>   	/*
>   	 * If this is part of a virtual engine, its next request may
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b7007cd78c6f..0388f9375366 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1125,7 +1125,7 @@ int intel_engine_reset(struct intel_engine_cs *engine, const char *msg)
>   out:
>   	intel_engine_cancel_stop_cs(engine);
>   	reset_finish_engine(engine);
> -	intel_engine_pm_put(engine);
> +	intel_engine_pm_put_async(engine);
>   	return ret;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> index 20b9c83f43ad..851a6c4f65c6 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> @@ -51,11 +51,12 @@ static int live_engine_pm(void *arg)
>   				pr_err("intel_engine_pm_get_if_awake(%s) failed under %s\n",
>   				       engine->name, p->name);
>   			else
> -				intel_engine_pm_put(engine);
> -			intel_engine_pm_put(engine);
> +				intel_engine_pm_put_async(engine);
> +			intel_engine_pm_put_async(engine);
>   			p->critical_section_end();
>   
> -			/* engine wakeref is sync (instant) */
> +			intel_engine_pm_unlock_wait(engine);

From the context, would it be clearer to name it 
intel_engine_pm_wait_put? sync_put? flush_put? That would more clearly 
represent that it is the pair of put_async.

> +
>   			if (intel_engine_pm_is_awake(engine)) {
>   				pr_err("%s is still awake after flushing pm\n",
>   				       engine->name);
> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> index 5448f37c8102..dca15ace88f6 100644
> --- a/drivers/gpu/drm/i915/i915_active.c
> +++ b/drivers/gpu/drm/i915/i915_active.c
> @@ -672,12 +672,13 @@ void i915_active_acquire_barrier(struct i915_active *ref)
>   	 * populated by i915_request_add_active_barriers() to point to the
>   	 * request that will eventually release them.
>   	 */
> -	spin_lock_irqsave_nested(&ref->tree_lock, flags, SINGLE_DEPTH_NESTING);
>   	llist_for_each_safe(pos, next, take_preallocated_barriers(ref)) {
>   		struct active_node *node = barrier_from_ll(pos);
>   		struct intel_engine_cs *engine = barrier_to_engine(node);
>   		struct rb_node **p, *parent;
>   
> +		spin_lock_irqsave_nested(&ref->tree_lock, flags,
> +					 SINGLE_DEPTH_NESTING);
>   		parent = NULL;
>   		p = &ref->tree.rb_node;
>   		while (*p) {
> @@ -693,12 +694,12 @@ void i915_active_acquire_barrier(struct i915_active *ref)
>   		}
>   		rb_link_node(&node->node, parent, p);
>   		rb_insert_color(&node->node, &ref->tree);
> +		spin_unlock_irqrestore(&ref->tree_lock, flags);
>   
>   		GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
>   		llist_add(barrier_to_ll(node), &engine->barrier_tasks);
>   		intel_engine_pm_put(engine);
>   	}
> -	spin_unlock_irqrestore(&ref->tree_lock, flags);

Pros/cons of leaving the locks where they were and using put_async, 
versus this layout?

>   }
>   
>   void i915_request_add_active_barriers(struct i915_request *rq)
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 9b02be0ad4e6..95e824a78d4d 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -190,7 +190,7 @@ static u64 get_rc6(struct intel_gt *gt)
>   	val = 0;
>   	if (intel_gt_pm_get_if_awake(gt)) {
>   		val = __get_rc6(gt);
> -		intel_gt_pm_put(gt);
> +		intel_gt_pm_put_async(gt);
>   	}
>   
>   	spin_lock_irqsave(&pmu->lock, flags);
> @@ -360,7 +360,7 @@ engines_sample(struct intel_gt *gt, unsigned int period_ns)
>   skip:
>   		if (unlikely(mmio_lock))
>   			spin_unlock_irqrestore(mmio_lock, flags);
> -		intel_engine_pm_put(engine);
> +		intel_engine_pm_put_async(engine);
>   	}
>   }
>   
> @@ -398,7 +398,7 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns)
>   			if (stat)
>   				val = intel_get_cagf(rps, stat);
>   
> -			intel_gt_pm_put(gt);
> +			intel_gt_pm_put_async(gt);
>   		}
>   
>   		add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
> index 868cc78048d0..9b29176cc1ca 100644
> --- a/drivers/gpu/drm/i915/intel_wakeref.c
> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> @@ -54,7 +54,8 @@ int __intel_wakeref_get_first(struct intel_wakeref *wf)
>   
>   static void ____intel_wakeref_put_last(struct intel_wakeref *wf)
>   {
> -	if (!atomic_dec_and_test(&wf->count))
> +	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count) <= 0);
> +	if (unlikely(!atomic_dec_and_test(&wf->count)))
>   		goto unlock;
>   
>   	/* ops->put() must reschedule its own release on error/deferral */
> @@ -67,13 +68,12 @@ static void ____intel_wakeref_put_last(struct intel_wakeref *wf)
>   	mutex_unlock(&wf->mutex);
>   }
>   
> -void __intel_wakeref_put_last(struct intel_wakeref *wf)
> +void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
>   {
>   	INTEL_WAKEREF_BUG_ON(work_pending(&wf->work));
>   
>   	/* Assume we are not in process context and so cannot sleep. */
> -	if (wf->ops->flags & INTEL_WAKEREF_PUT_ASYNC ||
> -	    !mutex_trylock(&wf->mutex)) {
> +	if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
>   		schedule_work(&wf->work);
>   		return;
>   	}
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
> index 5f0c972a80fb..688b9b536a69 100644
> --- a/drivers/gpu/drm/i915/intel_wakeref.h
> +++ b/drivers/gpu/drm/i915/intel_wakeref.h
> @@ -9,6 +9,7 @@
>   
>   #include <linux/atomic.h>
>   #include <linux/bits.h>
> +#include <linux/lockdep.h>
>   #include <linux/mutex.h>
>   #include <linux/refcount.h>
>   #include <linux/stackdepot.h>
> @@ -29,9 +30,6 @@ typedef depot_stack_handle_t intel_wakeref_t;
>   struct intel_wakeref_ops {
>   	int (*get)(struct intel_wakeref *wf);
>   	int (*put)(struct intel_wakeref *wf);
> -
> -	unsigned long flags;
> -#define INTEL_WAKEREF_PUT_ASYNC BIT(0)
>   };
>   
>   struct intel_wakeref {
> @@ -57,7 +55,7 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
>   } while (0)
>   
>   int __intel_wakeref_get_first(struct intel_wakeref *wf);
> -void __intel_wakeref_put_last(struct intel_wakeref *wf);
> +void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags);
>   
>   /**
>    * intel_wakeref_get: Acquire the wakeref
> @@ -100,10 +98,9 @@ intel_wakeref_get_if_active(struct intel_wakeref *wf)
>   }
>   
>   /**
> - * intel_wakeref_put: Release the wakeref
> - * @i915: the drm_i915_private device
> + * intel_wakeref_put_flags: Release the wakeref
>    * @wf: the wakeref
> - * @fn: callback for releasing the wakeref, called only on final release.
> + * @flags: control flags
>    *
>    * Release our hold on the wakeref. When there are no more users,
>    * the runtime pm wakeref will be released after the @fn callback is called
> @@ -116,11 +113,25 @@ intel_wakeref_get_if_active(struct intel_wakeref *wf)
>    * code otherwise.
>    */
>   static inline void
> -intel_wakeref_put(struct intel_wakeref *wf)
> +__intel_wakeref_put(struct intel_wakeref *wf, unsigned long flags)
> +#define INTEL_WAKEREF_PUT_ASYNC BIT(0)
>   {
>   	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count) <= 0);
>   	if (unlikely(!atomic_add_unless(&wf->count, -1, 1)))
> -		__intel_wakeref_put_last(wf);
> +		__intel_wakeref_put_last(wf, flags);
> +}
> +
> +static inline void
> +intel_wakeref_put(struct intel_wakeref *wf)
> +{
> +	might_sleep();
> +	__intel_wakeref_put(wf, 0);
> +}
> +
> +static inline void
> +intel_wakeref_put_async(struct intel_wakeref *wf)
> +{
> +	__intel_wakeref_put(wf, INTEL_WAKEREF_PUT_ASYNC);
>   }
>   
>   /**
> @@ -151,6 +162,20 @@ intel_wakeref_unlock(struct intel_wakeref *wf)
>   	mutex_unlock(&wf->mutex);
>   }
>   
> +/**
> + * intel_wakeref_unlock_wait: Wait until the active callback is complete
> + * @wf: the wakeref
> + *
> + * Waits until the active callback (under the @wf->mutex) has completed.
> + */
> +static inline void
> +intel_wakeref_unlock_wait(struct intel_wakeref *wf)
> +{
> +	mutex_lock(&wf->mutex);
> +	mutex_unlock(&wf->mutex);
> +	flush_work(&wf->work);
> +}
> +
>   /**
>    * intel_wakeref_is_active: Query whether the wakeref is currently held
>    * @wf: the wakeref
> @@ -170,6 +195,7 @@ intel_wakeref_is_active(const struct intel_wakeref *wf)
>   static inline void
>   __intel_wakeref_defer_park(struct intel_wakeref *wf)
>   {
> +	lockdep_assert_held(&wf->mutex);
>   	INTEL_WAKEREF_BUG_ON(atomic_read(&wf->count));
>   	atomic_set_release(&wf->count, 1);
>   }
> 
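For illustration, the put()/put_async() split quoted above can be modelled in plain userspace C. Everything here is a hypothetical analogue, not the driver's code: the `work_scheduled` flag stands in for schedule_work(&wf->work), and "parking" stands in for the real ops->put() release path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define WAKEREF_PUT_ASYNC (1UL << 0)

struct wakeref {
	atomic_int count;
	bool parked;         /* set by the final, inline release */
	bool work_scheduled; /* stands in for schedule_work(&wf->work) */
};

static void wakeref_put_last(struct wakeref *wf, unsigned long flags)
{
	if (flags & WAKEREF_PUT_ASYNC) {
		/* atomic context: we may not sleep, so punt to a worker */
		wf->work_scheduled = true;
		return;
	}
	/* process context: safe to take the mutex and release inline */
	wf->parked = true;
}

static void wakeref_put(struct wakeref *wf, unsigned long flags)
{
	/* only the 1 -> 0 transition does the release work */
	if (atomic_fetch_sub(&wf->count, 1) == 1)
		wakeref_put_last(wf, flags);
}

static void wakeref_put_sync(struct wakeref *wf)
{
	wakeref_put(wf, 0);
}

static void wakeref_put_async(struct wakeref *wf)
{
	wakeref_put(wf, WAKEREF_PUT_ASYNC);
}
```

Either flavour drops the count locklessly; only the last put chooses between releasing inline (process context) or deferring to the worker (atomic context), which is exactly the choice the patch pushes out to the callers.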

Regards,

Tvrtko


* Re: [PATCH 12/19] drm/i915/gt: Declare timeline.lock to be irq-free
@ 2019-11-19 15:58     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 15:58 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> Now that we never allow the intel_wakeref callbacks to be invoked from
> interrupt context, we do not need the irqsafe spinlock for the timeline.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c |  9 ++++-----
>   drivers/gpu/drm/i915/gt/intel_reset.c       |  9 ++++-----
>   drivers/gpu/drm/i915/gt/intel_timeline.c    | 10 ++++------
>   3 files changed, 12 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index 7559d6373f49..74356db43325 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -33,7 +33,6 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   {
>   	struct intel_gt_timelines *timelines = &gt->timelines;
>   	struct intel_timeline *tl, *tn;
> -	unsigned long flags;
>   	bool interruptible;
>   	LIST_HEAD(free);
>   
> @@ -43,7 +42,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   
>   	flush_submission(gt); /* kick the ksoftirqd tasklets */
>   
> -	spin_lock_irqsave(&timelines->lock, flags);
> +	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
>   		if (!mutex_trylock(&tl->mutex))
>   			continue;
> @@ -51,7 +50,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   		intel_timeline_get(tl);
>   		GEM_BUG_ON(!atomic_read(&tl->active_count));
>   		atomic_inc(&tl->active_count); /* pin the list element */
> -		spin_unlock_irqrestore(&timelines->lock, flags);
> +		spin_unlock(&timelines->lock);
>   
>   		if (timeout > 0) {
>   			struct dma_fence *fence;
> @@ -67,7 +66,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   
>   		retire_requests(tl);
>   
> -		spin_lock_irqsave(&timelines->lock, flags);
> +		spin_lock(&timelines->lock);
>   
>   		/* Resume iteration after dropping lock */
>   		list_safe_reset_next(tl, tn, link);
> @@ -82,7 +81,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>   			list_add(&tl->link, &free);
>   		}
>   	}
> -	spin_unlock_irqrestore(&timelines->lock, flags);
> +	spin_unlock(&timelines->lock);
>   
>   	list_for_each_entry_safe(tl, tn, &free, link)
>   		__intel_timeline_free(&tl->kref);
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 0388f9375366..36189238e13c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -831,7 +831,6 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   {
>   	struct intel_gt_timelines *timelines = &gt->timelines;
>   	struct intel_timeline *tl;
> -	unsigned long flags;
>   	bool ok;
>   
>   	if (!test_bit(I915_WEDGED, &gt->reset.flags))
> @@ -853,7 +852,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   	 *
>   	 * No more can be submitted until we reset the wedged bit.
>   	 */
> -	spin_lock_irqsave(&timelines->lock, flags);
> +	spin_lock(&timelines->lock);
>   	list_for_each_entry(tl, &timelines->active_list, link) {
>   		struct dma_fence *fence;
>   
> @@ -861,7 +860,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   		if (!fence)
>   			continue;
>   
> -		spin_unlock_irqrestore(&timelines->lock, flags);
> +		spin_unlock(&timelines->lock);
>   
>   		/*
>   		 * All internal dependencies (i915_requests) will have
> @@ -874,10 +873,10 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   		dma_fence_put(fence);
>   
>   		/* Restart iteration after dropping lock */
> -		spin_lock_irqsave(&timelines->lock, flags);
> +		spin_lock(&timelines->lock);
>   		tl = list_entry(&timelines->active_list, typeof(*tl), link);
>   	}
> -	spin_unlock_irqrestore(&timelines->lock, flags);
> +	spin_unlock(&timelines->lock);
>   
>   	/* We must reset pending GPU events before restoring our submission */
>   	ok = !HAS_EXECLISTS(gt->i915); /* XXX better agnosticism desired */
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> index 4f914f0d5eab..bd973d950064 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> @@ -332,7 +332,6 @@ int intel_timeline_pin(struct intel_timeline *tl)
>   void intel_timeline_enter(struct intel_timeline *tl)
>   {
>   	struct intel_gt_timelines *timelines = &tl->gt->timelines;
> -	unsigned long flags;
>   
>   	/*
>   	 * Pretend we are serialised by the timeline->mutex.
> @@ -358,16 +357,15 @@ void intel_timeline_enter(struct intel_timeline *tl)
>   	if (atomic_add_unless(&tl->active_count, 1, 0))
>   		return;
>   
> -	spin_lock_irqsave(&timelines->lock, flags);
> +	spin_lock(&timelines->lock);
>   	if (!atomic_fetch_inc(&tl->active_count))
>   		list_add(&tl->link, &timelines->active_list);
> -	spin_unlock_irqrestore(&timelines->lock, flags);
> +	spin_unlock(&timelines->lock);
>   }
>   
>   void intel_timeline_exit(struct intel_timeline *tl)
>   {
>   	struct intel_gt_timelines *timelines = &tl->gt->timelines;
> -	unsigned long flags;
>   
>   	/* See intel_timeline_enter() */
>   	lockdep_assert_held(&tl->mutex);
> @@ -376,10 +374,10 @@ void intel_timeline_exit(struct intel_timeline *tl)
>   	if (atomic_add_unless(&tl->active_count, -1, 1))
>   		return;
>   
> -	spin_lock_irqsave(&timelines->lock, flags);
> +	spin_lock(&timelines->lock);
>   	if (atomic_dec_and_test(&tl->active_count))
>   		list_del(&tl->link);
> -	spin_unlock_irqrestore(&timelines->lock, flags);
> +	spin_unlock(&timelines->lock);
>   
>   	/*
>   	 * Since this timeline is idle, all barriers upon which we were waiting
> 
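The intel_timeline_enter()/exit() paths above also show a reusable pattern: adjust active_count locklessly, and take the list lock only on the 0 -> 1 and 1 -> 0 transitions that change list membership. A hedged userspace sketch (hypothetical names; a pthread mutex standing in for timelines->lock, a boolean standing in for list_add()/list_del()):

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct timeline {
	atomic_int active_count;
	bool on_list; /* stands in for membership on active_list */
};

static pthread_mutex_t timelines_lock = PTHREAD_MUTEX_INITIALIZER;

/* atomic_add_unless(v, a, u): add a unless *v == u; true if added */
static bool atomic_add_unless(atomic_int *v, int a, int u)
{
	int c = atomic_load(v);

	while (c != u) {
		if (atomic_compare_exchange_weak(v, &c, c + a))
			return true;
	}
	return false;
}

static void timeline_enter(struct timeline *tl)
{
	if (atomic_add_unless(&tl->active_count, 1, 0))
		return; /* already active: lockless fast path */

	pthread_mutex_lock(&timelines_lock);
	if (atomic_fetch_add(&tl->active_count, 1) == 0)
		tl->on_list = true; /* 0 -> 1: add to the list */
	pthread_mutex_unlock(&timelines_lock);
}

static void timeline_exit(struct timeline *tl)
{
	if (atomic_add_unless(&tl->active_count, -1, 1))
		return; /* not the last reference: lockless fast path */

	pthread_mutex_lock(&timelines_lock);
	if (atomic_fetch_sub(&tl->active_count, 1) == 1)
		tl->on_list = false; /* 1 -> 0: remove from the list */
	pthread_mutex_unlock(&timelines_lock);
}
```

Because the slow paths run strictly in process context once the patch series lands, a plain (non-irqsave) lock suffices there, which is the point of this patch.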

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [PATCH 13/19] drm/i915/gt: Move new timelines to the end of active_list
@ 2019-11-19 16:02     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 16:02 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> When adding a new active timeline, place it at the end of the list. This
> allows for intel_gt_retire_requests() to pick up the newcomer more
> quickly and hopefully complete the retirement sooner.
> 
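A minimal sketch of the one-line change, using a simplified list_head modelled on the kernel's (the struct timeline wrapper and ids are illustrative only): with list_add() the newest element sits immediately after the head, so a head-to-tail walk visits it first; with list_add_tail() it is appended, so the walk sweeps the older timelines first and reaches the newcomer last.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *prev, *next; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h)
{
	h->prev = h->next = h;
}

static void __list_insert(struct list_head *n, struct list_head *prev,
			  struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* list_add(): push right after the head (stack order, newest first) */
static void list_add(struct list_head *n, struct list_head *h)
{
	__list_insert(n, h, h->next);
}

/* list_add_tail(): insert before the head (queue order, newest last) */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	__list_insert(n, h->prev, h);
}

struct timeline { int id; struct list_head link; };
```

list_for_each-style iteration starts at head->next, so the switch changes only where the freshly activated timeline appears in the retirement walk, not the cost of the walk itself.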
> References: 7936a22dd466 ("drm/i915/gt: Wait for new requests in intel_gt_retire_requests()")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_timeline.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> index bd973d950064..b190a5d9ab02 100644
> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> @@ -359,7 +359,7 @@ void intel_timeline_enter(struct intel_timeline *tl)
>   
>   	spin_lock(&timelines->lock);
>   	if (!atomic_fetch_inc(&tl->active_count))
> -		list_add(&tl->link, &timelines->active_list);
> +		list_add_tail(&tl->link, &timelines->active_list);
>   	spin_unlock(&timelines->lock);
>   }
>   
> 

If I am not missing something, this should be on the micro-optimisation 
level, well, mini-optimisation. For instance, previously it could wait on 
the most recent request and, when that finishes, do mostly signalled 
checks, or even less. With the change it would first sweep the already 
completed ones and then wait for the most recent one. Nevertheless, I 
don't see a problem with it, so:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [PATCH 14/19] drm/i915/gt: Schedule next retirement worker first
@ 2019-11-19 16:07     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 16:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> As we may park the gt during request retirement, we may cancel the
> retirement worker only to then program the delayed worker once more.
> 
> If we schedule the next delayed retirement worker first, then if we park
> the gt, the work remains cancelled [until we unpark].
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index 74356db43325..4dc3cbeb1b36 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -109,9 +109,9 @@ static void retire_work_handler(struct work_struct *work)
>   	struct intel_gt *gt =
>   		container_of(work, typeof(*gt), requests.retire_work.work);
>   
> -	intel_gt_retire_requests(gt);
>   	schedule_delayed_work(&gt->requests.retire_work,
>   			      round_jiffies_up_relative(HZ));
> +	intel_gt_retire_requests(gt);
>   }
>   
>   void intel_gt_init_requests(struct intel_gt *gt)


Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 15/19] drm/i915/gt: Flush the requests after wedging on suspend
@ 2019-11-19 16:12     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 16:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> Retire all requests if we resort to wedging the driver on suspend. They
> will now be idle, so we might as well free them before shutting down.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_pm.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> index 7a9044ac4b75..f6b5169d623f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> @@ -256,6 +256,7 @@ static void wait_for_suspend(struct intel_gt *gt)
>   		 * the gpu quiet.
>   		 */
>   		intel_gt_set_wedged(gt);
> +		intel_gt_retire_requests(gt);
>   	}
>   
>   	intel_gt_pm_wait_for_idle(gt);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Or, given that parking is now sync, it could work to make 
intel_gt_park_requests flush, and then intel_gt_pm_wait_for_idle would 
handle it. But that would keep the GPU alive for too long, given that 
request retire can run independently. So perhaps this is better.

Regards,

Tvrtko

* Re: [PATCH 07/19] drm/i915: Mark up the calling context for intel_wakeref_put()
@ 2019-11-19 16:12       ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 16:12 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-19 15:57:24)
> 
> On 18/11/2019 23:02, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index f7c8fec436a9..fec46afb9264 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -1173,7 +1173,7 @@ __execlists_schedule_out(struct i915_request *rq,
> >   
> >       intel_engine_context_out(engine);
> >       execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
> > -     intel_gt_pm_put(engine->gt);
> > +     intel_gt_pm_put_async(engine->gt);
> 
> Wait a minute.. wasn't the wakeref hierarchy supposed to be engine -> gt 
> so this place should be on the engine level?

Ssh. I lied.

We skip the engine-pm here for the CS interrupts so that we are not kept
waiting to call engine_park().

The excuse is that this wakeref is for the GT interrupt, not the engine :)

> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index b7007cd78c6f..0388f9375366 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -1125,7 +1125,7 @@ int intel_engine_reset(struct intel_engine_cs *engine, const char *msg)
> >   out:
> >       intel_engine_cancel_stop_cs(engine);
> >       reset_finish_engine(engine);
> > -     intel_engine_pm_put(engine);
> > +     intel_engine_pm_put_async(engine);
> >       return ret;
> >   }
> >   
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > index 20b9c83f43ad..851a6c4f65c6 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> > @@ -51,11 +51,12 @@ static int live_engine_pm(void *arg)
> >                               pr_err("intel_engine_pm_get_if_awake(%s) failed under %s\n",
> >                                      engine->name, p->name);
> >                       else
> > -                             intel_engine_pm_put(engine);
> > -                     intel_engine_pm_put(engine);
> > +                             intel_engine_pm_put_async(engine);
> > +                     intel_engine_pm_put_async(engine);
> >                       p->critical_section_end();
> >   
> > -                     /* engine wakeref is sync (instant) */
> > +                     intel_engine_pm_unlock_wait(engine);
> 
>  From the context would it be clearer to name it 
> intel_engine_pm_wait_put? sync_put? flush_put? To more clearly represent 
> it is a pair of put_async.

Possibly, I am in mourning for spin_unlock_wait() and will keep on
protesting its demise!

intel_engine_pm_flush() is perhaps the clearest description in line with
say flush_work().

> > diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> > index 5448f37c8102..dca15ace88f6 100644
> > --- a/drivers/gpu/drm/i915/i915_active.c
> > +++ b/drivers/gpu/drm/i915/i915_active.c
> > @@ -672,12 +672,13 @@ void i915_active_acquire_barrier(struct i915_active *ref)
> >        * populated by i915_request_add_active_barriers() to point to the
> >        * request that will eventually release them.
> >        */
> > -     spin_lock_irqsave_nested(&ref->tree_lock, flags, SINGLE_DEPTH_NESTING);
> >       llist_for_each_safe(pos, next, take_preallocated_barriers(ref)) {
> >               struct active_node *node = barrier_from_ll(pos);
> >               struct intel_engine_cs *engine = barrier_to_engine(node);
> >               struct rb_node **p, *parent;
> >   
> > +             spin_lock_irqsave_nested(&ref->tree_lock, flags,
> > +                                      SINGLE_DEPTH_NESTING);
> >               parent = NULL;
> >               p = &ref->tree.rb_node;
> >               while (*p) {
> > @@ -693,12 +694,12 @@ void i915_active_acquire_barrier(struct i915_active *ref)
> >               }
> >               rb_link_node(&node->node, parent, p);
> >               rb_insert_color(&node->node, &ref->tree);
> > +             spin_unlock_irqrestore(&ref->tree_lock, flags);
> >   
> >               GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
> >               llist_add(barrier_to_ll(node), &engine->barrier_tasks);
> >               intel_engine_pm_put(engine);
> >       }
> > -     spin_unlock_irqrestore(&ref->tree_lock, flags);
> 
> Pros/cons of leaving the locks where they were and using put_async, 
> versus this layout?

Usually just a single engine in the list (only for virtual engines will
there be more) so we save the worker invocation at typically no cost.
Thus getting into the engine_park() earlier while the GPU is still warm.

That and I'm still smarting from RT demanding all spin_lock_irq to be
reviewed and tightened.
-Chris

* Re: [PATCH 11/19] drm/i915: Wait until the intel_wakeref idle callback is complete
@ 2019-11-19 16:15     ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 16:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/11/2019 23:02, Chris Wilson wrote:
> When waiting for idle, serialise with any ongoing callback so that it
> will have completed before completing the wait.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_wakeref.c | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
> index 9b29176cc1ca..91feb53b2942 100644
> --- a/drivers/gpu/drm/i915/intel_wakeref.c
> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> @@ -109,8 +109,15 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
>   
>   int intel_wakeref_wait_for_idle(struct intel_wakeref *wf)
>   {
> -	return wait_var_event_killable(&wf->wakeref,
> -				       !intel_wakeref_is_active(wf));
> +	int err;
> +
> +	err = wait_var_event_killable(&wf->wakeref,
> +				      !intel_wakeref_is_active(wf));
> +	if (err)
> +		return err;
> +
> +	intel_wakeref_unlock_wait(wf);
> +	return 0;
>   }
>   
>   static void wakeref_auto_timeout(struct timer_list *t)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-19 16:20       ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 16:20 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-19 15:04:46)
> 
> On 18/11/2019 23:02, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 33ce258d484f..f7c8fec436a9 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -142,6 +142,7 @@
> >   #include "intel_engine_pm.h"
> >   #include "intel_gt.h"
> >   #include "intel_gt_pm.h"
> > +#include "intel_gt_requests.h"
> >   #include "intel_lrc_reg.h"
> >   #include "intel_mocs.h"
> >   #include "intel_reset.h"
> > @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
> >               if (timeout && preempt_timeout(engine))
> >                       preempt_reset(engine);
> >       }
> > +
> > +     /*
> > +      * If the GPU is currently idle, retire the outstanding completed
> > +      * requests. This will allow us to enter soft-rc6 as soon as possible,
> > +      * albeit at the cost of running the retire worker much more frequently
> > +      * (over the entire GT not just this engine) and emitting more idle
> > +      * barriers (i.e. kernel context switches unpinning all that went
> > +      * before) which may add some extra latency.
> > +      */
> > +     if (intel_engine_pm_is_awake(engine) &&
> > +         !execlists_active(&engine->execlists))
> > +             intel_gt_schedule_retire_requests(engine->gt);
> 
> I am still not a fan of doing this for all platforms.

I understand. I think it makes a fair amount of sense to do early
retires, and wish to pursue that if I can show there is no harm.
 
> Its not just the cost of retirement but there is 
> intel_engine_flush_submission on all engines in there as well which we 
> cannot avoid triggering from this path.
> 
> Would it be worth experimenting with additional per-engine retire 
> workers? Most of the code could be shared, just a little bit of 
> specialization to filter on engine.

I haven't sketched out anything more than peeking at the last request on
the timeline and doing a rq->engine == engine filter. Walking the global
timeline.active_list in that case is also a nuisance.

There's definitely scope here for us using some more information from
process_csb() about which context completed and limit work to that
timeline. Hmm, something along those lines maybe...
-Chris

* Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-19 16:33         ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 16:33 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 19/11/2019 16:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-19 15:04:46)
>>
>> On 18/11/2019 23:02, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index 33ce258d484f..f7c8fec436a9 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -142,6 +142,7 @@
>>>    #include "intel_engine_pm.h"
>>>    #include "intel_gt.h"
>>>    #include "intel_gt_pm.h"
>>> +#include "intel_gt_requests.h"
>>>    #include "intel_lrc_reg.h"
>>>    #include "intel_mocs.h"
>>>    #include "intel_reset.h"
>>> @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
>>>                if (timeout && preempt_timeout(engine))
>>>                        preempt_reset(engine);
>>>        }
>>> +
>>> +     /*
>>> +      * If the GPU is currently idle, retire the outstanding completed
>>> +      * requests. This will allow us to enter soft-rc6 as soon as possible,
>>> +      * albeit at the cost of running the retire worker much more frequently
>>> +      * (over the entire GT not just this engine) and emitting more idle
>>> +      * barriers (i.e. kernel context switches unpinning all that went
>>> +      * before) which may add some extra latency.
>>> +      */
>>> +     if (intel_engine_pm_is_awake(engine) &&
>>> +         !execlists_active(&engine->execlists))
>>> +             intel_gt_schedule_retire_requests(engine->gt);
>>
>> I am still not a fan of doing this for all platforms.
> 
> I understand. I think it makes a fair amount of sense to do early
> retires, and wish to pursue that if I can show there is no harm.

It's also a bit of a layering problem.

>> Its not just the cost of retirement but there is
>> intel_engine_flush_submission on all engines in there as well which we
>> cannot avoid triggering from this path.
>>
>> Would it be worth experimenting with additional per-engine retire
>> workers? Most of the code could be shared, just a little bit of
>> specialization to filter on engine.
> 
> I haven't sketched out anything more than peeking at the last request on
> the timeline and doing a rq->engine == engine filter. Walking the global
> timeline.active_list in that case is also a nuisance.

That together with:

	flush_submission(gt, engine ? engine->mask : ALL_ENGINES);

Might be enough? At least to satisfy my concern.

Apart from that, the layering is still bad. And I'd still limit it to 
when the RC6 WA is active, unless it can be shown there is no perf/power 
impact across GPU/CPU to do this everywhere.

At which point it becomes easier to just limit it because we have to 
have it there.

I also wonder if the current flush_submission wasn't the reason for the 
performance regression you were seeing with this? It makes this tasklet 
wait for all other engines, if they are busy. But not sure.. perhaps it 
is work which would be done anyway.

> There's definitely scope here for us using some more information from
> process_csb() about which context completed and limit work to that
> timeline. Hmm, something along those lines maybe...

But you want to retire all timelines which have work on this particular 
physical engine. Otherwise it doesn't get parked, no?

Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-19 16:33         ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-19 16:33 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 19/11/2019 16:20, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-19 15:04:46)
>>
>> On 18/11/2019 23:02, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index 33ce258d484f..f7c8fec436a9 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -142,6 +142,7 @@
>>>    #include "intel_engine_pm.h"
>>>    #include "intel_gt.h"
>>>    #include "intel_gt_pm.h"
>>> +#include "intel_gt_requests.h"
>>>    #include "intel_lrc_reg.h"
>>>    #include "intel_mocs.h"
>>>    #include "intel_reset.h"
>>> @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
>>>                if (timeout && preempt_timeout(engine))
>>>                        preempt_reset(engine);
>>>        }
>>> +
>>> +     /*
>>> +      * If the GPU is currently idle, retire the outstanding completed
>>> +      * requests. This will allow us to enter soft-rc6 as soon as possible,
>>> +      * albeit at the cost of running the retire worker much more frequently
>>> +      * (over the entire GT not just this engine) and emitting more idle
>>> +      * barriers (i.e. kernel context switches unpinning all that went
>>> +      * before) which may add some extra latency.
>>> +      */
>>> +     if (intel_engine_pm_is_awake(engine) &&
>>> +         !execlists_active(&engine->execlists))
>>> +             intel_gt_schedule_retire_requests(engine->gt);
>>
>> I am still not a fan of doing this for all platforms.
> 
> I understand. I think it makes a fair amount of sense to do early
> retires, and wish to pursue that if I can show there is no harm.

It's also a bit of a layering problem.

>> Its not just the cost of retirement but there is
>> intel_engine_flush_submission on all engines in there as well which we
>> cannot avoid triggering from this path.
>>
>> Would it be worth experimenting with additional per-engine retire
>> workers? Most of the code could be shared, just a little bit of
>> specialization to filter on engine.
> 
> I haven't sketched out anything more than peeking at the last request on
> the timeline and doing a rq->engine == engine filter. Walking the global
> timeline.active_list in that case is also a nuisance.

That together with:

	flush_submission(gt, engine ? engine->mask : ALL_ENGINES);

Might be enough? At least to satisfy my concern.

Apart from that, the layering is still bad. And I'd still limit it to when the
RC6 WA is active, unless it can be shown there is no perf/power impact across
GPU/CPU to do this everywhere.

At which point it becomes easier to just limit it because we have to 
have it there.

I also wonder if the current flush_submission wasn't the reason for the
performance regression you were seeing with this? It makes this tasklet 
wait for all other engines, if they are busy. But not sure.. perhaps it 
is work which would be done anyway.

> There's definitely scope here for us using some more information from
> process_csb() about which context completed and limit work to that
> timeline. Hmm, something along those lines maybe...

But you want to retire all timelines which have work on this particular 
physical engine. Otherwise it doesn't get parked, no?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-19 16:42           ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 16:42 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-19 16:33:18)
> 
> On 19/11/2019 16:20, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-19 15:04:46)
> >>
> >> On 18/11/2019 23:02, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> index 33ce258d484f..f7c8fec436a9 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> @@ -142,6 +142,7 @@
> >>>    #include "intel_engine_pm.h"
> >>>    #include "intel_gt.h"
> >>>    #include "intel_gt_pm.h"
> >>> +#include "intel_gt_requests.h"
> >>>    #include "intel_lrc_reg.h"
> >>>    #include "intel_mocs.h"
> >>>    #include "intel_reset.h"
> >>> @@ -2278,6 +2279,18 @@ static void execlists_submission_tasklet(unsigned long data)
> >>>                if (timeout && preempt_timeout(engine))
> >>>                        preempt_reset(engine);
> >>>        }
> >>> +
> >>> +     /*
> >>> +      * If the GPU is currently idle, retire the outstanding completed
> >>> +      * requests. This will allow us to enter soft-rc6 as soon as possible,
> >>> +      * albeit at the cost of running the retire worker much more frequently
> >>> +      * (over the entire GT not just this engine) and emitting more idle
> >>> +      * barriers (i.e. kernel context switches unpinning all that went
> >>> +      * before) which may add some extra latency.
> >>> +      */
> >>> +     if (intel_engine_pm_is_awake(engine) &&
> >>> +         !execlists_active(&engine->execlists))
> >>> +             intel_gt_schedule_retire_requests(engine->gt);
> >>
> >> I am still not a fan of doing this for all platforms.
> > 
> > I understand. I think it makes a fair amount of sense to do early
> > retires, and wish to pursue that if I can show there is no harm.
> 
> It's also a bit of a layering problem.

Them's fighting words! :)
 
> >> Its not just the cost of retirement but there is
> >> intel_engine_flush_submission on all engines in there as well which we
> >> cannot avoid triggering from this path.
> >>
> >> Would it be worth experimenting with additional per-engine retire
> >> workers? Most of the code could be shared, just a little bit of
> >> specialization to filter on engine.
> > 
> > I haven't sketched out anything more than peeking at the last request on
> > the timeline and doing a rq->engine == engine filter. Walking the global
> > timeline.active_list in that case is also a nuisance.
> 
> That together with:
> 
>         flush_submission(gt, engine ? engine->mask : ALL_ENGINES);
> 
> Might be enough? At least to satisfy my concern.

Aye, flushing all the others when we know we only care about being idle is
definitely a weak point of the current scheme.

> Apart layering is still bad.. And I'd still limit it to when RC6 WA is 
> active unless it can be shown there is no perf/power impact across 
> GPU/CPU to do this everywhere.

Bah, keep tuning until it's a win for everyone!
 
> At which point it becomes easier to just limit it because we have to 
> have it there.
> 
> I also wonder if the current flush_submission wasn't the reason for 
> performance regression you were seeing with this? It makes this tasklet 
> wait for all other engines, if they are busy. But not sure.. perhaps it 
> is work which would be done anyway.

I haven't finished yet; but the baseline took a big nose dive so it
might be enough to hide a lot of evil.

Too bad I don't have an Icelake to cross-check against as an unaffected
platform.

> > There's definitely scope here for us using some more information from
> > process_csb() about which context completed and limit work to that
> > timeline. Hmm, something along those lines maybe...
> 
> But you want to retire all timelines which have work on this particular 
> physical engine. Otherwise it doesn't get parked, no?

There I was suggesting being even more proactive, and say keeping an
llist of completed timelines. Nothing concrete yet, plenty of existing
races found already that need fixing.
-Chris

* Re: [PATCH 15/19] drm/i915/gt: Flush the requests after wedging on suspend
@ 2019-11-19 17:22       ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 17:22 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-11-19 16:12:18)
> 
> On 18/11/2019 23:02, Chris Wilson wrote:
> > Retire all requests if we resort to wedging the driver on suspend. They
> > will now be idle, so we might as well free them before shutting down.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_gt_pm.c | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > index 7a9044ac4b75..f6b5169d623f 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> > @@ -256,6 +256,7 @@ static void wait_for_suspend(struct intel_gt *gt)
> >                * the gpu quiet.
> >                */
> >               intel_gt_set_wedged(gt);
> > +             intel_gt_retire_requests(gt);
> >       }
> >   
> >       intel_gt_pm_wait_for_idle(gt);
> > 
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Or given that parking is now sync it could work to make 
> intel_gt_park_requests flush and then intel_gt_pm_wait_for_idle would 
> handle it. But that would keep the GPU alive for too long, given that 
> request retire can run independently. So perhaps this is better.

It's the unlikely path, so it favours the simpler hammer.

It's what we used to do, dropped and then forgotten as the mutexes were
moved around. Hopefully, it still makes sense tomorrow.
-Chris

* Re: [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles
@ 2019-11-19 18:58             ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-19 18:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-11-19 16:42:28)
> Quoting Tvrtko Ursulin (2019-11-19 16:33:18)
> > I also wonder if the current flush_submission wasn't the reason for 
> > performance regression you were seeing with this? It makes this tasklet 
> > wait for all other engines, if they are busy. But not sure.. perhaps it 
> > is work which would be done anyway.
> 
> I haven't finished yet; but the baseline took a big nose dive so it
> might be enough to hide a lot of evil.

Only early results so far, but the extra work is exacerbating the regressions in
gem_wsim, enough that this cannot land as is and we do need to be
smarter.
-Chris

* ✗ Fi.CI.BUILD: failure for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap (rev2)
@ 2019-11-19 19:04   ` Patchwork
  0 siblings, 0 replies; 90+ messages in thread
From: Patchwork @ 2019-11-19 19:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap (rev2)
URL   : https://patchwork.freedesktop.org/series/69647/
State : failure

== Summary ==

Applying: drm/i915/selftests: Force bonded submission to overlap
Applying: drm/i915/gem: Manually dump the debug trace on GEM_BUG_ON
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_gem.h
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.
Applying: drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gt/intel_gt_requests.c
M	drivers/gpu/drm/i915/gt/intel_timeline.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/gt/intel_timeline.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gt/intel_timeline.c
Auto-merging drivers/gpu/drm/i915/gt/intel_gt_requests.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0003 drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


* Re: [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
@ 2019-11-20 11:39         ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-20 11:39 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Matthew Auld


On 19/11/2019 14:41, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-19 14:15:49)
>>
>> On 18/11/2019 23:02, Chris Wilson wrote:
>>> The general concept was that intel_timeline.active_count was locked by
>>> the intel_timeline.mutex. The exception was for power management, where
>>> the engine->kernel_context->timeline could be manipulated under the
>>> global wakeref.mutex.
>>>
>>> This was quite solid, as we always manipulated the timeline only while
>>> we held an engine wakeref.
>>>
>>> And then we started retiring requests outside of struct_mutex, only
>>> using the timelines.active_list and the timeline->mutex. There we
>>> started manipulating intel_timeline.active_count outside of an engine
>>> wakeref, and so introduced a race between __engine_park() and
>>> intel_gt_retire_requests(), a race that could result in the
>>> engine->kernel_context not being added to the active timelines and so
>>> losing requests, which caused us to keep the system permanently powered
>>> up [and unloadable].
>>>
>>> The race would be easy to close if we could take the engine wakeref for
>>> the timeline before we retire -- except timelines are not bound to any
>>> engine and so we would need to keep all active engines awake. The
>>> alternative is to guard intel_timeline_enter/intel_timeline_exit for use
>>> outside of the timeline->mutex.
>>>
>>> Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
>>>    drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
>>>    .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
>>>    3 files changed, 32 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> index a79e6efb31a2..7559d6373f49 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> @@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>>>                        continue;
>>>    
>>>                intel_timeline_get(tl);
>>> -             GEM_BUG_ON(!tl->active_count);
>>> -             tl->active_count++; /* pin the list element */
>>> +             GEM_BUG_ON(!atomic_read(&tl->active_count));
>>> +             atomic_inc(&tl->active_count); /* pin the list element */
>>>                spin_unlock_irqrestore(&timelines->lock, flags);
>>>    
>>>                if (timeout > 0) {
>>> @@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>>>    
>>>                /* Resume iteration after dropping lock */
>>>                list_safe_reset_next(tl, tn, link);
>>> -             if (!--tl->active_count)
>>> +             if (atomic_dec_and_test(&tl->active_count))
>>>                        list_del(&tl->link);
>>>    
>>>                mutex_unlock(&tl->mutex);
>>>    
>>>                /* Defer the final release to after the spinlock */
>>>                if (refcount_dec_and_test(&tl->kref.refcount)) {
>>> -                     GEM_BUG_ON(tl->active_count);
>>> +                     GEM_BUG_ON(atomic_read(&tl->active_count));
>>>                        list_add(&tl->link, &free);
>>>                }
>>>        }
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
>>> index 16a9e88d93de..4f914f0d5eab 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
>>> @@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
>>>        struct intel_gt_timelines *timelines = &tl->gt->timelines;
>>>        unsigned long flags;
>>>    
>>> +     /*
>>> +      * Pretend we are serialised by the timeline->mutex.
>>> +      *
>>> +      * While generally true, there are a few exceptions to the rule
>>> +      * for the engine->kernel_context being used to manage power
>>> +      * transitions. As the engine_park may be called from under any
>>> +      * timeline, it uses the power mutex as a global serialisation
>>> +      * lock to prevent any other request entering its timeline.
>>> +      *
>>> +      * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
>>> +      *
>>> +      * However, intel_gt_retire_request() does not know which engine
>>> +      * it is retiring along and so cannot partake in the engine-pm
>>> +      * barrier, and there we use the tl->active_count as a means to
>>> +      * pin the timeline in the active_list while the locks are dropped.
>>> +      * Ergo, as that is outside of the engine-pm barrier, we need to
>>> +      * use atomic to manipulate tl->active_count.
>>> +      */
>>>        lockdep_assert_held(&tl->mutex);
>>> -
>>>        GEM_BUG_ON(!atomic_read(&tl->pin_count));
>>> -     if (tl->active_count++)
>>> +
>>> +     if (atomic_add_unless(&tl->active_count, 1, 0))
>>>                return;
>>> -     GEM_BUG_ON(!tl->active_count); /* overflow? */
>>>    
>>>        spin_lock_irqsave(&timelines->lock, flags);
>>> -     list_add(&tl->link, &timelines->active_list);
>>> +     if (!atomic_fetch_inc(&tl->active_count))
>>> +             list_add(&tl->link, &timelines->active_list);
>>
>> So retirement raced with this and has elevated the active_count? But
>> retirement does not add the timeline to the list, so we exit here
>> without it on the active_list.
> 
> Retirement only sees an element on the active_list. What we observed in
> practice was the inc/dec on tl->active_count racing, causing
> indeterminate results, with the result that we removed the element from
> the active_list while it had a raised tl->active_count (due to the
> inflight posting from the other CPU).
> 
> Thus we kept requests inflight and the engine awake with no way to clear
> them. This most obviously triggered GEM_BUG_ON(gt->awake) during suspend,
> and is also responsible for the timeouts on gem_quiescent_gpu() or
> igt_drop_caches_set(DROP_IDLE).

I understand (I think) the race where retirement races with parking. But 
this side of things (intel_timeline_enter) does not seem to be involved 
in that. intel_timeline_enter is the only place which puts the timeline 
onto the active_list. How can two of them race?

And also, what remains to be the purpose of timelines->lock?

Regards,

Tvrtko

* Re: [Intel-gfx] [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
@ 2019-11-20 11:39         ` Tvrtko Ursulin
  0 siblings, 0 replies; 90+ messages in thread
From: Tvrtko Ursulin @ 2019-11-20 11:39 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Matthew Auld


On 19/11/2019 14:41, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-19 14:15:49)
>>
>> On 18/11/2019 23:02, Chris Wilson wrote:
>>> The general concept was that intel_timeline.active_count was locked by
>>> the intel_timeline.mutex. The exception was for power management, where
>>> the engine->kernel_context->timeline could be manipulated under the
>>> global wakeref.mutex.
>>>
>>> This was quite solid, as we always manipulated the timeline only while
>>> we held an engine wakeref.
>>>
>>> And then we started retiring requests outside of struct_mutex, only
>>> using the timelines.active_list and the timeline->mutex. There we
>>> started manipulating intel_timeline.active_count outside of an engine
>>> wakeref, and so introduced a race between __engine_park() and
>>> intel_gt_retire_requests(), a race that could result in the
>>> engine->kernel_context not being added to the active timelines and so
>>> losing requests, which caused us to keep the system permanently powered
>>> up [and unloadable].
>>>
>>> The race would be easy to close if we could take the engine wakeref for
>>> the timeline before we retire -- except timelines are not bound to any
>>> engine and so we would need to keep all active engines awake. The
>>> alternative is to guard intel_timeline_enter/intel_timeline_exit for use
>>> outside of the timeline->mutex.
>>>
>>> Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
>>>    drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
>>>    .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
>>>    3 files changed, 32 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> index a79e6efb31a2..7559d6373f49 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> @@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>>>                        continue;
>>>    
>>>                intel_timeline_get(tl);
>>> -             GEM_BUG_ON(!tl->active_count);
>>> -             tl->active_count++; /* pin the list element */
>>> +             GEM_BUG_ON(!atomic_read(&tl->active_count));
>>> +             atomic_inc(&tl->active_count); /* pin the list element */
>>>                spin_unlock_irqrestore(&timelines->lock, flags);
>>>    
>>>                if (timeout > 0) {
>>> @@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
>>>    
>>>                /* Resume iteration after dropping lock */
>>>                list_safe_reset_next(tl, tn, link);
>>> -             if (!--tl->active_count)
>>> +             if (atomic_dec_and_test(&tl->active_count))
>>>                        list_del(&tl->link);
>>>    
>>>                mutex_unlock(&tl->mutex);
>>>    
>>>                /* Defer the final release to after the spinlock */
>>>                if (refcount_dec_and_test(&tl->kref.refcount)) {
>>> -                     GEM_BUG_ON(tl->active_count);
>>> +                     GEM_BUG_ON(atomic_read(&tl->active_count));
>>>                        list_add(&tl->link, &free);
>>>                }
>>>        }
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
>>> index 16a9e88d93de..4f914f0d5eab 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
>>> @@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
>>>        struct intel_gt_timelines *timelines = &tl->gt->timelines;
>>>        unsigned long flags;
>>>    
>>> +     /*
>>> +      * Pretend we are serialised by the timeline->mutex.
>>> +      *
>>> +      * While generally true, there are a few exceptions to the rule
>>> +      * for the engine->kernel_context being used to manage power
>>> +      * transitions. As the engine_park may be called from under any
>>> +      * timeline, it uses the power mutex as a global serialisation
>>> +      * lock to prevent any other request entering its timeline.
>>> +      *
>>> +      * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
>>> +      *
>>> +      * However, intel_gt_retire_request() does not know which engine
>>> +      * it is retiring along and so cannot partake in the engine-pm
>>> +      * barrier, and there we use the tl->active_count as a means to
>>> +      * pin the timeline in the active_list while the locks are dropped.
>>> +      * Ergo, as that is outside of the engine-pm barrier, we need to
>>> +      * use atomic to manipulate tl->active_count.
>>> +      */
>>>        lockdep_assert_held(&tl->mutex);
>>> -
>>>        GEM_BUG_ON(!atomic_read(&tl->pin_count));
>>> -     if (tl->active_count++)
>>> +
>>> +     if (atomic_add_unless(&tl->active_count, 1, 0))
>>>                return;
>>> -     GEM_BUG_ON(!tl->active_count); /* overflow? */
>>>    
>>>        spin_lock_irqsave(&timelines->lock, flags);
>>> -     list_add(&tl->link, &timelines->active_list);
>>> +     if (!atomic_fetch_inc(&tl->active_count))
>>> +             list_add(&tl->link, &timelines->active_list);
>>
>> So retirement raced with this and has elevated the active_count? But
>> retirement does not add the timeline to the list, so we exit here
>> without it on the active_list.
> 
> Retirement only sees an element on the active_list. What we observed in
> practice was the inc/dec on tl->active_count racing, causing
> indeterminate results, with the result that we removed the element from
> the active_list while it had a raised tl->active_count (due to the
> inflight posting from the other CPU).
> 
> Thus we kept requests inflight and the engine awake with no way to clear
> them. This most obviously triggered GEM_BUG_ON(gt->awake) during suspend,
> and is also responsible for the timeouts on gem_quiescent_gpu() or
> igt_drop_caches_set(DROP_IDLE).

I understand (I think) the race where retirement races with parking. But 
this side of things (intel_timeline_enter) does not seem to be involved 
in that. intel_timeline_enter is the only place which puts the timeline 
onto the active_list. How can two of them race?

And also, what remains to be the purpose of timelines->lock?

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
@ 2019-11-20 11:51           ` Chris Wilson
  0 siblings, 0 replies; 90+ messages in thread
From: Chris Wilson @ 2019-11-20 11:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Matthew Auld

Quoting Tvrtko Ursulin (2019-11-20 11:39:00)
> 
> On 19/11/2019 14:41, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-19 14:15:49)
> >>
> >> On 18/11/2019 23:02, Chris Wilson wrote:
> >>> The general concept was that intel_timeline.active_count was locked by
> >>> the intel_timeline.mutex. The exception was for power management, where
> >>> the engine->kernel_context->timeline could be manipulated under the
> >>> global wakeref.mutex.
> >>>
> >>> This was quite solid, as we always manipulated the timeline only while
> >>> we held an engine wakeref.
> >>>
> >>> And then we started retiring requests outside of struct_mutex, only
> >>> using the timelines.active_list and the timeline->mutex. There we
> >>> started manipulating intel_timeline.active_count outside of an engine
> >>> wakeref, and so introduced a race between __engine_park() and
> >>> intel_gt_retire_requests(), a race that could result in the
> >>> engine->kernel_context not being added to the active timelines and so
> >>> losing requests, which caused us to keep the system permanently powered
> >>> up [and unloadable].
> >>>
> >>> The race would be easy to close if we could take the engine wakeref for
> >>> the timeline before we retire -- except timelines are not bound to any
> >>> engine and so we would need to keep all active engines awake. The
> >>> alternative is to guard intel_timeline_enter/intel_timeline_exit for use
> >>> outside of the timeline->mutex.
> >>>
> >>> Fixes: e5dadff4b093 ("drm/i915: Protect request retirement with timeline->mutex")
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Matthew Auld <matthew.auld@intel.com>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  8 ++---
> >>>    drivers/gpu/drm/i915/gt/intel_timeline.c      | 34 +++++++++++++++----
> >>>    .../gpu/drm/i915/gt/intel_timeline_types.h    |  2 +-
> >>>    3 files changed, 32 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> >>> index a79e6efb31a2..7559d6373f49 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> >>> @@ -49,8 +49,8 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
> >>>                        continue;
> >>>    
> >>>                intel_timeline_get(tl);
> >>> -             GEM_BUG_ON(!tl->active_count);
> >>> -             tl->active_count++; /* pin the list element */
> >>> +             GEM_BUG_ON(!atomic_read(&tl->active_count));
> >>> +             atomic_inc(&tl->active_count); /* pin the list element */
> >>>                spin_unlock_irqrestore(&timelines->lock, flags);
> >>>    
> >>>                if (timeout > 0) {
> >>> @@ -71,14 +71,14 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
> >>>    
> >>>                /* Resume iteration after dropping lock */
> >>>                list_safe_reset_next(tl, tn, link);
> >>> -             if (!--tl->active_count)
> >>> +             if (atomic_dec_and_test(&tl->active_count))
> >>>                        list_del(&tl->link);
> >>>    
> >>>                mutex_unlock(&tl->mutex);
> >>>    
> >>>                /* Defer the final release to after the spinlock */
> >>>                if (refcount_dec_and_test(&tl->kref.refcount)) {
> >>> -                     GEM_BUG_ON(tl->active_count);
> >>> +                     GEM_BUG_ON(atomic_read(&tl->active_count));
> >>>                        list_add(&tl->link, &free);
> >>>                }
> >>>        }
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
> >>> index 16a9e88d93de..4f914f0d5eab 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> >>> @@ -334,15 +334,33 @@ void intel_timeline_enter(struct intel_timeline *tl)
> >>>        struct intel_gt_timelines *timelines = &tl->gt->timelines;
> >>>        unsigned long flags;
> >>>    
> >>> +     /*
> >>> +      * Pretend we are serialised by the timeline->mutex.
> >>> +      *
> >>> +      * While generally true, there are a few exceptions to the rule
> >>> +      * for the engine->kernel_context being used to manage power
> >>> +      * transitions. As the engine_park may be called from under any
> >>> +      * timeline, it uses the power mutex as a global serialisation
> >>> +      * lock to prevent any other request entering its timeline.
> >>> +      *
> >>> +      * The rule is generally tl->mutex, otherwise engine->wakeref.mutex.
> >>> +      *
> >>> +      * However, intel_gt_retire_request() does not know which engine
> >>> +      * it is retiring along and so cannot partake in the engine-pm
> >>> +      * barrier, and there we use the tl->active_count as a means to
> >>> +      * pin the timeline in the active_list while the locks are dropped.
> >>> +      * Ergo, as that is outside of the engine-pm barrier, we need to
> >>> +      * use atomic to manipulate tl->active_count.
> >>> +      */
> >>>        lockdep_assert_held(&tl->mutex);
> >>> -
> >>>        GEM_BUG_ON(!atomic_read(&tl->pin_count));
> >>> -     if (tl->active_count++)
> >>> +
> >>> +     if (atomic_add_unless(&tl->active_count, 1, 0))
> >>>                return;
> >>> -     GEM_BUG_ON(!tl->active_count); /* overflow? */
> >>>    
> >>>        spin_lock_irqsave(&timelines->lock, flags);
> >>> -     list_add(&tl->link, &timelines->active_list);
> >>> +     if (!atomic_fetch_inc(&tl->active_count))
> >>> +             list_add(&tl->link, &timelines->active_list);
> >>
> >> So retirement raced with this and has elevated the active_count? But
> >> retirement does not add the timeline to the list, so we exit here
> >> without it on the active_list.
> > 
> > Retirement only sees an element on the active_list. What we observed in
> > practice was the inc/dec on tl->active_count racing, causing
> > indeterminate results, with the result that we removed the element from
> > the active_list while it had a raised tl->active_count (due to the
> > inflight posting from the other CPU).
> > 
> > Thus we kept requests inflight and the engine awake with no way to clear
> > them. This most obviously triggered GEM_BUG_ON(gt->awake) during suspend,
> > and is also responsible for the timeouts on gem_quiescent_gpu() or
> > igt_drop_caches_set(DROP_IDLE).
> 
> I understand (I think) the race where retirement races with parking. But 
> this side of things (intel_timeline_enter) does not seem to be involved 
> in that. intel_timeline_enter is the only place which puts the timeline 
> onto the active_list. How can two of them race?

There should not be concurrent intel_timeline_enter(), or concurrent
intel_timeline_exit()/intel_timeline_enter(), as they are serialised by a
mixture of engine-pm or timeline->mutex. The complexity is from
intel_gt_retire_requests which does not have the engine wakeref and so
we have a potential data race between parking and retiring.
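The count-pinning scheme described above can be sketched in userspace with C11 atomics. This is a hedged illustration, not the i915 driver code: timeline_enter/timeline_exit, on_list and timelines_lock are stand-ins for the driver's structures, and atomic_add_unless() is reimplemented here because it is a kernel-only helper.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <pthread.h>

struct timeline {
	atomic_int active_count;
	bool on_list;		/* stands in for list_add()/list_del() */
};

static pthread_mutex_t timelines_lock = PTHREAD_MUTEX_INITIALIZER;

/* Add @a to @v unless it currently equals @u; true if the add happened. */
static bool atomic_add_unless(atomic_int *v, int a, int u)
{
	int c = atomic_load(v);

	while (c != u) {
		if (atomic_compare_exchange_weak(v, &c, c + a))
			return true;
	}
	return false;
}

void timeline_enter(struct timeline *tl)
{
	/* Fast path: already active, just take another pin. */
	if (atomic_add_unless(&tl->active_count, 1, 0))
		return;

	/*
	 * Slow path: only the 0 -> 1 transition, observed under the
	 * lock, adds the timeline to the list, so a racing retirer
	 * that briefly pinned the count cannot cause a double-add.
	 */
	pthread_mutex_lock(&timelines_lock);
	if (atomic_fetch_add(&tl->active_count, 1) == 0)
		tl->on_list = true;
	pthread_mutex_unlock(&timelines_lock);
}

void timeline_exit(struct timeline *tl)
{
	/* Fast path: not the last pin, just drop ours. */
	if (atomic_add_unless(&tl->active_count, -1, 1))
		return;

	/* Only the 1 -> 0 transition, under the lock, removes it. */
	pthread_mutex_lock(&timelines_lock);
	if (atomic_fetch_sub(&tl->active_count, 1) == 1)
		tl->on_list = false;
	pthread_mutex_unlock(&timelines_lock);
}
```

The key property is the same as in the patch: list membership changes only on the 0 <-> 1 transitions, and those transitions are decided by a single atomic RMW taken under the list lock, so the fast paths never need it.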

> And also, what remains to be the purpose of timelines->lock?

It protects the list, which is how we find the timelines to retire
(before we can take the timeline->mutex).
-Chris

^ permalink raw reply	[flat|nested] 90+ messages in thread


end of thread, other threads:[~2019-11-20 11:51 UTC | newest]

Thread overview: 90+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-18 23:02 Fast soft-rc6 Chris Wilson
2019-11-18 23:02 ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 01/19] drm/i915/selftests: Force bonded submission to overlap Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 02/19] drm/i915/gem: Manually dump the debug trace on GEM_BUG_ON Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 03/19] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 14:15   ` Tvrtko Ursulin
2019-11-19 14:15     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-19 14:41     ` Chris Wilson
2019-11-19 14:41       ` [Intel-gfx] " Chris Wilson
2019-11-20 11:39       ` Tvrtko Ursulin
2019-11-20 11:39         ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 11:51         ` Chris Wilson
2019-11-20 11:51           ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 04/19] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 14:35   ` Tvrtko Ursulin
2019-11-19 14:35     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-19 14:50     ` Chris Wilson
2019-11-19 14:50       ` [Intel-gfx] " Chris Wilson
2019-11-19 15:03   ` [PATCH] " Chris Wilson
2019-11-19 15:03     ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 05/19] drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 14:54   ` Tvrtko Ursulin
2019-11-19 14:54     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-18 23:02 ` [PATCH 06/19] drm/i915/gt: Schedule request retirement when submission idles Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 15:04   ` Tvrtko Ursulin
2019-11-19 15:04     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-19 16:20     ` Chris Wilson
2019-11-19 16:20       ` [Intel-gfx] " Chris Wilson
2019-11-19 16:33       ` Tvrtko Ursulin
2019-11-19 16:33         ` [Intel-gfx] " Tvrtko Ursulin
2019-11-19 16:42         ` Chris Wilson
2019-11-19 16:42           ` [Intel-gfx] " Chris Wilson
2019-11-19 18:58           ` Chris Wilson
2019-11-19 18:58             ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 07/19] drm/i915: Mark up the calling context for intel_wakeref_put() Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 15:57   ` Tvrtko Ursulin
2019-11-19 15:57     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-19 16:12     ` Chris Wilson
2019-11-19 16:12       ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 08/19] drm/i915/gem: Merge GGTT vma flush into a single loop Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 09/19] drm/i915/gt: Only wait for register chipset flush if active Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 10/19] drm/i915: Protect the obj->vma.list during iteration Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 11/19] drm/i915: Wait until the intel_wakeref idle callback is complete Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 16:15   ` Tvrtko Ursulin
2019-11-19 16:15     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-18 23:02 ` [PATCH 12/19] drm/i915/gt: Declare timeline.lock to be irq-free Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 15:58   ` Tvrtko Ursulin
2019-11-19 15:58     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-18 23:02 ` [PATCH 13/19] drm/i915/gt: Move new timelines to the end of active_list Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 16:02   ` Tvrtko Ursulin
2019-11-19 16:02     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-18 23:02 ` [PATCH 14/19] drm/i915/gt: Schedule next retirement worker first Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 16:07   ` Tvrtko Ursulin
2019-11-19 16:07     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-18 23:02 ` [PATCH 15/19] drm/i915/gt: Flush the requests after wedging on suspend Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-19 16:12   ` Tvrtko Ursulin
2019-11-19 16:12     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-19 17:22     ` Chris Wilson
2019-11-19 17:22       ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 16/19] drm/i915/selftests: Flush the active callbacks Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 17/19] drm/i915/selftests: Be explicit in ERR_PTR handling Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 18/19] drm/i915/selftests: Exercise rc6 handling Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:02 ` [PATCH 19/19] drm/i915/gt: Track engine round-trip times Chris Wilson
2019-11-18 23:02   ` [Intel-gfx] " Chris Wilson
2019-11-18 23:21 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap Patchwork
2019-11-18 23:21   ` [Intel-gfx] " Patchwork
2019-11-19  0:04 ` ✓ Fi.CI.BAT: success " Patchwork
2019-11-19  0:04   ` [Intel-gfx] " Patchwork
2019-11-19  9:08 ` ✗ Fi.CI.IGT: failure " Patchwork
2019-11-19  9:08   ` [Intel-gfx] " Patchwork
2019-11-19 19:04 ` ✗ Fi.CI.BUILD: failure for series starting with [01/19] drm/i915/selftests: Force bonded submission to overlap (rev2) Patchwork
2019-11-19 19:04   ` [Intel-gfx] " Patchwork
