All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH 3/9] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
Date: Wed, 20 Nov 2019 09:32:56 +0000	[thread overview]
Message-ID: <20191120093302.3723715-3-chris@chris-wilson.co.uk> (raw)
In-Reply-To: <20191120093302.3723715-1-chris@chris-wilson.co.uk>

In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.

As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.

v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.

Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 47 +++++++++++++++++++----
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 3c0f490ff2c7..1f517357a268 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -73,8 +73,25 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
 
+static void
+__intel_timeline_enter_and_pm_release(struct intel_timeline *tl,
+				      struct intel_engine_cs *engine)
+{
+	struct intel_gt_timelines *timelines = &engine->gt->timelines;
+
+	spin_lock(&timelines->lock);
+
+	if (!atomic_fetch_inc(&tl->active_count))
+		list_add_tail(&tl->link, &timelines->active_list);
+
+	__intel_wakeref_defer_park(&engine->wakeref);
+
+	spin_unlock(&timelines->lock);
+}
+
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
+	struct intel_context *ce = engine->kernel_context;
 	struct i915_request *rq;
 	unsigned long flags;
 	bool result = true;
@@ -98,16 +115,31 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	 * This should hold true as we can only park the engine after
 	 * retiring the last request, thus all rings should be empty and
 	 * all timelines idle.
+	 *
+	 * For unlocking, there are 2 other parties and the GPU who have a
+	 * stake here.
+	 *
+	 * A new gpu user will be waiting on the engine-pm to start their
+	 * engine_unpark. New waiters are predicated on engine->wakeref.count
+	 * and so intel_wakeref_defer_park() acts like a mutex_unlock of the
+	 * engine->wakeref.
+	 *
+	 * The other party is intel_gt_retire_requests(), which is walking the
+	 * list of active timelines looking for completions. Meanwhile as soon
+	 * as we call __i915_request_queue(), the GPU may complete our request.
+	 * Ergo, if we put ourselves on the timelines.active_list
+	 * (se intel_timeline_enter()) before we increment the
+	 * engine->wakeref.count, we may see the request completion and retire
+	 * it causing an undeflow of the engine->wakeref.
 	 */
-	flags = __timeline_mark_lock(engine->kernel_context);
+	flags = __timeline_mark_lock(ce);
+	GEM_BUG_ON(atomic_read(&ce->timeline->active_count) < 0);
 
-	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
+	rq = __i915_request_create(ce, GFP_NOWAIT);
 	if (IS_ERR(rq))
 		/* Context switch failed, hope for the best! Maybe reset? */
 		goto out_unlock;
 
-	intel_timeline_enter(i915_request_timeline(rq));
-
 	/* Check again on the next retirement. */
 	engine->wakeref_serial = engine->serial + 1;
 	i915_request_add_active_barriers(rq);
@@ -116,13 +148,14 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 	__i915_request_commit(rq);
 
-	/* Release our exclusive hold on the engine */
-	__intel_wakeref_defer_park(&engine->wakeref);
 	__i915_request_queue(rq, NULL);
 
+	/* Expose ourselves to intel_gt_retire_requests() and new submission */
+	__intel_timeline_enter_and_pm_release(ce->timeline, engine);
+
 	result = false;
 out_unlock:
-	__timeline_mark_unlock(engine->kernel_context, flags);
+	__timeline_mark_unlock(ce, flags);
 	return result;
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

WARNING: multiple messages have this Message-ID (diff)
From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Subject: [Intel-gfx] [PATCH 3/9] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
Date: Wed, 20 Nov 2019 09:32:56 +0000	[thread overview]
Message-ID: <20191120093302.3723715-3-chris@chris-wilson.co.uk> (raw)
Message-ID: <20191120093256.1HUqOc7KKoFE7ew6AsPMfZDwidTmoA6LiCvYxPdKixQ@z> (raw)
In-Reply-To: <20191120093302.3723715-1-chris@chris-wilson.co.uk>

In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.

As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.

v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.

Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 47 +++++++++++++++++++----
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 3c0f490ff2c7..1f517357a268 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -73,8 +73,25 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
 
 #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
 
+static void
+__intel_timeline_enter_and_pm_release(struct intel_timeline *tl,
+				      struct intel_engine_cs *engine)
+{
+	struct intel_gt_timelines *timelines = &engine->gt->timelines;
+
+	spin_lock(&timelines->lock);
+
+	if (!atomic_fetch_inc(&tl->active_count))
+		list_add_tail(&tl->link, &timelines->active_list);
+
+	__intel_wakeref_defer_park(&engine->wakeref);
+
+	spin_unlock(&timelines->lock);
+}
+
 static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 {
+	struct intel_context *ce = engine->kernel_context;
 	struct i915_request *rq;
 	unsigned long flags;
 	bool result = true;
@@ -98,16 +115,31 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	 * This should hold true as we can only park the engine after
 	 * retiring the last request, thus all rings should be empty and
 	 * all timelines idle.
+	 *
+	 * For unlocking, there are 2 other parties and the GPU who have a
+	 * stake here.
+	 *
+	 * A new gpu user will be waiting on the engine-pm to start their
+	 * engine_unpark. New waiters are predicated on engine->wakeref.count
+	 * and so intel_wakeref_defer_park() acts like a mutex_unlock of the
+	 * engine->wakeref.
+	 *
+	 * The other party is intel_gt_retire_requests(), which is walking the
+	 * list of active timelines looking for completions. Meanwhile as soon
+	 * as we call __i915_request_queue(), the GPU may complete our request.
+	 * Ergo, if we put ourselves on the timelines.active_list
+	 * (se intel_timeline_enter()) before we increment the
+	 * engine->wakeref.count, we may see the request completion and retire
+	 * it causing an undeflow of the engine->wakeref.
 	 */
-	flags = __timeline_mark_lock(engine->kernel_context);
+	flags = __timeline_mark_lock(ce);
+	GEM_BUG_ON(atomic_read(&ce->timeline->active_count) < 0);
 
-	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
+	rq = __i915_request_create(ce, GFP_NOWAIT);
 	if (IS_ERR(rq))
 		/* Context switch failed, hope for the best! Maybe reset? */
 		goto out_unlock;
 
-	intel_timeline_enter(i915_request_timeline(rq));
-
 	/* Check again on the next retirement. */
 	engine->wakeref_serial = engine->serial + 1;
 	i915_request_add_active_barriers(rq);
@@ -116,13 +148,14 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	rq->sched.attr.priority = I915_PRIORITY_BARRIER;
 	__i915_request_commit(rq);
 
-	/* Release our exclusive hold on the engine */
-	__intel_wakeref_defer_park(&engine->wakeref);
 	__i915_request_queue(rq, NULL);
 
+	/* Expose ourselves to intel_gt_retire_requests() and new submission */
+	__intel_timeline_enter_and_pm_release(ce->timeline, engine);
+
 	result = false;
 out_unlock:
-	__timeline_mark_unlock(engine->kernel_context, flags);
+	__timeline_mark_unlock(ce, flags);
 	return result;
 }
 
-- 
2.24.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2019-11-20  9:33 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20  9:32 [PATCH 1/9] drm/i915/selftests: Take a ref to the request we wait upon Chris Wilson
2019-11-20  9:32 ` [Intel-gfx] " Chris Wilson
2019-11-20  9:32 ` [PATCH 2/9] drm/i915/gt: Close race between engine_park and intel_gt_retire_requests Chris Wilson
2019-11-20  9:32   ` [Intel-gfx] " Chris Wilson
2019-11-20 13:19   ` Tvrtko Ursulin
2019-11-20 13:19     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20  9:32 ` Chris Wilson [this message]
2019-11-20  9:32   ` [Intel-gfx] [PATCH 3/9] drm/i915/gt: Unlock engine-pm after queuing the kernel context switch Chris Wilson
2019-11-20 11:58   ` Tvrtko Ursulin
2019-11-20 11:58     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 12:07     ` Chris Wilson
2019-11-20 12:07       ` [Intel-gfx] " Chris Wilson
2019-11-20 12:40       ` Tvrtko Ursulin
2019-11-20 12:40         ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 12:44         ` Chris Wilson
2019-11-20 12:44           ` [Intel-gfx] " Chris Wilson
2019-11-20 13:19           ` Tvrtko Ursulin
2019-11-20 13:19             ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20  9:32 ` [PATCH 4/9] drm/i915: Mark up the calling context for intel_wakeref_put() Chris Wilson
2019-11-20  9:32   ` [Intel-gfx] " Chris Wilson
2019-11-20 12:46   ` Tvrtko Ursulin
2019-11-20 12:46     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20  9:32 ` [PATCH 5/9] drm/i915/gt: Declare timeline.lock to be irq-free Chris Wilson
2019-11-20  9:32   ` [Intel-gfx] " Chris Wilson
2019-11-20  9:32 ` [PATCH 6/9] drm/i915/selftests: Force bonded submission to overlap Chris Wilson
2019-11-20  9:32   ` [Intel-gfx] " Chris Wilson
2019-11-20 12:55   ` Tvrtko Ursulin
2019-11-20 12:55     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 12:59     ` Chris Wilson
2019-11-20 12:59       ` [Intel-gfx] " Chris Wilson
2019-11-20 13:18       ` Tvrtko Ursulin
2019-11-20 13:18         ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 13:29         ` Chris Wilson
2019-11-20 13:29           ` [Intel-gfx] " Chris Wilson
2019-11-22  9:34           ` Tvrtko Ursulin
2019-11-22  9:34             ` [Intel-gfx] " Tvrtko Ursulin
2019-11-22 10:03             ` Chris Wilson
2019-11-22 10:03               ` [Intel-gfx] " Chris Wilson
2019-11-22 10:43   ` [PATCH] " Chris Wilson
2019-11-22 10:43     ` [Intel-gfx] " Chris Wilson
2019-11-20  9:33 ` [PATCH 7/9] drm/i915/selftests: Flush the active callbacks Chris Wilson
2019-11-20  9:33   ` [Intel-gfx] " Chris Wilson
2019-11-20  9:33 ` [PATCH 8/9] drm/i915/selftests: Be explicit in ERR_PTR handling Chris Wilson
2019-11-20  9:33   ` [Intel-gfx] " Chris Wilson
2019-11-20 10:23   ` Matthew Auld
2019-11-20 10:23     ` [Intel-gfx] " Matthew Auld
2019-11-20  9:33 ` [PATCH 9/9] drm/i915/gt: Schedule request retirement when timeline idles Chris Wilson
2019-11-20  9:33   ` [Intel-gfx] " Chris Wilson
2019-11-20 13:16   ` Tvrtko Ursulin
2019-11-20 13:16     ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 13:39     ` Chris Wilson
2019-11-20 13:39       ` [Intel-gfx] " Chris Wilson
2019-11-20 14:16       ` Tvrtko Ursulin
2019-11-20 14:16         ` [Intel-gfx] " Tvrtko Ursulin
2019-11-20 14:25         ` Chris Wilson
2019-11-20 14:25           ` [Intel-gfx] " Chris Wilson
2019-11-20 10:19 ` [PATCH 1/9] drm/i915/selftests: Take a ref to the request we wait upon Matthew Auld
2019-11-20 10:19   ` [Intel-gfx] " Matthew Auld
2019-11-20 10:25   ` Chris Wilson
2019-11-20 10:25     ` [Intel-gfx] " Chris Wilson
2019-11-20 10:27 ` [PATCH] " Chris Wilson
2019-11-20 10:27   ` [Intel-gfx] " Chris Wilson
2019-11-20 10:34   ` Matthew Auld
2019-11-20 10:34     ` [Intel-gfx] " Matthew Auld
2019-11-20 13:51 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with drm/i915/selftests: Take a ref to the request we wait upon (rev2) Patchwork
2019-11-20 13:51   ` [Intel-gfx] " Patchwork
2019-11-20 14:19 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-11-20 14:19   ` [Intel-gfx] " Patchwork
2019-11-20 14:20   ` Chris Wilson
2019-11-20 14:20     ` [Intel-gfx] " Chris Wilson
2019-11-22 12:33 ` ✗ Fi.CI.BUILD: failure for series starting with drm/i915/selftests: Take a ref to the request we wait upon (rev3) Patchwork
2019-11-22 12:33   ` [Intel-gfx] " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191120093302.3723715-3-chris@chris-wilson.co.uk \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.