From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana@intel.com>,
	thomas.hellstrom@intel.com,
	Chris Wilson <chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 12/21] drm/i915/gt: Only transfer the virtual context to the new engine if active
Date: Thu, 30 Jul 2020 10:37:47 +0100	[thread overview]
Message-ID: <20200730093756.16737-13-chris@chris-wilson.co.uk> (raw)
In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk>

One more complication of preempt-to-busy with respect to the virtual
engine is that we may have retired the last request along the virtual
engine at the same time as preparing to submit the completed request to
a new engine. That submit will be short-circuited, but not before we have
updated the context with the new register offsets and marked the virtual
engine as bound to the new engine (by calling swap on ve->siblings[]).
As we may have just retired the completed request, we may also be in the
middle of calling virtual_context_exit() to turn off the power management
associated with the virtual engine, and that in turn walks the
ve->siblings[]. If we happen to call swap() on the array as we walk it, we
will call intel_engine_pm_put() twice on the same engine.
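
For illustration only (this sketch is not part of the patch; it is a
simplified rendition of the existing virtual_context_exit()), the
vulnerable pattern is a plain walk of ve->siblings[] that assumes the
array does not change underneath it:

static void virtual_context_exit(struct intel_context *ce)
{
	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
	unsigned int n;

	/*
	 * If execlists_dequeue() swaps ve->siblings[n] with
	 * ve->siblings[0] while this loop runs, the same engine can be
	 * seen at two indices and intel_engine_pm_put() is called twice.
	 */
	for (n = 0; n < ve->num_siblings; n++)
		intel_engine_pm_put(ve->siblings[n]);
}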

In this patch, we prevent this by updating the bound engine only after a
successful submission, which weeds out the already completed requests.

Alternatively, we could walk a non-volatile array for the pm, such as
using the engine->mask. The small advantage of performing the update
after the submit is that we then only have to do a swap for active
requests.
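
A rough sketch of that alternative (not implemented here; it assumes the
for_each_engine_masked() iterator and that ve->base.mask is the fixed
union of the siblings' masks) could look like:

static void virtual_context_exit(struct intel_context *ce)
{
	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
	struct intel_engine_cs *engine;
	intel_engine_mask_t tmp;

	/*
	 * ve->base.mask is set once at creation, so a concurrent swap of
	 * ve->siblings[] cannot make us visit the same engine twice.
	 */
	for_each_engine_masked(engine, ve->base.gt, ve->base.mask, tmp)
		intel_engine_pm_put(engine);
}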

Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
References: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 65 ++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index a4959e8229ac..904e9b8bcbf6 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1805,6 +1805,33 @@ static bool virtual_matches(const struct virtual_engine *ve,
 	return true;
 }
 
+static void virtual_xfer_context(struct virtual_engine *ve,
+				 struct intel_engine_cs *engine)
+{
+	unsigned int n;
+
+	if (likely(engine == ve->siblings[0]))
+		return;
+
+	GEM_BUG_ON(READ_ONCE(ve->context.inflight));
+	if (!intel_engine_has_relative_mmio(engine))
+		virtual_update_register_offsets(ve->context.lrc_reg_state,
+						engine);
+
+	/*
+	 * Move the bound engine to the top of the list for
+	 * future execution. We then kick this tasklet first
+	 * before checking others, so that we preferentially
+	 * reuse this set of bound registers.
+	 */
+	for (n = 1; n < ve->num_siblings; n++) {
+		if (ve->siblings[n] == engine) {
+			swap(ve->siblings[n], ve->siblings[0]);
+			break;
+		}
+	}
+}
+
 #define for_each_waiter(p__, rq__) \
 	list_for_each_entry_lockless(p__, \
 				     &(rq__)->sched.waiters_list, \
@@ -2253,35 +2280,23 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			GEM_BUG_ON(!(rq->execution_mask & engine->mask));
 			WRITE_ONCE(rq->engine, engine);
 
-			if (engine != ve->siblings[0]) {
-				u32 *regs = ve->context.lrc_reg_state;
-				unsigned int n;
-
-				GEM_BUG_ON(READ_ONCE(ve->context.inflight));
-
-				if (!intel_engine_has_relative_mmio(engine))
-					virtual_update_register_offsets(regs,
-									engine);
-
+			if (__i915_request_submit(rq)) {
 				/*
-				 * Move the bound engine to the top of the list
-				 * for future execution. We then kick this
-				 * tasklet first before checking others, so that
-				 * we preferentially reuse this set of bound
-				 * registers.
+				 * Only after we confirm that we will submit
+				 * this request (i.e. it has not already
+				 * completed), do we want to update the context.
+				 *
+				 * This serves two purposes. It avoids
+				 * unnecessary work if we are resubmitting an
+				 * already completed request after timeslicing.
+				 * But more importantly, it prevents us altering
+				 * ve->siblings[] on an idle context, where
+				 * we may be using ve->siblings[] in
+				 * virtual_context_enter / virtual_context_exit.
 				 */
-				for (n = 1; n < ve->num_siblings; n++) {
-					if (ve->siblings[n] == engine) {
-						swap(ve->siblings[n],
-						     ve->siblings[0]);
-						break;
-					}
-				}
-
+				virtual_xfer_context(ve, engine);
 				GEM_BUG_ON(ve->siblings[0] != engine);
-			}
 
-			if (__i915_request_submit(rq)) {
 				submit = true;
 				last = rq;
 			}
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Thread overview: 39+ messages
2020-07-30  9:37 [Intel-gfx] Breadcrumbs fixes and stall avoidance Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 01/21] drm/i915: Add a couple of missing i915_active_fini() Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 02/21] drm/i915: Skip taking acquire mutex for no ref->active callback Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 03/21] drm/i915: Export a preallocate variant of i915_active_acquire() Chris Wilson
2020-07-31  7:33   ` Thomas Hellström (Intel)
2020-07-30  9:37 ` [Intel-gfx] [PATCH 04/21] drm/i915: Keep the most recently used active-fence upon discard Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 05/21] drm/i915: Make the stale cached active node available for any timeline Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 06/21] drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier() Chris Wilson
2020-07-31  7:39   ` Thomas Hellström (Intel)
2020-07-30  9:37 ` [Intel-gfx] [PATCH 07/21] drm/i915: Provide a fastpath for waiting on vma bindings Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 08/21] drm/i915/gem: Reduce ctx->engine_mutex for reading the clone source Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 09/21] drm/i915/gem: Reduce ctx->engines_mutex for get_engines() Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 10/21] drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 11/21] drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs Chris Wilson
2020-07-30  9:37 ` Chris Wilson [this message]
2020-07-30  9:37 ` [Intel-gfx] [PATCH 13/21] drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs Chris Wilson
2020-07-31 14:53   ` Tvrtko Ursulin
2020-07-30  9:37 ` [Intel-gfx] [PATCH 14/21] drm/i915/gt: Move intel_breadcrumbs_arm_irq earlier Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 15/21] drm/i915/gt: Hold context/request reference while breadcrumbs are active Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 16/21] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock Chris Wilson
2020-07-31 15:06   ` Tvrtko Ursulin
2020-07-31 15:12     ` Chris Wilson
2020-07-31 15:21       ` Chris Wilson
2020-07-31 16:06         ` Tvrtko Ursulin
2020-07-31 17:59           ` Chris Wilson
2020-07-31 15:32       ` Tvrtko Ursulin
2020-07-30  9:37 ` [Intel-gfx] [PATCH 17/21] drm/i915/gt: Protect context lifetime with RCU Chris Wilson
2020-07-31 15:15   ` Tvrtko Ursulin
2020-07-31 15:24     ` Chris Wilson
2020-07-31 15:45       ` Tvrtko Ursulin
2020-07-30  9:37 ` [Intel-gfx] [PATCH 18/21] drm/i915/gt: Split the breadcrumb spinlock between global and contexts Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 19/21] drm/i915: Drop i915_request.lock serialisation around await_start Chris Wilson
2020-07-30  9:37 ` [Intel-gfx] [PATCH 20/21] drm/i915: Drop i915_request.lock requirement for intel_rps_boost() Chris Wilson
2020-07-30  9:37 ` [PATCH 21/21] drm/i915/gem: Delay tracking the GEM context until it is registered Chris Wilson
2020-07-30  9:37   ` [Intel-gfx] " Chris Wilson
2020-07-30 13:45 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/21] drm/i915: Add a couple of missing i915_active_fini() Patchwork
2020-07-30 13:46 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-07-30 14:04 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-07-30 19:20 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
