* ctx->engines[rcs0, rcs0]
@ 2019-03-25  9:03 Chris Wilson
  2019-03-25  9:03 ` [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open() Chris Wilson
                   ` (25 more replies)
  0 siblings, 26 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

The headline change is the wholehearted decision to allow the user to
establish a ctx->engines mapping of [rcs0, rcs0], meaning two logically
distinct pipelines to the same engine. An example of this use case is
iris, which constructs two GEM contexts in order to have two distinct
logical contexts with which to create both a render pipeline and a
compute pipeline within the same GL context. As it is the same GL
context, it should be a single timeline and benefits from a shared
ppgtt; a natural way of constructing a GEM counterpart to this composite
GL context is with an engine map of [rcs0, rcs0] and a single timeline.

One corollary to this is that, given a ctx->engines[], we cannot assume
that (engine_class, engine_instance) is enough to uniquely identify a
logical context. I propose that, once the user supplies a ctx->engines[],
we force all subsequent user engine lookups to use indices into that
map. Thoughts?
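
For concreteness, a sketch of how userspace might construct such a
context, assuming the engine-map uAPI proposed alongside this series
(struct and ioctl names here are illustrative and may differ from
whatever finally lands):

	/* fragment; assumes <sys/ioctl.h>, <stdint.h>, <string.h> and
	 * the proposed drm/i915_drm.h uAPI additions */
	struct {
		struct i915_context_param_engines base;
		struct i915_engine_class_instance engines[2];
	} map;
	struct drm_i915_gem_context_param arg = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(map),
		.value = (uintptr_t)&map,
	};

	memset(&map, 0, sizeof(map));
	/* index 0: rcs0, the "render" pipeline */
	map.engines[0].engine_class = I915_ENGINE_CLASS_RENDER;
	map.engines[0].engine_instance = 0;
	/* index 1: rcs0 again, the "compute" pipeline */
	map.engines[1].engine_class = I915_ENGINE_CLASS_RENDER;
	map.engines[1].engine_instance = 0;

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);

With both slots pointing at the same engine, only the index into
engines[] distinguishes the two logical contexts at execbuf time; hence
the question above about forcing index-based lookups.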
-Chris


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-03-25  9:24   ` Tvrtko Ursulin
  2019-03-25  9:03 ` [PATCH 02/22] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
                   ` (24 subsequent siblings)
  25 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

Fix up the errno, as we adjusted the error path to receive the errno
directly rather than compute it from ERR_PTR(ctx).

drivers/gpu/drm/i915/i915_gem_context.c:793 i915_gem_context_open() warn: passing a valid pointer to 'PTR_ERR'

Fixes: 3aa9945a528e ("drm/i915: Separate GEM context construction and registration to userspace")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
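For anyone tripping over this class of bug, a minimal sketch of the
idiom (helper names here are invented for illustration; IS_ERR() and
PTR_ERR() are the usual <linux/err.h> accessors):

	int example_open(void)
	{
		struct i915_gem_context *ctx;
		int err;

		ctx = create_context(); /* ERR_PTR(-E...) on failure */
		if (IS_ERR(ctx))
			return PTR_ERR(ctx); /* ok: ctx encodes the errno */

		err = register_context(ctx); /* 0 or -E... */
		if (err)
			goto err_ctx;

		return 0;

	err_ctx:
		destroy_context(ctx);
		/*
		 * Must return err: ctx is a valid pointer here, so
		 * PTR_ERR(ctx) would yield garbage, as smatch warned.
		 */
		return err;
	}
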
 drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e6f594668245..25f267a03d3d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -709,7 +709,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	idr_destroy(&file_priv->context_idr);
 	mutex_destroy(&file_priv->vm_idr_lock);
 	mutex_destroy(&file_priv->context_idr_lock);
-	return PTR_ERR(ctx);
+	return err;
 }
 
 void i915_gem_context_close(struct drm_file *file)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 02/22] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
  2019-03-25  9:03 ` [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open() Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-03-25  9:03 ` [PATCH 03/22] drm/i915: Pull the GEM power management coupling into its own file Chris Wilson
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

Circumvent the dance we currently perform to find the preempt_client
and look up its HW context for this engine, as we know we have already
pinned the preempt_context on the engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index c4ad73980988..30dd6706a1d2 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -567,7 +567,7 @@ static void inject_preempt_context(struct work_struct *work)
 					     preempt_work[engine->id]);
 	struct intel_guc_client *client = guc->preempt_client;
 	struct guc_stage_desc *stage_desc = __get_stage_desc(client);
-	struct intel_context *ce = intel_context_lookup(client->owner, engine);
+	struct intel_context *ce = engine->preempt_context;
 	u32 data[7];
 
 	if (!ce->ring->emit) { /* recreate upon load/resume */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 03/22] drm/i915: Pull the GEM power management coupling into its own file
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
  2019-03-25  9:03 ` [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open() Chris Wilson
  2019-03-25  9:03 ` [PATCH 02/22] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-04-01 14:56   ` Tvrtko Ursulin
  2019-04-01 15:39   ` Lucas De Marchi
  2019-03-25  9:03 ` [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex Chris Wilson
                   ` (22 subsequent siblings)
  25 siblings, 2 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

Split out the power management portion (GT wakeref, suspend/resume) of
GEM from i915_gem.c into its own file.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
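A note on the new test_i915_gem_pm_standalone.c below: it follows the
driver's existing standalone compile tests (the test_*_standalone.o
entries in the Makefile hunk), where a translation unit includes nothing
but the header under test, so a CONFIG_DRM_I915_WERROR build fails if
the header is not self-contained:

	/* Compile-only: breaks if i915_gem_pm.h is missing an include
	 * or forward declaration that it depends upon. */
	#include "i915_gem_pm.h"
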
 drivers/gpu/drm/i915/Makefile                 |   2 +
 drivers/gpu/drm/i915/i915_gem.c               | 335 +----------------
 drivers/gpu/drm/i915/i915_gem_pm.c            | 341 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_pm.h            |  28 ++
 .../drm/i915/test_i915_gem_pm_standalone.c    |   7 +
 5 files changed, 381 insertions(+), 332 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.h
 create mode 100644 drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 60de05f3fa60..bd1657c3d395 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -61,6 +61,7 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 i915-$(CONFIG_DRM_I915_WERROR) += \
 	test_i915_active_types_standalone.o \
 	test_i915_gem_context_types_standalone.o \
+	test_i915_gem_pm_standalone.o \
 	test_i915_timeline_types_standalone.o \
 	test_intel_context_types_standalone.o \
 	test_intel_engine_types_standalone.o \
@@ -81,6 +82,7 @@ i915-y += \
 	  i915_gem_internal.o \
 	  i915_gem.o \
 	  i915_gem_object.o \
+	  i915_gem_pm.o \
 	  i915_gem_render_state.o \
 	  i915_gem_shrinker.o \
 	  i915_gem_stolen.o \
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f6cdd5fb9deb..47c672432594 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -42,7 +42,7 @@
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gemfs.h"
-#include "i915_globals.h"
+#include "i915_gem_pm.h"
 #include "i915_reset.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
@@ -101,105 +101,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
 	spin_unlock(&dev_priv->mm.object_stat_lock);
 }
 
-static void __i915_gem_park(struct drm_i915_private *i915)
-{
-	intel_wakeref_t wakeref;
-
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
-	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
-
-	if (!i915->gt.awake)
-		return;
-
-	/*
-	 * Be paranoid and flush a concurrent interrupt to make sure
-	 * we don't reactivate any irq tasklets after parking.
-	 *
-	 * FIXME: Note that even though we have waited for execlists to be idle,
-	 * there may still be an in-flight interrupt even though the CSB
-	 * is now empty. synchronize_irq() makes sure that a residual interrupt
-	 * is completed before we continue, but it doesn't prevent the HW from
-	 * raising a spurious interrupt later. To complete the shield we should
-	 * coordinate disabling the CS irq with flushing the interrupts.
-	 */
-	synchronize_irq(i915->drm.irq);
-
-	intel_engines_park(i915);
-	i915_timelines_park(i915);
-
-	i915_pmu_gt_parked(i915);
-	i915_vma_parked(i915);
-
-	wakeref = fetch_and_zero(&i915->gt.awake);
-	GEM_BUG_ON(!wakeref);
-
-	if (INTEL_GEN(i915) >= 6)
-		gen6_rps_idle(i915);
-
-	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
-
-	i915_globals_park();
-}
-
-void i915_gem_park(struct drm_i915_private *i915)
-{
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
-
-	if (!i915->gt.awake)
-		return;
-
-	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
-	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
-}
-
-void i915_gem_unpark(struct drm_i915_private *i915)
-{
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(!i915->gt.active_requests);
-	assert_rpm_wakelock_held(i915);
-
-	if (i915->gt.awake)
-		return;
-
-	/*
-	 * It seems that the DMC likes to transition between the DC states a lot
-	 * when there are no connected displays (no active power domains) during
-	 * command submission.
-	 *
-	 * This activity has negative impact on the performance of the chip with
-	 * huge latencies observed in the interrupt handler and elsewhere.
-	 *
-	 * Work around it by grabbing a GT IRQ power domain whilst there is any
-	 * GT activity, preventing any DC state transitions.
-	 */
-	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
-	GEM_BUG_ON(!i915->gt.awake);
-
-	i915_globals_unpark();
-
-	intel_enable_gt_powersave(i915);
-	i915_update_gfx_val(i915);
-	if (INTEL_GEN(i915) >= 6)
-		gen6_rps_busy(i915);
-	i915_pmu_gt_unparked(i915);
-
-	intel_engines_unpark(i915);
-
-	i915_queue_hangcheck(i915);
-
-	queue_delayed_work(i915->wq,
-			   &i915->gt.retire_work,
-			   round_jiffies_up_relative(HZ));
-}
-
 int
 i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
@@ -2874,108 +2775,6 @@ i915_gem_retire_work_handler(struct work_struct *work)
 				   round_jiffies_up_relative(HZ));
 }
 
-static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
-					  unsigned long mask)
-{
-	bool result = true;
-
-	/*
-	 * Even if we fail to switch, give whatever is running a small chance
-	 * to save itself before we report the failure. Yes, this may be a
-	 * false positive due to e.g. ENOMEM, caveat emptor!
-	 */
-	if (i915_gem_switch_to_kernel_context(i915, mask))
-		result = false;
-
-	if (i915_gem_wait_for_idle(i915,
-				   I915_WAIT_LOCKED |
-				   I915_WAIT_FOR_IDLE_BOOST,
-				   I915_GEM_IDLE_TIMEOUT))
-		result = false;
-
-	if (!result) {
-		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
-			dev_err(i915->drm.dev,
-				"Failed to idle engines, declaring wedged!\n");
-			GEM_TRACE_DUMP();
-		}
-
-		/* Forcibly cancel outstanding work and leave the gpu quiet. */
-		i915_gem_set_wedged(i915);
-	}
-
-	i915_retire_requests(i915); /* ensure we flush after wedging */
-	return result;
-}
-
-static bool load_power_context(struct drm_i915_private *i915)
-{
-	/* Force loading the kernel context on all engines */
-	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
-		return false;
-
-	/*
-	 * Immediately park the GPU so that we enable powersaving and
-	 * treat it as idle. The next time we issue a request, we will
-	 * unpark and start using the engine->pinned_default_state, otherwise
-	 * it is in limbo and an early reset may fail.
-	 */
-	__i915_gem_park(i915);
-
-	return true;
-}
-
-static void
-i915_gem_idle_work_handler(struct work_struct *work)
-{
-	struct drm_i915_private *i915 =
-		container_of(work, typeof(*i915), gt.idle_work.work);
-	bool rearm_hangcheck;
-
-	if (!READ_ONCE(i915->gt.awake))
-		return;
-
-	if (READ_ONCE(i915->gt.active_requests))
-		return;
-
-	rearm_hangcheck =
-		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
-
-	if (!mutex_trylock(&i915->drm.struct_mutex)) {
-		/* Currently busy, come back later */
-		mod_delayed_work(i915->wq,
-				 &i915->gt.idle_work,
-				 msecs_to_jiffies(50));
-		goto out_rearm;
-	}
-
-	/*
-	 * Flush out the last user context, leaving only the pinned
-	 * kernel context resident. Should anything unfortunate happen
-	 * while we are idle (such as the GPU being power cycled), no users
-	 * will be harmed.
-	 */
-	if (!work_pending(&i915->gt.idle_work.work) &&
-	    !i915->gt.active_requests) {
-		++i915->gt.active_requests; /* don't requeue idle */
-
-		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
-
-		if (!--i915->gt.active_requests) {
-			__i915_gem_park(i915);
-			rearm_hangcheck = false;
-		}
-	}
-
-	mutex_unlock(&i915->drm.struct_mutex);
-
-out_rearm:
-	if (rearm_hangcheck) {
-		GEM_BUG_ON(!i915->gt.awake);
-		i915_queue_hangcheck(i915);
-	}
-}
-
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
 {
 	struct drm_i915_private *i915 = to_i915(gem->dev);
@@ -4390,133 +4189,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
 	mutex_unlock(&i915->drm.struct_mutex);
 }
 
-void i915_gem_suspend(struct drm_i915_private *i915)
-{
-	intel_wakeref_t wakeref;
-
-	GEM_TRACE("\n");
-
-	wakeref = intel_runtime_pm_get(i915);
-
-	flush_workqueue(i915->wq);
-
-	mutex_lock(&i915->drm.struct_mutex);
-
-	/*
-	 * We have to flush all the executing contexts to main memory so
-	 * that they can saved in the hibernation image. To ensure the last
-	 * context image is coherent, we have to switch away from it. That
-	 * leaves the i915->kernel_context still active when
-	 * we actually suspend, and its image in memory may not match the GPU
-	 * state. Fortunately, the kernel_context is disposable and we do
-	 * not rely on its state.
-	 */
-	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
-
-	mutex_unlock(&i915->drm.struct_mutex);
-	i915_reset_flush(i915);
-
-	drain_delayed_work(&i915->gt.retire_work);
-
-	/*
-	 * As the idle_work is rearming if it detects a race, play safe and
-	 * repeat the flush until it is definitely idle.
-	 */
-	drain_delayed_work(&i915->gt.idle_work);
-
-	/*
-	 * Assert that we successfully flushed all the work and
-	 * reset the GPU back to its idle, low power state.
-	 */
-	GEM_BUG_ON(i915->gt.awake);
-
-	intel_uc_suspend(i915);
-
-	intel_runtime_pm_put(i915, wakeref);
-}
-
-void i915_gem_suspend_late(struct drm_i915_private *i915)
-{
-	struct drm_i915_gem_object *obj;
-	struct list_head *phases[] = {
-		&i915->mm.unbound_list,
-		&i915->mm.bound_list,
-		NULL
-	}, **phase;
-
-	/*
-	 * Neither the BIOS, ourselves or any other kernel
-	 * expects the system to be in execlists mode on startup,
-	 * so we need to reset the GPU back to legacy mode. And the only
-	 * known way to disable logical contexts is through a GPU reset.
-	 *
-	 * So in order to leave the system in a known default configuration,
-	 * always reset the GPU upon unload and suspend. Afterwards we then
-	 * clean up the GEM state tracking, flushing off the requests and
-	 * leaving the system in a known idle state.
-	 *
-	 * Note that is of the upmost importance that the GPU is idle and
-	 * all stray writes are flushed *before* we dismantle the backing
-	 * storage for the pinned objects.
-	 *
-	 * However, since we are uncertain that resetting the GPU on older
-	 * machines is a good idea, we don't - just in case it leaves the
-	 * machine in an unusable condition.
-	 */
-
-	mutex_lock(&i915->drm.struct_mutex);
-	for (phase = phases; *phase; phase++) {
-		list_for_each_entry(obj, *phase, mm.link)
-			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
-	}
-	mutex_unlock(&i915->drm.struct_mutex);
-
-	intel_uc_sanitize(i915);
-	i915_gem_sanitize(i915);
-}
-
-void i915_gem_resume(struct drm_i915_private *i915)
-{
-	GEM_TRACE("\n");
-
-	WARN_ON(i915->gt.awake);
-
-	mutex_lock(&i915->drm.struct_mutex);
-	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
-
-	i915_gem_restore_gtt_mappings(i915);
-	i915_gem_restore_fences(i915);
-
-	/*
-	 * As we didn't flush the kernel context before suspend, we cannot
-	 * guarantee that the context image is complete. So let's just reset
-	 * it and start again.
-	 */
-	i915->gt.resume(i915);
-
-	if (i915_gem_init_hw(i915))
-		goto err_wedged;
-
-	intel_uc_resume(i915);
-
-	/* Always reload a context for powersaving. */
-	if (!load_power_context(i915))
-		goto err_wedged;
-
-out_unlock:
-	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
-	mutex_unlock(&i915->drm.struct_mutex);
-	return;
-
-err_wedged:
-	if (!i915_reset_failed(i915)) {
-		dev_err(i915->drm.dev,
-			"Failed to re-initialize GPU, declaring it wedged!\n");
-		i915_gem_set_wedged(i915);
-	}
-	goto out_unlock;
-}
-
 void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
 {
 	if (INTEL_GEN(dev_priv) < 5 ||
@@ -4699,7 +4371,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 	/* Flush the default context image to memory, and enable powersaving. */
-	if (!load_power_context(i915)) {
+	if (!i915_gem_load_power_context(i915)) {
 		err = -EIO;
 		goto err_active;
 	}
@@ -5096,11 +4768,10 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
 
 	i915_gem_init__mm(dev_priv);
+	i915_gem_init__pm(dev_priv);
 
 	INIT_DELAYED_WORK(&dev_priv->gt.retire_work,
 			  i915_gem_retire_work_handler);
-	INIT_DELAYED_WORK(&dev_priv->gt.idle_work,
-			  i915_gem_idle_work_handler);
 	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
 	mutex_init(&dev_priv->gpu_error.wedge_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
new file mode 100644
index 000000000000..faa4eb42ec0a
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -0,0 +1,341 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "i915_gem_pm.h"
+#include "i915_globals.h"
+
+static void __i915_gem_park(struct drm_i915_private *i915)
+{
+	intel_wakeref_t wakeref;
+
+	GEM_TRACE("\n");
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(i915->gt.active_requests);
+	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
+
+	if (!i915->gt.awake)
+		return;
+
+	/*
+	 * Be paranoid and flush a concurrent interrupt to make sure
+	 * we don't reactivate any irq tasklets after parking.
+	 *
+	 * FIXME: Note that even though we have waited for execlists to be idle,
+	 * there may still be an in-flight interrupt even though the CSB
+	 * is now empty. synchronize_irq() makes sure that a residual interrupt
+	 * is completed before we continue, but it doesn't prevent the HW from
+	 * raising a spurious interrupt later. To complete the shield we should
+	 * coordinate disabling the CS irq with flushing the interrupts.
+	 */
+	synchronize_irq(i915->drm.irq);
+
+	intel_engines_park(i915);
+	i915_timelines_park(i915);
+
+	i915_pmu_gt_parked(i915);
+	i915_vma_parked(i915);
+
+	wakeref = fetch_and_zero(&i915->gt.awake);
+	GEM_BUG_ON(!wakeref);
+
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_idle(i915);
+
+	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
+
+	i915_globals_park();
+}
+
+static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
+					  unsigned long mask)
+{
+	bool result = true;
+
+	/*
+	 * Even if we fail to switch, give whatever is running a small chance
+	 * to save itself before we report the failure. Yes, this may be a
+	 * false positive due to e.g. ENOMEM, caveat emptor!
+	 */
+	if (i915_gem_switch_to_kernel_context(i915, mask))
+		result = false;
+
+	if (i915_gem_wait_for_idle(i915,
+				   I915_WAIT_LOCKED |
+				   I915_WAIT_FOR_IDLE_BOOST,
+				   I915_GEM_IDLE_TIMEOUT))
+		result = false;
+
+	if (!result) {
+		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
+			dev_err(i915->drm.dev,
+				"Failed to idle engines, declaring wedged!\n");
+			GEM_TRACE_DUMP();
+		}
+
+		/* Forcibly cancel outstanding work and leave the gpu quiet. */
+		i915_gem_set_wedged(i915);
+	}
+
+	i915_retire_requests(i915); /* ensure we flush after wedging */
+	return result;
+}
+
+static void idle_work_handler(struct work_struct *work)
+{
+	struct drm_i915_private *i915 =
+		container_of(work, typeof(*i915), gt.idle_work.work);
+	bool rearm_hangcheck;
+
+	if (!READ_ONCE(i915->gt.awake))
+		return;
+
+	if (READ_ONCE(i915->gt.active_requests))
+		return;
+
+	rearm_hangcheck =
+		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
+
+	if (!mutex_trylock(&i915->drm.struct_mutex)) {
+		/* Currently busy, come back later */
+		mod_delayed_work(i915->wq,
+				 &i915->gt.idle_work,
+				 msecs_to_jiffies(50));
+		goto out_rearm;
+	}
+
+	/*
+	 * Flush out the last user context, leaving only the pinned
+	 * kernel context resident. Should anything unfortunate happen
+	 * while we are idle (such as the GPU being power cycled), no users
+	 * will be harmed.
+	 */
+	if (!work_pending(&i915->gt.idle_work.work) &&
+	    !i915->gt.active_requests) {
+		++i915->gt.active_requests; /* don't requeue idle */
+
+		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
+
+		if (!--i915->gt.active_requests) {
+			__i915_gem_park(i915);
+			rearm_hangcheck = false;
+		}
+	}
+
+	mutex_unlock(&i915->drm.struct_mutex);
+
+out_rearm:
+	if (rearm_hangcheck) {
+		GEM_BUG_ON(!i915->gt.awake);
+		i915_queue_hangcheck(i915);
+	}
+}
+
+void i915_gem_park(struct drm_i915_private *i915)
+{
+	GEM_TRACE("\n");
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(i915->gt.active_requests);
+
+	if (!i915->gt.awake)
+		return;
+
+	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
+	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
+}
+
+void i915_gem_unpark(struct drm_i915_private *i915)
+{
+	GEM_TRACE("\n");
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(!i915->gt.active_requests);
+	assert_rpm_wakelock_held(i915);
+
+	if (i915->gt.awake)
+		return;
+
+	/*
+	 * It seems that the DMC likes to transition between the DC states a lot
+	 * when there are no connected displays (no active power domains) during
+	 * command submission.
+	 *
+	 * This activity has negative impact on the performance of the chip with
+	 * huge latencies observed in the interrupt handler and elsewhere.
+	 *
+	 * Work around it by grabbing a GT IRQ power domain whilst there is any
+	 * GT activity, preventing any DC state transitions.
+	 */
+	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+	GEM_BUG_ON(!i915->gt.awake);
+
+	i915_globals_unpark();
+
+	intel_enable_gt_powersave(i915);
+	i915_update_gfx_val(i915);
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_busy(i915);
+	i915_pmu_gt_unparked(i915);
+
+	intel_engines_unpark(i915);
+
+	i915_queue_hangcheck(i915);
+
+	queue_delayed_work(i915->wq,
+			   &i915->gt.retire_work,
+			   round_jiffies_up_relative(HZ));
+}
+
+bool i915_gem_load_power_context(struct drm_i915_private *i915)
+{
+	/* Force loading the kernel context on all engines */
+	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
+		return false;
+
+	/*
+	 * Immediately park the GPU so that we enable powersaving and
+	 * treat it as idle. The next time we issue a request, we will
+	 * unpark and start using the engine->pinned_default_state, otherwise
+	 * it is in limbo and an early reset may fail.
+	 */
+	__i915_gem_park(i915);
+
+	return true;
+}
+
+void i915_gem_suspend(struct drm_i915_private *i915)
+{
+	intel_wakeref_t wakeref;
+
+	GEM_TRACE("\n");
+
+	wakeref = intel_runtime_pm_get(i915);
+
+	flush_workqueue(i915->wq);
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	/*
+	 * We have to flush all the executing contexts to main memory so
+	 * that they can be saved in the hibernation image. To ensure the last
+	 * context image is coherent, we have to switch away from it. That
+	 * leaves the i915->kernel_context still active when
+	 * we actually suspend, and its image in memory may not match the GPU
+	 * state. Fortunately, the kernel_context is disposable and we do
+	 * not rely on its state.
+	 */
+	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
+
+	mutex_unlock(&i915->drm.struct_mutex);
+	i915_reset_flush(i915);
+
+	drain_delayed_work(&i915->gt.retire_work);
+
+	/*
+	 * As the idle_work is rearming if it detects a race, play safe and
+	 * repeat the flush until it is definitely idle.
+	 */
+	drain_delayed_work(&i915->gt.idle_work);
+
+	/*
+	 * Assert that we successfully flushed all the work and
+	 * reset the GPU back to its idle, low power state.
+	 */
+	GEM_BUG_ON(i915->gt.awake);
+
+	intel_uc_suspend(i915);
+
+	intel_runtime_pm_put(i915, wakeref);
+}
+
+void i915_gem_suspend_late(struct drm_i915_private *i915)
+{
+	struct drm_i915_gem_object *obj;
+	struct list_head *phases[] = {
+		&i915->mm.unbound_list,
+		&i915->mm.bound_list,
+		NULL
+	}, **phase;
+
+	/*
+	 * Neither the BIOS, ourselves or any other kernel
+	 * expects the system to be in execlists mode on startup,
+	 * so we need to reset the GPU back to legacy mode. And the only
+	 * known way to disable logical contexts is through a GPU reset.
+	 *
+	 * So in order to leave the system in a known default configuration,
+	 * always reset the GPU upon unload and suspend. Afterwards we then
+	 * clean up the GEM state tracking, flushing off the requests and
+	 * leaving the system in a known idle state.
+	 *
+	 * Note that it is of the utmost importance that the GPU is idle and
+	 * all stray writes are flushed *before* we dismantle the backing
+	 * storage for the pinned objects.
+	 *
+	 * However, since we are uncertain that resetting the GPU on older
+	 * machines is a good idea, we don't - just in case it leaves the
+	 * machine in an unusable condition.
+	 */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	for (phase = phases; *phase; phase++) {
+		list_for_each_entry(obj, *phase, mm.link)
+			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
+	}
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	intel_uc_sanitize(i915);
+	i915_gem_sanitize(i915);
+}
+
+void i915_gem_resume(struct drm_i915_private *i915)
+{
+	GEM_TRACE("\n");
+
+	WARN_ON(i915->gt.awake);
+
+	mutex_lock(&i915->drm.struct_mutex);
+	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
+
+	i915_gem_restore_gtt_mappings(i915);
+	i915_gem_restore_fences(i915);
+
+	/*
+	 * As we didn't flush the kernel context before suspend, we cannot
+	 * guarantee that the context image is complete. So let's just reset
+	 * it and start again.
+	 */
+	i915->gt.resume(i915);
+
+	if (i915_gem_init_hw(i915))
+		goto err_wedged;
+
+	intel_uc_resume(i915);
+
+	/* Always reload a context for powersaving. */
+	if (!i915_gem_load_power_context(i915))
+		goto err_wedged;
+
+out_unlock:
+	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return;
+
+err_wedged:
+	if (!i915_reset_failed(i915)) {
+		dev_err(i915->drm.dev,
+			"Failed to re-initialize GPU, declaring it wedged!\n");
+		i915_gem_set_wedged(i915);
+	}
+	goto out_unlock;
+}
+
+void i915_gem_init__pm(struct drm_i915_private *i915)
+{
+	INIT_DELAYED_WORK(&i915->gt.idle_work, idle_work_handler);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
new file mode 100644
index 000000000000..52f65e3f06b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_pm.h
@@ -0,0 +1,28 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_GEM_PM_H__
+#define __I915_GEM_PM_H__
+
+#include <linux/types.h>
+
+struct drm_i915_private;
+struct work_struct;
+
+void i915_gem_init__pm(struct drm_i915_private *i915);
+
+bool i915_gem_load_power_context(struct drm_i915_private *i915);
+void i915_gem_resume(struct drm_i915_private *i915);
+
+void i915_gem_unpark(struct drm_i915_private *i915);
+void i915_gem_park(struct drm_i915_private *i915);
+
+void i915_gem_idle_work_handler(struct work_struct *work);
+
+void i915_gem_suspend(struct drm_i915_private *i915);
+void i915_gem_suspend_late(struct drm_i915_private *i915);
+
+#endif /* __I915_GEM_PM_H__ */
diff --git a/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c b/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
new file mode 100644
index 000000000000..3524e471b46b
--- /dev/null
+++ b/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
@@ -0,0 +1,7 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_gem_pm.h"
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (2 preceding siblings ...)
  2019-03-25  9:03 ` [PATCH 03/22] drm/i915: Pull the GEM power management coupling into its own file Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-04-01 15:22   ` Tvrtko Ursulin
  2019-03-25  9:03 ` [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref Chris Wilson
                   ` (21 subsequent siblings)
  25 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

If we introduce a new mutex for the exclusive use of GEM's runtime
power management, we can remove its requirement to hold the
struct_mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
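The essence of the new scheme, reduced to a sketch (structure and
function names simplified from the real code below): the counter is
atomic so the per-request paths stay lock-free, and the new mutex only
serializes the expensive transitions to and from zero:

	static void unpark(struct gt *gt)
	{
		/* Fast path: already awake, just take another reference. */
		if (atomic_add_unless(&gt->active_requests, 1, 0))
			return;

		mutex_lock(&gt->active_mutex);
		if (!atomic_read(&gt->active_requests))
			power_up(gt); /* the 0 -> 1 edge */
		atomic_inc(&gt->active_requests);
		mutex_unlock(&gt->active_mutex);
	}

	static void park(struct gt *gt)
	{
		/* Last reference out schedules the deferred power-down. */
		if (atomic_dec_and_test(&gt->active_requests))
			schedule_idle_work(gt);
	}
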
 drivers/gpu/drm/i915/i915_debugfs.c           |  9 +--
 drivers/gpu/drm/i915/i915_drv.h               |  3 +-
 drivers/gpu/drm/i915/i915_gem.c               |  2 +-
 drivers/gpu/drm/i915/i915_gem.h               |  3 -
 drivers/gpu/drm/i915/i915_gem_evict.c         |  2 +-
 drivers/gpu/drm/i915/i915_gem_pm.c            | 70 ++++++++++++-------
 drivers/gpu/drm/i915/i915_request.c           | 24 ++-----
 .../gpu/drm/i915/selftests/i915_gem_context.c |  4 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  | 13 ++--
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 +-
 10 files changed, 68 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 47bf07a59b5e..98ff1a14ccf3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2034,7 +2034,8 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "RPS enabled? %d\n", rps->enabled);
 	seq_printf(m, "GPU busy? %s [%d requests]\n",
-		   yesno(dev_priv->gt.awake), dev_priv->gt.active_requests);
+		   yesno(dev_priv->gt.awake),
+		   atomic_read(&dev_priv->gt.active_requests));
 	seq_printf(m, "Boosts outstanding? %d\n",
 		   atomic_read(&rps->num_waiters));
 	seq_printf(m, "Interactive? %d\n", READ_ONCE(rps->power.interactive));
@@ -2055,7 +2056,7 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 
 	if (INTEL_GEN(dev_priv) >= 6 &&
 	    rps->enabled &&
-	    dev_priv->gt.active_requests) {
+	    atomic_read(&dev_priv->gt.active_requests)) {
 		u32 rpup, rpupei;
 		u32 rpdown, rpdownei;
 
@@ -3087,7 +3088,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 
 	seq_printf(m, "GT awake? %s\n", yesno(dev_priv->gt.awake));
 	seq_printf(m, "Global active requests: %d\n",
-		   dev_priv->gt.active_requests);
+		   atomic_read(&dev_priv->gt.active_requests));
 	seq_printf(m, "CS timestamp frequency: %u kHz\n",
 		   RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz);
 
@@ -3933,7 +3934,7 @@ i915_drop_caches_set(void *data, u64 val)
 
 	if (val & DROP_IDLE) {
 		do {
-			if (READ_ONCE(i915->gt.active_requests))
+			if (atomic_read(&i915->gt.active_requests))
 				flush_delayed_work(&i915->gt.retire_work);
 			drain_delayed_work(&i915->gt.idle_work);
 		} while (READ_ONCE(i915->gt.awake));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 11803d485275..7c7afe99986c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2008,7 +2008,8 @@ struct drm_i915_private {
 		intel_engine_mask_t active_engines;
 		struct list_head active_rings;
 		struct list_head closed_vma;
-		u32 active_requests;
+		atomic_t active_requests;
+		struct mutex active_mutex;
 
 		/**
 		 * Is the GPU currently considered idle, or busy executing
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 47c672432594..79919e0cf03d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2914,7 +2914,7 @@ wait_for_timelines(struct drm_i915_private *i915,
 	struct i915_gt_timelines *gt = &i915->gt.timelines;
 	struct i915_timeline *tl;
 
-	if (!READ_ONCE(i915->gt.active_requests))
+	if (!atomic_read(&i915->gt.active_requests))
 		return timeout;
 
 	mutex_lock(&gt->mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index 5c073fe73664..bd13198a9058 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -77,9 +77,6 @@ struct drm_i915_private;
 
 #define I915_GEM_IDLE_TIMEOUT (HZ / 5)
 
-void i915_gem_park(struct drm_i915_private *i915);
-void i915_gem_unpark(struct drm_i915_private *i915);
-
 static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
 {
 	if (!atomic_fetch_inc(&t->count))
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 060f5903544a..20e835a7f116 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -38,7 +38,7 @@ I915_SELFTEST_DECLARE(static struct igt_evict_ctl {
 
 static bool ggtt_is_idle(struct drm_i915_private *i915)
 {
-	return !i915->gt.active_requests;
+	return !atomic_read(&i915->gt.active_requests);
 }
 
 static int ggtt_flush(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
index faa4eb42ec0a..6ecd9f8ac87d 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -14,8 +14,8 @@ static void __i915_gem_park(struct drm_i915_private *i915)
 
 	GEM_TRACE("\n");
 
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
+	lockdep_assert_held(&i915->gt.active_mutex);
+	GEM_BUG_ON(atomic_read(&i915->gt.active_requests));
 	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
 
 	if (!i915->gt.awake)
@@ -94,12 +94,13 @@ static void idle_work_handler(struct work_struct *work)
 	if (!READ_ONCE(i915->gt.awake))
 		return;
 
-	if (READ_ONCE(i915->gt.active_requests))
+	if (atomic_read(&i915->gt.active_requests))
 		return;
 
 	rearm_hangcheck =
 		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
 
+	/* XXX need to support lockless kernel_context before removing! */
 	if (!mutex_trylock(&i915->drm.struct_mutex)) {
 		/* Currently busy, come back later */
 		mod_delayed_work(i915->wq,
@@ -114,18 +115,19 @@ static void idle_work_handler(struct work_struct *work)
 	 * while we are idle (such as the GPU being power cycled), no users
 	 * will be harmed.
 	 */
+	mutex_lock(&i915->gt.active_mutex);
 	if (!work_pending(&i915->gt.idle_work.work) &&
-	    !i915->gt.active_requests) {
-		++i915->gt.active_requests; /* don't requeue idle */
+	    !atomic_read(&i915->gt.active_requests)) {
+		atomic_inc(&i915->gt.active_requests); /* don't requeue idle */
 
 		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
 
-		if (!--i915->gt.active_requests) {
+		if (atomic_dec_and_test(&i915->gt.active_requests)) {
 			__i915_gem_park(i915);
 			rearm_hangcheck = false;
 		}
 	}
-
+	mutex_unlock(&i915->gt.active_mutex);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 out_rearm:
@@ -137,26 +139,16 @@ static void idle_work_handler(struct work_struct *work)
 
 void i915_gem_park(struct drm_i915_private *i915)
 {
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
-
-	if (!i915->gt.awake)
-		return;
-
 	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
-	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
+	GEM_BUG_ON(!atomic_read(&i915->gt.active_requests));
+	if (atomic_dec_and_test(&i915->gt.active_requests))
+		mod_delayed_work(i915->wq,
+				 &i915->gt.idle_work,
+				 msecs_to_jiffies(100));
 }
 
-void i915_gem_unpark(struct drm_i915_private *i915)
+static void __i915_gem_unpark(struct drm_i915_private *i915)
 {
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(!i915->gt.active_requests);
-	assert_rpm_wakelock_held(i915);
-
 	if (i915->gt.awake)
 		return;
 
@@ -191,11 +183,29 @@ void i915_gem_unpark(struct drm_i915_private *i915)
 			   round_jiffies_up_relative(HZ));
 }
 
+void i915_gem_unpark(struct drm_i915_private *i915)
+{
+	if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
+		return;
+
+	mutex_lock(&i915->gt.active_mutex);
+	if (!atomic_read(&i915->gt.active_requests)) {
+		GEM_TRACE("\n");
+		__i915_gem_unpark(i915);
+		smp_mb__before_atomic();
+	}
+	atomic_inc(&i915->gt.active_requests);
+	mutex_unlock(&i915->gt.active_mutex);
+}
+
 bool i915_gem_load_power_context(struct drm_i915_private *i915)
 {
+	mutex_lock(&i915->gt.active_mutex);
+	atomic_inc(&i915->gt.active_requests);
+
 	/* Force loading the kernel context on all engines */
 	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
-		return false;
+		goto err_active;
 
 	/*
 	 * Immediately park the GPU so that we enable powersaving and
@@ -203,9 +213,20 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915)
 	 * unpark and start using the engine->pinned_default_state, otherwise
 	 * it is in limbo and an early reset may fail.
 	 */
+
+	if (!atomic_dec_and_test(&i915->gt.active_requests))
+		goto err_unlock;
+
 	__i915_gem_park(i915);
+	mutex_unlock(&i915->gt.active_mutex);
 
 	return true;
+
+err_active:
+	atomic_dec(&i915->gt.active_requests);
+err_unlock:
+	mutex_unlock(&i915->gt.active_mutex);
+	return false;
 }
 
 void i915_gem_suspend(struct drm_i915_private *i915)
@@ -337,5 +358,6 @@ void i915_gem_resume(struct drm_i915_private *i915)
 
 void i915_gem_init__pm(struct drm_i915_private *i915)
 {
+	mutex_init(&i915->gt.active_mutex);
 	INIT_DELAYED_WORK(&i915->gt.idle_work, idle_work_handler);
 }
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e9c2094ab8ea..8d396f3c747d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -31,6 +31,7 @@
 
 #include "i915_drv.h"
 #include "i915_active.h"
+#include "i915_gem_pm.h" /* XXX layering violation! */
 #include "i915_globals.h"
 #include "i915_reset.h"
 
@@ -130,19 +131,6 @@ i915_request_remove_from_client(struct i915_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static void reserve_gt(struct drm_i915_private *i915)
-{
-	if (!i915->gt.active_requests++)
-		i915_gem_unpark(i915);
-}
-
-static void unreserve_gt(struct drm_i915_private *i915)
-{
-	GEM_BUG_ON(!i915->gt.active_requests);
-	if (!--i915->gt.active_requests)
-		i915_gem_park(i915);
-}
-
 static void advance_ring(struct i915_request *request)
 {
 	struct intel_ring *ring = request->ring;
@@ -304,7 +292,7 @@ static void i915_request_retire(struct i915_request *request)
 
 	__retire_engine_upto(request->engine, request);
 
-	unreserve_gt(request->i915);
+	i915_gem_park(request->i915);
 
 	i915_sched_node_fini(&request->sched);
 	i915_request_put(request);
@@ -607,8 +595,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	u32 seqno;
 	int ret;
 
-	lockdep_assert_held(&i915->drm.struct_mutex);
-
 	/*
 	 * Preempt contexts are reserved for exclusive use to inject a
 	 * preemption context switch. They are never to be used for any trivial
@@ -633,7 +619,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
-	reserve_gt(i915);
+	i915_gem_unpark(i915);
 	mutex_lock(&ce->ring->timeline->mutex);
 
 	/* Move our oldest request to the slab-cache (if not in use!) */
@@ -766,7 +752,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	kmem_cache_free(global.slab_requests, rq);
 err_unreserve:
 	mutex_unlock(&ce->ring->timeline->mutex);
-	unreserve_gt(i915);
+	i915_gem_park(i915);
 	intel_context_unpin(ce);
 	return ERR_PTR(ret);
 }
@@ -1356,7 +1342,7 @@ void i915_retire_requests(struct drm_i915_private *i915)
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	if (!i915->gt.active_requests)
+	if (!atomic_read(&i915->gt.active_requests))
 		return;
 
 	list_for_each_entry_safe(ring, tmp,
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 45f73b8b4e6d..6ce366091e0b 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1646,9 +1646,9 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 				return err;
 		}
 
-		if (i915->gt.active_requests) {
+		if (atomic_read(&i915->gt.active_requests)) {
 			pr_err("%d active requests remain after switching to kernel context, pass %d (%s) on %s engine%s\n",
-			       i915->gt.active_requests,
+			       atomic_read(&i915->gt.active_requests),
 			       pass, from_idle ? "idle" : "busy",
 			       __engine_name(i915, engines),
 			       is_power_of_2(engines) ? "" : "s");
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 971148fbe6f5..c2b08fdf23cf 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -506,12 +506,7 @@ static void disable_retire_worker(struct drm_i915_private *i915)
 	i915_gem_shrinker_unregister(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
-	if (!i915->gt.active_requests++) {
-		intel_wakeref_t wakeref;
-
-		with_intel_runtime_pm(i915, wakeref)
-			i915_gem_unpark(i915);
-	}
+	i915_gem_unpark(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	cancel_delayed_work_sync(&i915->gt.retire_work);
@@ -616,10 +611,10 @@ static int igt_mmap_offset_exhaustion(void *arg)
 	drm_mm_remove_node(&resv);
 out_park:
 	mutex_lock(&i915->drm.struct_mutex);
-	if (--i915->gt.active_requests)
-		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
-	else
+	if (atomic_dec_and_test(&i915->gt.active_requests))
 		queue_delayed_work(i915->wq, &i915->gt.idle_work, 0);
+	else
+		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
 	mutex_unlock(&i915->drm.struct_mutex);
 	i915_gem_shrinker_register(i915);
 	return err;
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 60bbf8b4df40..7afc5ee8dda5 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -44,7 +44,7 @@ void mock_device_flush(struct drm_i915_private *i915)
 		mock_engine_flush(engine);
 
 	i915_retire_requests(i915);
-	GEM_BUG_ON(i915->gt.active_requests);
+	GEM_BUG_ON(atomic_read(&i915->gt.active_requests));
 }
 
 static void mock_device_release(struct drm_device *dev)
@@ -203,6 +203,7 @@ struct drm_i915_private *mock_gem_device(void)
 
 	i915_timelines_init(i915);
 
+	mutex_init(&i915->gt.active_mutex);
 	INIT_LIST_HEAD(&i915->gt.active_rings);
 	INIT_LIST_HEAD(&i915->gt.closed_vma);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (3 preceding siblings ...)
  2019-03-25  9:03 ` [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-04-01 15:34   ` Tvrtko Ursulin
  2019-03-25  9:03 ` [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
                   ` (20 subsequent siblings)
  25 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

Transition from calling the lower-level intel_runtime_pm functions to
using the GEM runtime-pm functions (i915_gem_unpark, i915_gem_park), now
that they are decoupled from struct_mutex. This has the small advantage
of reducing our overhead for request emission and ensures that GEM state
is held awake for the duration of the tests (to reduce interference).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
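The transformation applied throughout is mechanical; on a hypothetical
test body:

	/* before: raw runtime-pm wakeref, with a cookie to carry around */
	wakeref = intel_runtime_pm_get(i915);
	/* ... emit and wait upon requests ... */
	intel_runtime_pm_put(i915, wakeref);

	/* after: GEM-level wakeref, no cookie required */
	i915_gem_unpark(i915);
	/* ... emit and wait upon requests ... */
	i915_gem_park(i915);
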
 drivers/gpu/drm/i915/selftests/huge_pages.c   |  5 +-
 drivers/gpu/drm/i915/selftests/i915_active.c  | 11 ++---
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
 .../drm/i915/selftests/i915_gem_coherency.c   |  6 +--
 .../gpu/drm/i915/selftests/i915_gem_context.c | 49 ++++++++-----------
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  6 +--
 .../gpu/drm/i915/selftests/i915_gem_object.c  |  8 ++-
 drivers/gpu/drm/i915/selftests/i915_request.c | 33 ++++++-------
 .../gpu/drm/i915/selftests/i915_timeline.c    | 21 ++++----
 drivers/gpu/drm/i915/selftests/intel_guc.c    |  9 ++--
 .../gpu/drm/i915/selftests/intel_hangcheck.c  | 19 ++++---
 drivers/gpu/drm/i915/selftests/intel_lrc.c    | 41 +++++++---------
 .../drm/i915/selftests/intel_workarounds.c    | 23 ++++-----
 13 files changed, 105 insertions(+), 131 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 90721b54e7ae..1597a6e1f364 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -1753,7 +1753,6 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 	};
 	struct drm_file *file;
 	struct i915_gem_context *ctx;
-	intel_wakeref_t wakeref;
 	int err;
 
 	if (!HAS_PPGTT(dev_priv)) {
@@ -1769,7 +1768,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 		return PTR_ERR(file);
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(dev_priv);
+	i915_gem_unpark(dev_priv);
 
 	ctx = live_context(dev_priv, file);
 	if (IS_ERR(ctx)) {
@@ -1783,7 +1782,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 	err = i915_subtests(tests, ctx);
 
 out_unlock:
-	intel_runtime_pm_put(dev_priv, wakeref);
+	i915_gem_park(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	mock_file_free(dev_priv, file);
diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
index 27d8f853111b..42bcceba175c 100644
--- a/drivers/gpu/drm/i915/selftests/i915_active.c
+++ b/drivers/gpu/drm/i915/selftests/i915_active.c
@@ -5,6 +5,7 @@
  */
 
 #include "../i915_selftest.h"
+#include "../i915_gem_pm.h"
 
 #include "igt_flush_test.h"
 #include "lib_sw_fence.h"
@@ -89,13 +90,12 @@ static int live_active_wait(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct live_active active;
-	intel_wakeref_t wakeref;
 	int err;
 
 	/* Check that we get a callback when requests retire upon waiting */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	err = __live_active_setup(i915, &active);
 
@@ -109,8 +109,8 @@ static int live_active_wait(void *arg)
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		err = -EIO;
 
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -118,13 +118,12 @@ static int live_active_retire(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct live_active active;
-	intel_wakeref_t wakeref;
 	int err;
 
 	/* Check that we get a callback when requests are indirectly retired */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	err = __live_active_setup(i915, &active);
 
@@ -138,8 +137,8 @@ static int live_active_retire(void *arg)
 	}
 
 	i915_active_fini(&active.base);
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index 50bb7bbd26d3..7d79f1fe6bbd 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -16,10 +16,9 @@ static int switch_to_context(struct drm_i915_private *i915,
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = 0;
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
@@ -33,7 +32,7 @@ static int switch_to_context(struct drm_i915_private *i915,
 		i915_request_add(rq);
 	}
 
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
index e43630b40fce..497929238f02 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
@@ -279,7 +279,6 @@ static int igt_gem_coherency(void *arg)
 	struct drm_i915_private *i915 = arg;
 	const struct igt_coherency_mode *read, *write, *over;
 	struct drm_i915_gem_object *obj;
-	intel_wakeref_t wakeref;
 	unsigned long count, n;
 	u32 *offsets, *values;
 	int err = 0;
@@ -298,8 +297,9 @@ static int igt_gem_coherency(void *arg)
 
 	values = offsets + ncachelines;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
+
 	for (over = igt_coherency_mode; over->name; over++) {
 		if (!over->set)
 			continue;
@@ -377,8 +377,8 @@ static int igt_gem_coherency(void *arg)
 		}
 	}
 unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	kfree(offsets);
 	return err;
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 6ce366091e0b..b4039df633ec 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -24,6 +24,7 @@
 
 #include <linux/prime_numbers.h>
 
+#include "../i915_gem_pm.h"
 #include "../i915_reset.h"
 #include "../i915_selftest.h"
 #include "i915_random.h"
@@ -45,7 +46,6 @@ static int live_nop_switch(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_gem_context **ctx;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	struct drm_file *file;
 	unsigned long n;
@@ -66,8 +66,8 @@ static int live_nop_switch(void *arg)
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	ctx = kcalloc(nctx, sizeof(*ctx), GFP_KERNEL);
 	if (!ctx) {
@@ -170,8 +170,8 @@ static int live_nop_switch(void *arg)
 	}
 
 out_unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	mock_file_free(i915, file);
 	return err;
 }
@@ -514,6 +514,7 @@ static int igt_ctx_exec(void *arg)
 		if (IS_ERR(file))
 			return PTR_ERR(file);
 
+		i915_gem_unpark(i915);
 		mutex_lock(&i915->drm.struct_mutex);
 
 		err = igt_live_test_begin(&t, i915, __func__, engine->name);
@@ -525,7 +526,6 @@ static int igt_ctx_exec(void *arg)
 		dw = 0;
 		while (!time_after(jiffies, end_time)) {
 			struct i915_gem_context *ctx;
-			intel_wakeref_t wakeref;
 
 			ctx = live_context(i915, file);
 			if (IS_ERR(ctx)) {
@@ -541,8 +541,7 @@ static int igt_ctx_exec(void *arg)
 				}
 			}
 
-			with_intel_runtime_pm(i915, wakeref)
-				err = gpu_fill(obj, ctx, engine, dw);
+			err = gpu_fill(obj, ctx, engine, dw);
 			if (err) {
 				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
 				       ndwords, dw, max_dwords(obj),
@@ -579,6 +578,7 @@ static int igt_ctx_exec(void *arg)
 		if (igt_live_test_end(&t))
 			err = -EIO;
 		mutex_unlock(&i915->drm.struct_mutex);
+		i915_gem_park(i915);
 
 		mock_file_free(i915, file);
 		if (err)
@@ -610,6 +610,7 @@ static int igt_shared_ctx_exec(void *arg)
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
 
 	parent = live_context(i915, file);
@@ -641,7 +642,6 @@ static int igt_shared_ctx_exec(void *arg)
 		ncontexts = 0;
 		while (!time_after(jiffies, end_time)) {
 			struct i915_gem_context *ctx;
-			intel_wakeref_t wakeref;
 
 			ctx = kernel_context(i915);
 			if (IS_ERR(ctx)) {
@@ -660,9 +660,7 @@ static int igt_shared_ctx_exec(void *arg)
 				}
 			}
 
-			err = 0;
-			with_intel_runtime_pm(i915, wakeref)
-				err = gpu_fill(obj, ctx, engine, dw);
+			err = gpu_fill(obj, ctx, engine, dw);
 			if (err) {
 				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
 				       ndwords, dw, max_dwords(obj),
@@ -702,6 +700,7 @@ static int igt_shared_ctx_exec(void *arg)
 		err = -EIO;
 out_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	mock_file_free(i915, file);
 	return err;
@@ -1052,7 +1051,6 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 	struct drm_i915_gem_object *obj;
 	struct i915_gem_context *ctx;
 	struct intel_sseu pg_sseu;
-	intel_wakeref_t wakeref;
 	struct drm_file *file;
 	int ret;
 
@@ -1100,7 +1098,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 		goto out_unlock;
 	}
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 
 	/* First set the default mask. */
 	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
@@ -1128,7 +1126,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	i915_gem_object_put(obj);
 
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 
 out_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
@@ -1191,6 +1189,7 @@ static int igt_ctx_readonly(void *arg)
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
 
 	err = igt_live_test_begin(&t, i915, __func__, "");
@@ -1216,8 +1215,6 @@ static int igt_ctx_readonly(void *arg)
 		unsigned int id;
 
 		for_each_engine(engine, i915, id) {
-			intel_wakeref_t wakeref;
-
 			if (!intel_engine_can_store_dword(engine))
 				continue;
 
@@ -1232,9 +1229,7 @@ static int igt_ctx_readonly(void *arg)
 					i915_gem_object_set_readonly(obj);
 			}
 
-			err = 0;
-			with_intel_runtime_pm(i915, wakeref)
-				err = gpu_fill(obj, ctx, engine, dw);
+			err = gpu_fill(obj, ctx, engine, dw);
 			if (err) {
 				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
 				       ndwords, dw, max_dwords(obj),
@@ -1275,6 +1270,7 @@ static int igt_ctx_readonly(void *arg)
 	if (igt_live_test_end(&t))
 		err = -EIO;
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	mock_file_free(i915, file);
 	return err;
@@ -1491,7 +1487,6 @@ static int igt_vm_isolation(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct i915_gem_context *ctx_a, *ctx_b;
 	struct intel_engine_cs *engine;
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	struct drm_file *file;
 	I915_RND_STATE(prng);
@@ -1538,7 +1533,7 @@ static int igt_vm_isolation(void *arg)
 	GEM_BUG_ON(ctx_b->ppgtt->vm.total != vm_total);
 	vm_total -= I915_GTT_PAGE_SIZE;
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 
 	count = 0;
 	for_each_engine(engine, i915, id) {
@@ -1583,7 +1578,7 @@ static int igt_vm_isolation(void *arg)
 		count, RUNTIME_INFO(i915)->num_engines);
 
 out_rpm:
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 out_unlock:
 	if (igt_live_test_end(&t))
 		err = -EIO;
@@ -1622,6 +1617,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 		int err;
 
 		if (!from_idle) {
+			i915_gem_unpark(i915);
 			for_each_engine_masked(engine, i915, engines, tmp) {
 				struct i915_request *rq;
 
@@ -1631,6 +1627,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 
 				i915_request_add(rq);
 			}
+			i915_gem_park(i915);
 		}
 
 		err = i915_gem_switch_to_kernel_context(i915,
@@ -1674,7 +1671,6 @@ static int igt_switch_to_kernel_context(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err;
 
 	/*
@@ -1685,7 +1681,6 @@ static int igt_switch_to_kernel_context(void *arg)
 	 */
 
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	ctx = kernel_context(i915);
 	if (IS_ERR(ctx)) {
@@ -1708,7 +1703,6 @@ static int igt_switch_to_kernel_context(void *arg)
 out_unlock:
 	GEM_TRACE_DUMP_ON(err);
 
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	kernel_context_close(ctx);
@@ -1729,7 +1723,6 @@ static int mock_context_barrier(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
-	intel_wakeref_t wakeref;
 	unsigned int counter;
 	int err;
 
@@ -1738,6 +1731,7 @@ static int mock_context_barrier(void *arg)
 	 * a request; useful for retiring old state after loading new.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
 
 	ctx = mock_context(i915, "mock");
@@ -1772,9 +1766,7 @@ static int mock_context_barrier(void *arg)
 		goto out;
 	}
 
-	rq = ERR_PTR(-ENODEV);
-	with_intel_runtime_pm(i915, wakeref)
-		rq = i915_request_alloc(i915->engine[RCS0], ctx);
+	rq = i915_request_alloc(i915->engine[RCS0], ctx);
 	if (IS_ERR(rq)) {
 		pr_err("Request allocation failed!\n");
 		goto out;
@@ -1816,6 +1808,7 @@ static int mock_context_barrier(void *arg)
 	mock_context_close(ctx);
 unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 #undef pr_fmt
 #define pr_fmt(x) x
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 9a9451846b33..c0cf26507915 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -23,6 +23,7 @@
  */
 
 #include "../i915_selftest.h"
+#include "../i915_gem_pm.h"
 
 #include "lib_sw_fence.h"
 #include "mock_context.h"
@@ -377,7 +378,6 @@ static int igt_evict_contexts(void *arg)
 		struct drm_mm_node node;
 		struct reserved *next;
 	} *reserved = NULL;
-	intel_wakeref_t wakeref;
 	struct drm_mm_node hole;
 	unsigned long count;
 	int err;
@@ -396,8 +396,8 @@ static int igt_evict_contexts(void *arg)
 	if (!HAS_FULL_PPGTT(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	/* Reserve a block so that we know we have enough to fit a few rq */
 	memset(&hole, 0, sizeof(hole));
@@ -508,8 +508,8 @@ static int igt_evict_contexts(void *arg)
 	}
 	if (drm_mm_node_allocated(&hole))
 		drm_mm_remove_node(&hole);
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index c2b08fdf23cf..cd6590e01dec 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -576,8 +576,6 @@ static int igt_mmap_offset_exhaustion(void *arg)
 
 	/* Now fill with busy dead objects that we expect to reap */
 	for (loop = 0; loop < 3; loop++) {
-		intel_wakeref_t wakeref;
-
 		if (i915_terminally_wedged(i915))
 			break;
 
@@ -587,11 +585,11 @@ static int igt_mmap_offset_exhaustion(void *arg)
 			goto out;
 		}
 
-		err = 0;
+		i915_gem_unpark(i915);
 		mutex_lock(&i915->drm.struct_mutex);
-		with_intel_runtime_pm(i915, wakeref)
-			err = make_obj_busy(obj);
+		err = make_obj_busy(obj);
 		mutex_unlock(&i915->drm.struct_mutex);
+		i915_gem_park(i915);
 		if (err) {
 			pr_err("[loop %d] Failed to busy the object\n", loop);
 			goto err_obj;
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index e6ffe2240126..665cafa82390 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -505,15 +505,15 @@ int i915_request_mock_selftests(void)
 		SUBTEST(mock_breadcrumbs_smoketest),
 	};
 	struct drm_i915_private *i915;
-	intel_wakeref_t wakeref;
-	int err = 0;
+	int err;
 
 	i915 = mock_gem_device();
 	if (!i915)
 		return -ENOMEM;
 
-	with_intel_runtime_pm(i915, wakeref)
-		err = i915_subtests(tests, i915);
+	i915_gem_unpark(i915);
+	err = i915_subtests(tests, i915);
+	i915_gem_park(i915);
 
 	drm_dev_put(&i915->drm);
 
@@ -524,7 +524,6 @@ static int live_nop_request(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	unsigned int id;
 	int err = -ENODEV;
@@ -534,8 +533,8 @@ static int live_nop_request(void *arg)
 	 * the overhead of submitting requests to the hardware.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *request = NULL;
@@ -596,8 +595,8 @@ static int live_nop_request(void *arg)
 	}
 
 out_unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -669,7 +668,6 @@ static int live_empty_request(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	struct i915_vma *batch;
 	unsigned int id;
@@ -680,8 +678,8 @@ static int live_empty_request(void *arg)
 	 * the overhead of submitting requests to the hardware.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	batch = empty_batch(i915);
 	if (IS_ERR(batch)) {
@@ -745,8 +743,8 @@ static int live_empty_request(void *arg)
 	i915_vma_unpin(batch);
 	i915_vma_put(batch);
 out_unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -827,7 +825,6 @@ static int live_all_engines(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
 	struct i915_request *request[I915_NUM_ENGINES];
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	struct i915_vma *batch;
 	unsigned int id;
@@ -838,8 +835,8 @@ static int live_all_engines(void *arg)
 	 * block doing so, and that they don't complete too soon.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
@@ -922,8 +919,8 @@ static int live_all_engines(void *arg)
 	i915_vma_unpin(batch);
 	i915_vma_put(batch);
 out_unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -933,7 +930,6 @@ static int live_sequential_engines(void *arg)
 	struct i915_request *request[I915_NUM_ENGINES] = {};
 	struct i915_request *prev = NULL;
 	struct intel_engine_cs *engine;
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	unsigned int id;
 	int err;
@@ -944,8 +940,8 @@ static int live_sequential_engines(void *arg)
 	 * they are running on independent engines.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
@@ -1052,8 +1048,8 @@ static int live_sequential_engines(void *arg)
 		i915_request_put(request[id]);
 	}
 out_unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -1104,7 +1100,6 @@ static int live_breadcrumbs_smoketest(void *arg)
 	struct task_struct **threads;
 	struct igt_live_test live;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	struct drm_file *file;
 	unsigned int n;
 	int ret = 0;
@@ -1117,7 +1112,7 @@ static int live_breadcrumbs_smoketest(void *arg)
 	 * On real hardware this time.
 	 */
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 
 	file = mock_file(i915);
 	if (IS_ERR(file)) {
@@ -1224,7 +1219,7 @@ static int live_breadcrumbs_smoketest(void *arg)
 out_file:
 	mock_file_free(i915, file);
 out_rpm:
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
index 8e7bcaa1eb66..b04969ea74d3 100644
--- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
@@ -7,6 +7,7 @@
 #include <linux/prime_numbers.h>
 
 #include "../i915_selftest.h"
+#include "../i915_gem_pm.h"
 #include "i915_random.h"
 
 #include "igt_flush_test.h"
@@ -497,7 +498,6 @@ static int live_hwsp_engine(void *arg)
 	struct i915_timeline **timelines;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	unsigned long count, n;
 	int err = 0;
 
@@ -512,8 +512,8 @@ static int live_hwsp_engine(void *arg)
 	if (!timelines)
 		return -ENOMEM;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	count = 0;
 	for_each_engine(engine, i915, id) {
@@ -556,8 +556,8 @@ static int live_hwsp_engine(void *arg)
 		i915_timeline_put(tl);
 	}
 
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	kvfree(timelines);
 
@@ -572,7 +572,6 @@ static int live_hwsp_alternate(void *arg)
 	struct i915_timeline **timelines;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	unsigned long count, n;
 	int err = 0;
 
@@ -588,8 +587,8 @@ static int live_hwsp_alternate(void *arg)
 	if (!timelines)
 		return -ENOMEM;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	count = 0;
 	for (n = 0; n < NUM_TIMELINES; n++) {
@@ -632,8 +631,8 @@ static int live_hwsp_alternate(void *arg)
 		i915_timeline_put(tl);
 	}
 
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	kvfree(timelines);
 
@@ -647,7 +646,6 @@ static int live_hwsp_wrap(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_timeline *tl;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = 0;
 
 	/*
@@ -655,8 +653,8 @@ static int live_hwsp_wrap(void *arg)
 	 * foreign GPU references.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	tl = i915_timeline_create(i915, NULL);
 	if (IS_ERR(tl)) {
@@ -747,8 +745,8 @@ static int live_hwsp_wrap(void *arg)
 out_free:
 	i915_timeline_put(tl);
 out_rpm:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	return err;
 }
@@ -758,7 +756,6 @@ static int live_hwsp_recycle(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	unsigned long count;
 	int err = 0;
 
@@ -768,8 +765,8 @@ static int live_hwsp_recycle(void *arg)
 	 * want to confuse ourselves or the GPU.
 	 */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	count = 0;
 	for_each_engine(engine, i915, id) {
@@ -823,8 +820,8 @@ static int live_hwsp_recycle(void *arg)
 out:
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		err = -EIO;
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
index b05a21eaa8f4..e62073af4728 100644
--- a/drivers/gpu/drm/i915/selftests/intel_guc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
@@ -22,6 +22,7 @@
  *
  */
 
+#include "../i915_gem_pm.h"
 #include "../i915_selftest.h"
 
 /* max doorbell number + negative test for each client type */
@@ -137,13 +138,11 @@ static bool client_doorbell_in_sync(struct intel_guc_client *client)
 static int igt_guc_clients(void *args)
 {
 	struct drm_i915_private *dev_priv = args;
-	intel_wakeref_t wakeref;
 	struct intel_guc *guc;
 	int err = 0;
 
 	GEM_BUG_ON(!HAS_GUC(dev_priv));
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(dev_priv);
 
 	guc = &dev_priv->guc;
 	if (!guc) {
@@ -226,7 +225,6 @@ static int igt_guc_clients(void *args)
 	guc_clients_create(guc);
 	guc_clients_enable(guc);
 unlock:
-	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 	return err;
 }
@@ -239,14 +237,13 @@ static int igt_guc_clients(void *args)
 static int igt_guc_doorbells(void *arg)
 {
 	struct drm_i915_private *dev_priv = arg;
-	intel_wakeref_t wakeref;
 	struct intel_guc *guc;
 	int i, err = 0;
 	u16 db_id;
 
 	GEM_BUG_ON(!HAS_GUC(dev_priv));
+	i915_gem_unpark(dev_priv);
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(dev_priv);
 
 	guc = &dev_priv->guc;
 	if (!guc) {
@@ -339,8 +336,8 @@ static int igt_guc_doorbells(void *arg)
 			guc_client_free(clients[i]);
 		}
 unlock:
-	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
+	i915_gem_park(dev_priv);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 76b4fa150f2e..f6f417386b9f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -24,6 +24,7 @@
 
 #include <linux/kthread.h>
 
+#include "../i915_gem_pm.h"
 #include "../i915_selftest.h"
 #include "i915_random.h"
 #include "igt_flush_test.h"
@@ -86,6 +87,7 @@ static int hang_init(struct hang *h, struct drm_i915_private *i915)
 	}
 	h->batch = vaddr;
 
+	i915_gem_unpark(i915);
 	return 0;
 
 err_unpin_hws:
@@ -287,6 +289,7 @@ static void hang_fini(struct hang *h)
 	kernel_context_close(h->ctx);
 
 	igt_flush_test(h->i915, I915_WAIT_LOCKED);
+	i915_gem_park(h->i915);
 }
 
 static bool wait_until_running(struct hang *h, struct i915_request *rq)
@@ -422,7 +425,6 @@ static int igt_reset_nop(void *arg)
 	struct i915_gem_context *ctx;
 	unsigned int reset_count, count;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	struct drm_file *file;
 	IGT_TIMEOUT(end_time);
 	int err = 0;
@@ -442,7 +444,7 @@ static int igt_reset_nop(void *arg)
 	}
 
 	i915_gem_context_clear_bannable(ctx);
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 	reset_count = i915_reset_count(&i915->gpu_error);
 	count = 0;
 	do {
@@ -502,7 +504,7 @@ static int igt_reset_nop(void *arg)
 	err = igt_flush_test(i915, I915_WAIT_LOCKED);
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 
 out:
 	mock_file_free(i915, file);
@@ -517,7 +519,6 @@ static int igt_reset_nop_engine(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	struct drm_file *file;
 	int err = 0;
 
@@ -539,7 +540,7 @@ static int igt_reset_nop_engine(void *arg)
 	}
 
 	i915_gem_context_clear_bannable(ctx);
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 	for_each_engine(engine, i915, id) {
 		unsigned int reset_count, reset_engine_count;
 		unsigned int count;
@@ -623,7 +624,7 @@ static int igt_reset_nop_engine(void *arg)
 	err = igt_flush_test(i915, I915_WAIT_LOCKED);
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 out:
 	mock_file_free(i915, file);
 	if (i915_reset_failed(i915))
@@ -651,6 +652,7 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 			return err;
 	}
 
+	i915_gem_unpark(i915);
 	for_each_engine(engine, i915, id) {
 		unsigned int reset_count, reset_engine_count;
 		IGT_TIMEOUT(end_time);
@@ -744,6 +746,7 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 		if (err)
 			break;
 	}
+	i915_gem_park(i915);
 
 	if (i915_reset_failed(i915))
 		err = -EIO;
@@ -829,6 +832,7 @@ static int active_engine(void *data)
 		}
 	}
 
+	i915_gem_unpark(engine->i915);
 	while (!kthread_should_stop()) {
 		unsigned int idx = count++ & (ARRAY_SIZE(rq) - 1);
 		struct i915_request *old = rq[idx];
@@ -856,6 +860,7 @@ static int active_engine(void *data)
 
 		cond_resched();
 	}
+	i915_gem_park(engine->i915);
 
 	for (count = 0; count < ARRAY_SIZE(rq); count++) {
 		int err__ = active_request_put(rq[count]);
@@ -897,6 +902,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 			h.ctx->sched.priority = 1024;
 	}
 
+	i915_gem_unpark(i915);
 	for_each_engine(engine, i915, id) {
 		struct active_engine threads[I915_NUM_ENGINES] = {};
 		unsigned long global = i915_reset_count(&i915->gpu_error);
@@ -1073,6 +1079,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 		if (err)
 			break;
 	}
+	i915_gem_park(i915);
 
 	if (i915_reset_failed(i915))
 		err = -EIO;
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 0d3cae564db8..45370922d965 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -6,6 +6,7 @@
 
 #include <linux/prime_numbers.h>
 
+#include "../i915_gem_pm.h"
 #include "../i915_reset.h"
 
 #include "../i915_selftest.h"
@@ -23,14 +24,13 @@ static int live_sanitycheck(void *arg)
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
 	struct igt_spinner spin;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin, i915))
 		goto err_unlock;
@@ -71,8 +71,8 @@ static int live_sanitycheck(void *arg)
 	igt_spinner_fini(&spin);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -83,7 +83,6 @@ static int live_preempt(void *arg)
 	struct igt_spinner spin_hi, spin_lo;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
@@ -92,8 +91,8 @@ static int live_preempt(void *arg)
 	if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
 		pr_err("Logical preemption supported, but not exposed\n");
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin_hi, i915))
 		goto err_unlock;
@@ -178,8 +177,8 @@ static int live_preempt(void *arg)
 	igt_spinner_fini(&spin_hi);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -191,14 +190,13 @@ static int live_late_preempt(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_sched_attr attr = {};
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin_hi, i915))
 		goto err_unlock;
@@ -282,8 +280,8 @@ static int live_late_preempt(void *arg)
 	igt_spinner_fini(&spin_hi);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 
 err_wedged:
@@ -331,7 +329,6 @@ static int live_suppress_self_preempt(void *arg)
 	};
 	struct preempt_client a, b;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	/*
@@ -347,8 +344,8 @@ static int live_suppress_self_preempt(void *arg)
 	if (USES_GUC_SUBMISSION(i915))
 		return 0; /* presume black blox */
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (preempt_client_init(i915, &a))
 		goto err_unlock;
@@ -422,8 +419,8 @@ static int live_suppress_self_preempt(void *arg)
 err_unlock:
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		err = -EIO;
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 
 err_wedged:
@@ -480,7 +477,6 @@ static int live_suppress_wait_preempt(void *arg)
 	struct preempt_client client[4];
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 	int i;
 
@@ -493,8 +489,8 @@ static int live_suppress_wait_preempt(void *arg)
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
 		goto err_unlock;
@@ -587,8 +583,8 @@ static int live_suppress_wait_preempt(void *arg)
 err_unlock:
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		err = -EIO;
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 
 err_wedged:
@@ -605,7 +601,6 @@ static int live_chain_preempt(void *arg)
 	struct intel_engine_cs *engine;
 	struct preempt_client hi, lo;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	/*
@@ -617,8 +612,8 @@ static int live_chain_preempt(void *arg)
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (preempt_client_init(i915, &hi))
 		goto err_unlock;
@@ -735,8 +730,8 @@ static int live_chain_preempt(void *arg)
 err_unlock:
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		err = -EIO;
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 
 err_wedged:
@@ -754,7 +749,6 @@ static int live_preempt_hang(void *arg)
 	struct igt_spinner spin_hi, spin_lo;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
@@ -763,8 +757,8 @@ static int live_preempt_hang(void *arg)
 	if (!intel_has_reset_engine(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin_hi, i915))
 		goto err_unlock;
@@ -859,8 +853,8 @@ static int live_preempt_hang(void *arg)
 	igt_spinner_fini(&spin_hi);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -1047,7 +1041,6 @@ static int live_preempt_smoke(void *arg)
 		.ncontext = 1024,
 	};
 	const unsigned int phase[] = { 0, BATCH };
-	intel_wakeref_t wakeref;
 	struct igt_live_test t;
 	int err = -ENOMEM;
 	u32 *cs;
@@ -1062,8 +1055,8 @@ static int live_preempt_smoke(void *arg)
 	if (!smoke.contexts)
 		return -ENOMEM;
 
+	i915_gem_unpark(smoke.i915);
 	mutex_lock(&smoke.i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(smoke.i915);
 
 	smoke.batch = i915_gem_object_create_internal(smoke.i915, PAGE_SIZE);
 	if (IS_ERR(smoke.batch)) {
@@ -1116,8 +1109,8 @@ static int live_preempt_smoke(void *arg)
 err_batch:
 	i915_gem_object_put(smoke.batch);
 err_unlock:
-	intel_runtime_pm_put(smoke.i915, wakeref);
 	mutex_unlock(&smoke.i915->drm.struct_mutex);
+	i915_gem_park(smoke.i915);
 	kfree(smoke.contexts);
 
 	return err;
diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
index 3baed59008d7..0e42e1a0b46c 100644
--- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
@@ -5,6 +5,7 @@
  */
 
 #include "../i915_selftest.h"
+#include "../i915_gem_pm.h"
 #include "../i915_reset.h"
 
 #include "igt_flush_test.h"
@@ -238,7 +239,6 @@ switch_to_scratch_context(struct intel_engine_cs *engine,
 {
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
-	intel_wakeref_t wakeref;
 	int err = 0;
 
 	ctx = kernel_context(engine->i915);
@@ -247,9 +247,9 @@ switch_to_scratch_context(struct intel_engine_cs *engine,
 
 	GEM_BUG_ON(i915_gem_context_is_bannable(ctx));
 
-	rq = ERR_PTR(-ENODEV);
-	with_intel_runtime_pm(engine->i915, wakeref)
-		rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP);
+	i915_gem_unpark(engine->i915);
+	rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP);
+	i915_gem_park(engine->i915);
 
 	kernel_context_close(ctx);
 
@@ -666,7 +666,6 @@ static int live_dirty_whitelist(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 	struct drm_file *file;
 	int err = 0;
 
@@ -675,7 +674,7 @@ static int live_dirty_whitelist(void *arg)
 	if (INTEL_GEN(i915) < 7) /* minimum requirement for LRI, SRM, LRM */
 		return 0;
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 
 	mutex_unlock(&i915->drm.struct_mutex);
 	file = mock_file(i915);
@@ -705,7 +704,7 @@ static int live_dirty_whitelist(void *arg)
 	mock_file_free(i915, file);
 	mutex_lock(&i915->drm.struct_mutex);
 out_rpm:
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 	return err;
 }
 
@@ -762,7 +761,6 @@ static int
 live_gpu_reset_gt_engine_workarounds(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
-	intel_wakeref_t wakeref;
 	struct wa_lists lists;
 	bool ok;
 
@@ -771,8 +769,8 @@ live_gpu_reset_gt_engine_workarounds(void *arg)
 
 	pr_info("Verifying after GPU reset...\n");
 
+	i915_gem_unpark(i915);
 	igt_global_reset_lock(i915);
-	wakeref = intel_runtime_pm_get(i915);
 
 	reference_lists_init(i915, &lists);
 
@@ -786,8 +784,8 @@ live_gpu_reset_gt_engine_workarounds(void *arg)
 
 out:
 	reference_lists_fini(i915, &lists);
-	intel_runtime_pm_put(i915, wakeref);
 	igt_global_reset_unlock(i915);
+	i915_gem_park(i915);
 
 	return ok ? 0 : -ESRCH;
 }
@@ -801,7 +799,6 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 	struct igt_spinner spin;
 	enum intel_engine_id id;
 	struct i915_request *rq;
-	intel_wakeref_t wakeref;
 	struct wa_lists lists;
 	int ret = 0;
 
@@ -812,8 +809,8 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	i915_gem_unpark(i915);
 	igt_global_reset_lock(i915);
-	wakeref = intel_runtime_pm_get(i915);
 
 	reference_lists_init(i915, &lists);
 
@@ -870,8 +867,8 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 
 err:
 	reference_lists_fini(i915, &lists);
-	intel_runtime_pm_put(i915, wakeref);
 	igt_global_reset_unlock(i915);
+	i915_gem_park(i915);
 	kernel_context_close(ctx);
 
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (4 preceding siblings ...)
  2019-03-25  9:03 ` [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-04-02 13:17   ` Tvrtko Ursulin
  2019-03-25  9:03 ` [PATCH 07/22] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

Start acquiring the logical intel_context and using that as our primary
means for request allocation. This is the initial step to allow us to
avoid requiring struct_mutex for request allocation alongside the
perma-pinned kernel context, but it also provides a foundation for
breaking up the complex request allocation to handle the different
scenarios inside execbuf.
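
As a rough sketch of the calling convention after this patch
(illustrative only; the engine pointer and error handling are
placeholders, condensed from the restart_work() and selftest hunks
below):

	struct i915_request *rq;

	i915_gem_unpark(i915);	/* i915_request_create() asserts gt.awake */

	rq = i915_request_create(engine->kernel_context);
	if (IS_ERR(rq)) {
		i915_gem_park(i915);
		return PTR_ERR(rq);
	}

	/* ... emit commands; the timeline mutex taken in create is
	 * held until i915_request_add() ... */

	i915_request_add(rq);
	i915_gem_park(i915);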

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c       |  20 +--
 drivers/gpu/drm/i915/i915_perf.c              |  22 ++--
 drivers/gpu/drm/i915/i915_request.c           | 118 ++++++++++--------
 drivers/gpu/drm/i915/i915_request.h           |   3 +
 drivers/gpu/drm/i915/i915_reset.c             |   8 +-
 drivers/gpu/drm/i915/intel_overlay.c          |  28 +++--
 drivers/gpu/drm/i915/selftests/i915_active.c  |   2 +-
 .../drm/i915/selftests/i915_gem_coherency.c   |   2 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  |   9 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |   9 +-
 .../gpu/drm/i915/selftests/i915_timeline.c    |   4 +-
 11 files changed, 135 insertions(+), 90 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 25f267a03d3d..6a452345ffdb 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -88,6 +88,7 @@
 #include <linux/log2.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
+#include "i915_gem_pm.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -863,7 +864,6 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	struct drm_i915_private *i915 = ctx->i915;
 	struct context_barrier_task *cb;
 	struct intel_context *ce, *next;
-	intel_wakeref_t wakeref;
 	int err = 0;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -876,7 +876,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	i915_active_init(i915, &cb->base, cb_retire);
 	i915_active_acquire(&cb->base);
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
 		struct intel_engine_cs *engine = ce->engine;
 		struct i915_request *rq;
@@ -890,7 +890,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 			break;
 		}
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = i915_request_create(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			break;
@@ -906,7 +906,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 		if (err)
 			break;
 	}
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 
 	cb->task = err ? NULL : task; /* caller needs to unwind instead */
 	cb->data = data;
@@ -930,11 +930,12 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
 	if (i915_terminally_wedged(i915))
 		return 0;
 
+	i915_gem_unpark(i915);
 	for_each_engine_masked(engine, i915, mask, mask) {
 		struct intel_ring *ring;
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (IS_ERR(rq))
 			return PTR_ERR(rq);
 
@@ -960,6 +961,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
 
 		i915_request_add(rq);
 	}
+	i915_gem_park(i915);
 
 	return 0;
 }
@@ -1158,7 +1160,6 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 {
 	struct drm_i915_private *i915 = ce->engine->i915;
 	struct i915_request *rq, *prev;
-	intel_wakeref_t wakeref;
 	int ret;
 
 	lockdep_assert_held(&ce->pin_mutex);
@@ -1173,9 +1174,9 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 		return 0;
 
 	/* Submitting requests etc needs the hw awake. */
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 
-	rq = i915_request_alloc(ce->engine, i915->kernel_context);
+	rq = i915_request_create(ce->engine->kernel_context);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 		goto out_put;
@@ -1213,8 +1214,7 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 out_add:
 	i915_request_add(rq);
 out_put:
-	intel_runtime_pm_put(i915, wakeref);
-
+	i915_gem_park(i915);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 85c5cb779297..fe7267da52e5 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -196,6 +196,7 @@
 #include <linux/uuid.h>
 
 #include "i915_drv.h"
+#include "i915_gem_pm.h"
 #include "i915_oa_hsw.h"
 #include "i915_oa_bdw.h"
 #include "i915_oa_chv.h"
@@ -1716,6 +1717,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	int ret;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+	i915_gem_unpark(dev_priv);
 
 	/*
 	 * The OA register config is setup through the context image. This image
@@ -1734,7 +1736,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 				     I915_WAIT_LOCKED,
 				     MAX_SCHEDULE_TIMEOUT);
 	if (ret)
-		return ret;
+		goto out_pm;
 
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
@@ -1746,8 +1748,10 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 			continue;
 
 		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
-		if (IS_ERR(regs))
-			return PTR_ERR(regs);
+		if (IS_ERR(regs)) {
+			ret = PTR_ERR(regs);
+			goto out_pm;
+		}
 
 		ce->state->obj->mm.dirty = true;
 		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
@@ -1761,13 +1765,17 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	 * Apply the configuration by doing one context restore of the edited
 	 * context image.
 	 */
-	rq = i915_request_alloc(engine, dev_priv->kernel_context);
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
+	rq = i915_request_create(engine->kernel_context);
+	if (IS_ERR(rq)) {
+		ret = PTR_ERR(rq);
+		goto out_pm;
+	}
 
 	i915_request_add(rq);
 
-	return 0;
+out_pm:
+	i915_gem_park(dev_priv);
+	return ret;
 }
 
 static int gen8_enable_metric_set(struct i915_perf_stream *stream)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 8d396f3c747d..fd24f576ca61 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -576,51 +576,19 @@ static int add_timeline_barrier(struct i915_request *rq)
 	return i915_request_await_active_request(rq, &rq->timeline->barrier);
 }
 
-/**
- * i915_request_alloc - allocate a request structure
- *
- * @engine: engine that we wish to issue the request on.
- * @ctx: context that the request will be associated with.
- *
- * Returns a pointer to the allocated request if successful,
- * or an error code if not.
- */
-struct i915_request *
-i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
+struct i915_request *i915_request_create(struct intel_context *ce)
 {
-	struct drm_i915_private *i915 = engine->i915;
-	struct intel_context *ce;
-	struct i915_timeline *tl;
+	struct drm_i915_private *i915 = ce->engine->i915;
+	struct i915_timeline *tl = ce->ring->timeline;
 	struct i915_request *rq;
 	u32 seqno;
 	int ret;
 
-	/*
-	 * Preempt contexts are reserved for exclusive use to inject a
-	 * preemption context switch. They are never to be used for any trivial
-	 * request!
-	 */
-	GEM_BUG_ON(ctx == i915->preempt_context);
+	GEM_BUG_ON(!i915->gt.awake);
 
-	/*
-	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged.
-	 */
-	ret = i915_terminally_wedged(i915);
-	if (ret)
-		return ERR_PTR(ret);
-
-	/*
-	 * Pinning the contexts may generate requests in order to acquire
-	 * GGTT space, so do this first before we reserve a seqno for
-	 * ourselves.
-	 */
-	ce = intel_context_pin(ctx, engine);
-	if (IS_ERR(ce))
-		return ERR_CAST(ce);
-
-	i915_gem_unpark(i915);
-	mutex_lock(&ce->ring->timeline->mutex);
+	/* Check that the caller provided an already pinned context */
+	__intel_context_pin(ce);
+	mutex_lock(&tl->mutex);
 
 	/* Move our oldest request to the slab-cache (if not in use!) */
 	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
@@ -670,18 +638,17 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	INIT_LIST_HEAD(&rq->active_list);
 	INIT_LIST_HEAD(&rq->execute_cb);
 
-	tl = ce->ring->timeline;
 	ret = i915_timeline_get_seqno(tl, rq, &seqno);
 	if (ret)
 		goto err_free;
 
 	rq->i915 = i915;
-	rq->engine = engine;
-	rq->gem_context = ctx;
 	rq->hw_context = ce;
+	rq->gem_context = ce->gem_context;
+	rq->engine = ce->engine;
 	rq->ring = ce->ring;
 	rq->timeline = tl;
-	GEM_BUG_ON(rq->timeline == &engine->timeline);
+	GEM_BUG_ON(rq->timeline == &ce->engine->timeline);
 	rq->hwsp_seqno = tl->hwsp_seqno;
 	rq->hwsp_cacheline = tl->hwsp_cacheline;
 	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
@@ -713,7 +680,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * around inside i915_request_add() there is sufficient space at
 	 * the beginning of the ring as well.
 	 */
-	rq->reserved_space = 2 * engine->emit_fini_breadcrumb_dw * sizeof(u32);
+	rq->reserved_space =
+		2 * rq->engine->emit_fini_breadcrumb_dw * sizeof(u32);
 
 	/*
 	 * Record the position of the start of the request so that
@@ -727,17 +695,19 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	if (ret)
 		goto err_unwind;
 
-	ret = engine->request_alloc(rq);
+	ret = rq->engine->request_alloc(rq);
 	if (ret)
 		goto err_unwind;
 
+	rq->infix = rq->ring->emit; /* end of header; start of user payload */
+
 	/* Keep a second pin for the dual retirement along engine and ring */
 	__intel_context_pin(ce);
-
-	rq->infix = rq->ring->emit; /* end of header; start of user payload */
+	atomic_inc(&i915->gt.active_requests);
 
 	/* Check that we didn't interrupt ourselves with a new request */
 	GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
+	lockdep_assert_held(&tl->mutex);
 	return rq;
 
 err_unwind:
@@ -751,12 +721,62 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 err_free:
 	kmem_cache_free(global.slab_requests, rq);
 err_unreserve:
-	mutex_unlock(&ce->ring->timeline->mutex);
-	i915_gem_park(i915);
+	mutex_unlock(&tl->mutex);
 	intel_context_unpin(ce);
 	return ERR_PTR(ret);
 }
 
+/**
+ * i915_request_alloc - allocate a request structure
+ *
+ * @engine: engine that we wish to issue the request on.
+ * @ctx: context that the request will be associated with.
+ *
+ * Returns a pointer to the allocated request if successful,
+ * or an error code if not.
+ */
+struct i915_request *
+i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
+{
+	struct drm_i915_private *i915 = engine->i915;
+	struct intel_context *ce;
+	struct i915_request *rq;
+	int ret;
+
+	/*
+	 * Preempt contexts are reserved for exclusive use to inject a
+	 * preemption context switch. They are never to be used for any trivial
+	 * request!
+	 */
+	GEM_BUG_ON(ctx == i915->preempt_context);
+
+	/*
+	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
+	 * EIO if the GPU is already wedged.
+	 */
+	ret = i915_terminally_wedged(i915);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	ce = intel_context_pin(ctx, engine);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
+
+	i915_gem_unpark(i915);
+
+	rq = i915_request_create(ce);
+
+	i915_gem_park(i915);
+	intel_context_unpin(ce);
+
+	return rq;
+}
+
 static int
 emit_semaphore_wait(struct i915_request *to,
 		    struct i915_request *from,
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index cd6c130964cd..37f84ad067da 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -228,6 +228,9 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence)
 	return fence->ops == &i915_fence_ops;
 }
 
+struct i915_request * __must_check
+i915_request_create(struct intel_context *ce);
+
 struct i915_request * __must_check
 i915_request_alloc(struct intel_engine_cs *engine,
 		   struct i915_gem_context *ctx);
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index 0aea19cefe4a..e1f2cf64639a 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -8,6 +8,7 @@
 #include <linux/stop_machine.h>
 
 #include "i915_drv.h"
+#include "i915_gem_pm.h"
 #include "i915_gpu_error.h"
 #include "i915_reset.h"
 
@@ -727,9 +728,8 @@ static void restart_work(struct work_struct *work)
 	struct drm_i915_private *i915 = arg->i915;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
 
-	wakeref = intel_runtime_pm_get(i915);
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
 	WRITE_ONCE(i915->gpu_error.restart, NULL);
 
@@ -744,13 +744,13 @@ static void restart_work(struct work_struct *work)
 		if (!intel_engine_is_idle(engine))
 			continue;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (!IS_ERR(rq))
 			i915_request_add(rq);
 	}
 
 	mutex_unlock(&i915->drm.struct_mutex);
-	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_park(i915);
 
 	kfree(arg);
 }
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index a882b8d42bd9..e23a91eb9149 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -29,6 +29,7 @@
 #include <drm/drm_fourcc.h>
 
 #include "i915_drv.h"
+#include "i915_gem_pm.h"
 #include "i915_reg.h"
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
@@ -235,10 +236,9 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 
 static struct i915_request *alloc_request(struct intel_overlay *overlay)
 {
-	struct drm_i915_private *dev_priv = overlay->i915;
-	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
+	struct intel_engine_cs *engine = overlay->i915->engine[RCS0];
 
-	return i915_request_alloc(engine, dev_priv->kernel_context);
+	return i915_request_create(engine->kernel_context);
 }
 
 /* overlay needs to be disable in OCMD reg */
@@ -247,17 +247,21 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 	struct drm_i915_private *dev_priv = overlay->i915;
 	struct i915_request *rq;
 	u32 *cs;
+	int err;
 
 	WARN_ON(overlay->active);
+	i915_gem_unpark(dev_priv);
 
 	rq = alloc_request(overlay);
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_pm;
+	}
 
 	cs = intel_ring_begin(rq, 4);
 	if (IS_ERR(cs)) {
-		i915_request_add(rq);
-		return PTR_ERR(cs);
+		err = PTR_ERR(cs);
+		goto err_rq;
 	}
 
 	overlay->active = true;
@@ -272,6 +276,12 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 	intel_ring_advance(rq, cs);
 
 	return intel_overlay_do_wait_request(overlay, rq, NULL);
+
+err_rq:
+	i915_request_add(rq);
+err_pm:
+	i915_gem_park(dev_priv);
+	return err;
 }
 
 static void intel_overlay_flip_prepare(struct intel_overlay *overlay,
@@ -376,6 +386,8 @@ static void intel_overlay_off_tail(struct i915_active_request *active,
 
 	if (IS_I830(dev_priv))
 		i830_overlay_clock_gating(dev_priv, true);
+
+	i915_gem_park(dev_priv);
 }
 
 /* overlay needs to be disabled in OCMD reg */
@@ -485,6 +497,8 @@ void intel_overlay_reset(struct drm_i915_private *dev_priv)
 	overlay->old_yscale = 0;
 	overlay->crtc = NULL;
 	overlay->active = false;
+
+	i915_gem_park(dev_priv);
 }
 
 static int packed_depth_bytes(u32 format)
diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
index 42bcceba175c..b10308c20e7d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_active.c
+++ b/drivers/gpu/drm/i915/selftests/i915_active.c
@@ -47,7 +47,7 @@ static int __live_active_setup(struct drm_i915_private *i915,
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			break;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
index 497929238f02..b926f40bb01a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
@@ -202,7 +202,7 @@ static int gpu_set(struct drm_i915_gem_object *obj,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
+	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
 	if (IS_ERR(rq)) {
 		i915_vma_unpin(vma);
 		return PTR_ERR(rq);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index cd6590e01dec..e5166f9b04c1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -308,7 +308,6 @@ static int igt_partial_tiling(void *arg)
 	const unsigned int nreal = 1 << 12; /* largest tile row x2 */
 	struct drm_i915_private *i915 = arg;
 	struct drm_i915_gem_object *obj;
-	intel_wakeref_t wakeref;
 	int tiling;
 	int err;
 
@@ -333,8 +332,8 @@ static int igt_partial_tiling(void *arg)
 		goto out;
 	}
 
+	i915_gem_unpark(i915);
 	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (1) {
 		IGT_TIMEOUT(end);
@@ -445,8 +444,8 @@ next_tiling: ;
 	}
 
 out_unlock:
-	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
 	i915_gem_object_unpin_pages(obj);
 out:
 	i915_gem_object_put(obj);
@@ -468,7 +467,9 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
 	if (err)
 		return err;
 
-	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
+	i915_gem_unpark(i915);
+	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
+	i915_gem_park(i915);
 	if (IS_ERR(rq)) {
 		i915_vma_unpin(vma);
 		return PTR_ERR(rq);
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 665cafa82390..03bea3caaafe 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -550,8 +550,7 @@ static int live_nop_request(void *arg)
 			times[1] = ktime_get_raw();
 
 			for (n = 0; n < prime; n++) {
-				request = i915_request_alloc(engine,
-							     i915->kernel_context);
+				request = i915_request_create(engine->kernel_context);
 				if (IS_ERR(request)) {
 					err = PTR_ERR(request);
 					goto out_unlock;
@@ -648,7 +647,7 @@ empty_request(struct intel_engine_cs *engine,
 	struct i915_request *request;
 	int err;
 
-	request = i915_request_alloc(engine, engine->i915->kernel_context);
+	request = i915_request_create(engine->kernel_context);
 	if (IS_ERR(request))
 		return request;
 
@@ -850,7 +849,7 @@ static int live_all_engines(void *arg)
 	}
 
 	for_each_engine(engine, i915, id) {
-		request[id] = i915_request_alloc(engine, i915->kernel_context);
+		request[id] = i915_request_create(engine->kernel_context);
 		if (IS_ERR(request[id])) {
 			err = PTR_ERR(request[id]);
 			pr_err("%s: Request allocation failed with err=%d\n",
@@ -958,7 +957,7 @@ static int live_sequential_engines(void *arg)
 			goto out_unlock;
 		}
 
-		request[id] = i915_request_alloc(engine, i915->kernel_context);
+		request[id] = i915_request_create(engine->kernel_context);
 		if (IS_ERR(request[id])) {
 			err = PTR_ERR(request[id]);
 			pr_err("%s: Request allocation failed for %s with err=%d\n",
diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
index b04969ea74d3..77f6c8c66568 100644
--- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
@@ -455,7 +455,7 @@ tl_write(struct i915_timeline *tl, struct intel_engine_cs *engine, u32 value)
 		goto out;
 	}
 
-	rq = i915_request_alloc(engine, engine->i915->kernel_context);
+	rq = i915_request_create(engine->kernel_context);
 	if (IS_ERR(rq))
 		goto out_unpin;
 
@@ -676,7 +676,7 @@ static int live_hwsp_wrap(void *arg)
 		if (!intel_engine_can_store_dword(engine))
 			continue;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			goto out;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/22] drm/i915/gvt: Pin the per-engine GVT shadow contexts
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (5 preceding siblings ...)
  2019-03-25  9:03 ` [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-03-25  9:03 ` [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

Our eventual goal is to rid request construction of struct_mutex, with
the short-term step of lifting the struct_mutex requirements into the
higher levels (i.e. the caller must ensure that the context is already
pinned into the GTT). In this patch, we pin GVT's shadow contexts upon
allocation and so keep them pinned into the GGTT for as long as the
virtual machine is alive, and so we can use the simpler request
construction path, safe in the knowledge that the hard work is already
done.
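
The resulting lifetime rule can be sketched as follows (illustrative
only, condensed from the setup/teardown hunks below):

	/* setup: pin one shadow context per engine for the VM's lifetime */
	for_each_engine(engine, gvt->dev_priv, id) {
		struct intel_context *ce;

		ce = intel_context_pin(ctx, engine);
		if (IS_ERR(ce))
			goto out_shadow_ctx;

		s->shadow[id] = ce;
	}

	/* request construction may now assume an already-pinned context */
	rq = i915_request_create(s->shadow[ring_id]);

	/* teardown: unpin only when the virtual machine is destroyed */
	for_each_engine(engine, gvt->dev_priv, id)
		intel_context_unpin(s->shadow[id]);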

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gvt/gvt.h          |   2 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c        |   2 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c |   3 +-
 drivers/gpu/drm/i915/gvt/scheduler.c    | 138 ++++++++++++------------
 4 files changed, 75 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 8bce09de4b82..1077f525d91d 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -152,9 +152,9 @@ struct intel_vgpu_submission_ops {
 struct intel_vgpu_submission {
 	struct intel_vgpu_execlist execlist[I915_NUM_ENGINES];
 	struct list_head workload_q_head[I915_NUM_ENGINES];
+	struct intel_context *shadow[I915_NUM_ENGINES];
 	struct kmem_cache *workloads;
 	atomic_t running_workload_num;
-	struct i915_gem_context *shadow_ctx;
 	union {
 		u64 i915_context_pml4;
 		u64 i915_context_pdps[GEN8_3LVL_PDPES];
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index d5fcc447d22f..6fd635e3d29e 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1576,7 +1576,7 @@ hw_id_show(struct device *dev, struct device_attribute *attr,
 		struct intel_vgpu *vgpu = (struct intel_vgpu *)
 			mdev_get_drvdata(mdev);
 		return sprintf(buf, "%u\n",
-			       vgpu->submission.shadow_ctx->hw_id);
+			       vgpu->submission.shadow[0]->gem_context->hw_id);
 	}
 	return sprintf(buf, "\n");
 }
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index a00a807a1d55..ef6667fbbc17 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -494,8 +494,7 @@ static void switch_mmio(struct intel_vgpu *pre,
 			 * itself.
 			 */
 			if (mmio->in_context &&
-			    !is_inhibit_context(intel_context_lookup(s->shadow_ctx,
-								     dev_priv->engine[ring_id])))
+			    !is_inhibit_context(s->shadow[ring_id]))
 				continue;
 
 			if (mmio->mask)
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index bf743b47b59d..2d2dafd22a18 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -36,6 +36,7 @@
 #include <linux/kthread.h>
 
 #include "i915_drv.h"
+#include "i915_gem_pm.h"
 #include "gvt.h"
 
 #define RING_CTX_OFF(x) \
@@ -277,18 +278,23 @@ static int shadow_context_status_change(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
-static void shadow_context_descriptor_update(struct intel_context *ce)
+static void
+shadow_context_descriptor_update(struct intel_context *ce,
+				 struct intel_vgpu_workload *workload)
 {
-	u64 desc = 0;
-
-	desc = ce->lrc_desc;
+	u64 desc = ce->lrc_desc;
 
-	/* Update bits 0-11 of the context descriptor which includes flags
+	/*
+	 * Update bits 0-11 of the context descriptor which includes flags
 	 * like GEN8_CTX_* cached in desc_template
 	 */
 	desc &= U64_MAX << 12;
 	desc |= ce->gem_context->desc_template & ((1ULL << 12) - 1);
 
+	desc &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
+	desc |= workload->ctx_desc.addressing_mode <<
+		GEN8_CTX_ADDRESSING_MODE_SHIFT;
+
 	ce->lrc_desc = desc;
 }
 
@@ -365,26 +371,22 @@ intel_gvt_workload_req_alloc(struct intel_vgpu_workload *workload)
 {
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	struct i915_gem_context *shadow_ctx = s->shadow_ctx;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
-	struct intel_engine_cs *engine = dev_priv->engine[workload->ring_id];
 	struct i915_request *rq;
-	int ret = 0;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
 	if (workload->req)
-		goto out;
+		return 0;
 
-	rq = i915_request_alloc(engine, shadow_ctx);
+	rq = i915_request_create(s->shadow[workload->ring_id]);
 	if (IS_ERR(rq)) {
 		gvt_vgpu_err("fail to allocate gem request\n");
-		ret = PTR_ERR(rq);
-		goto out;
+		return PTR_ERR(rq);
 	}
+
 	workload->req = i915_request_get(rq);
-out:
-	return ret;
+	return 0;
 }
 
 /**
@@ -399,10 +401,7 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 {
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	struct i915_gem_context *shadow_ctx = s->shadow_ctx;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
-	struct intel_engine_cs *engine = dev_priv->engine[workload->ring_id];
-	struct intel_context *ce;
 	int ret;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
@@ -410,29 +409,13 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 	if (workload->shadow)
 		return 0;
 
-	/* pin shadow context by gvt even the shadow context will be pinned
-	 * when i915 alloc request. That is because gvt will update the guest
-	 * context from shadow context when workload is completed, and at that
-	 * moment, i915 may already unpined the shadow context to make the
-	 * shadow_ctx pages invalid. So gvt need to pin itself. After update
-	 * the guest context, gvt can unpin the shadow_ctx safely.
-	 */
-	ce = intel_context_pin(shadow_ctx, engine);
-	if (IS_ERR(ce)) {
-		gvt_vgpu_err("fail to pin shadow context\n");
-		return PTR_ERR(ce);
-	}
-
-	shadow_ctx->desc_template &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
-	shadow_ctx->desc_template |= workload->ctx_desc.addressing_mode <<
-				    GEN8_CTX_ADDRESSING_MODE_SHIFT;
-
 	if (!test_and_set_bit(workload->ring_id, s->shadow_ctx_desc_updated))
-		shadow_context_descriptor_update(ce);
+		shadow_context_descriptor_update(s->shadow[workload->ring_id],
+						 workload);
 
 	ret = intel_gvt_scan_and_shadow_ringbuffer(workload);
 	if (ret)
-		goto err_unpin;
+		return ret;
 
 	if (workload->ring_id == RCS0 && workload->wa_ctx.indirect_ctx.size) {
 		ret = intel_gvt_scan_and_shadow_wa_ctx(&workload->wa_ctx);
@@ -444,8 +427,6 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 	return 0;
 err_shadow:
 	release_shadow_wa_ctx(&workload->wa_ctx);
-err_unpin:
-	intel_context_unpin(ce);
 	return ret;
 }
 
@@ -672,7 +653,6 @@ static int dispatch_workload(struct intel_vgpu_workload *workload)
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	struct i915_gem_context *shadow_ctx = s->shadow_ctx;
 	struct i915_request *rq;
 	int ring_id = workload->ring_id;
 	int ret;
@@ -683,7 +663,8 @@ static int dispatch_workload(struct intel_vgpu_workload *workload)
 	mutex_lock(&vgpu->vgpu_lock);
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
-	ret = set_context_ppgtt_from_shadow(workload, shadow_ctx);
+	ret = set_context_ppgtt_from_shadow(workload,
+					    s->shadow[ring_id]->gem_context);
 	if (ret < 0) {
 		gvt_vgpu_err("workload shadow ppgtt isn't ready\n");
 		goto err_req;
@@ -994,7 +975,7 @@ static int workload_thread(void *priv)
 				workload->ring_id, workload,
 				workload->vgpu->id);
 
-		intel_runtime_pm_get(gvt->dev_priv);
+		i915_gem_unpark(gvt->dev_priv);
 
 		gvt_dbg_sched("ring id %d will dispatch workload %p\n",
 				workload->ring_id, workload);
@@ -1025,7 +1006,7 @@ static int workload_thread(void *priv)
 			intel_uncore_forcewake_put(&gvt->dev_priv->uncore,
 					FORCEWAKE_ALL);
 
-		intel_runtime_pm_put_unchecked(gvt->dev_priv);
+		i915_gem_park(gvt->dev_priv);
 		if (ret && (vgpu_is_vm_unhealthy(ret)))
 			enter_failsafe_mode(vgpu, GVT_FAILSAFE_GUEST_ERR);
 	}
@@ -1108,17 +1089,17 @@ int intel_gvt_init_workload_scheduler(struct intel_gvt *gvt)
 }
 
 static void
-i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s)
+i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s,
+				struct i915_hw_ppgtt *ppgtt)
 {
-	struct i915_hw_ppgtt *i915_ppgtt = s->shadow_ctx->ppgtt;
 	int i;
 
-	if (i915_vm_is_4lvl(&i915_ppgtt->vm)) {
-		px_dma(&i915_ppgtt->pml4) = s->i915_context_pml4;
+	if (i915_vm_is_4lvl(&ppgtt->vm)) {
+		px_dma(&ppgtt->pml4) = s->i915_context_pml4;
 	} else {
 		for (i = 0; i < GEN8_3LVL_PDPES; i++)
-			px_dma(i915_ppgtt->pdp.page_directory[i]) =
-						s->i915_context_pdps[i];
+			px_dma(ppgtt->pdp.page_directory[i]) =
+				s->i915_context_pdps[i];
 	}
 }
 
@@ -1132,10 +1113,15 @@ i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s)
 void intel_vgpu_clean_submission(struct intel_vgpu *vgpu)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 
 	intel_vgpu_select_submission_ops(vgpu, ALL_ENGINES, 0);
-	i915_context_ppgtt_root_restore(s);
-	i915_gem_context_put(s->shadow_ctx);
+
+	i915_context_ppgtt_root_restore(s, s->shadow[0]->gem_context->ppgtt);
+	for_each_engine(engine, vgpu->gvt->dev_priv, id)
+		intel_context_unpin(s->shadow[id]);
+
 	kmem_cache_destroy(s->workloads);
 }
 
@@ -1161,17 +1147,17 @@ void intel_vgpu_reset_submission(struct intel_vgpu *vgpu,
 }
 
 static void
-i915_context_ppgtt_root_save(struct intel_vgpu_submission *s)
+i915_context_ppgtt_root_save(struct intel_vgpu_submission *s,
+			     struct i915_hw_ppgtt *ppgtt)
 {
-	struct i915_hw_ppgtt *i915_ppgtt = s->shadow_ctx->ppgtt;
 	int i;
 
-	if (i915_vm_is_4lvl(&i915_ppgtt->vm))
-		s->i915_context_pml4 = px_dma(&i915_ppgtt->pml4);
-	else {
+	if (i915_vm_is_4lvl(&ppgtt->vm)) {
+		s->i915_context_pml4 = px_dma(&ppgtt->pml4);
+	} else {
 		for (i = 0; i < GEN8_3LVL_PDPES; i++)
 			s->i915_context_pdps[i] =
-				px_dma(i915_ppgtt->pdp.page_directory[i]);
+				px_dma(ppgtt->pdp.page_directory[i]);
 	}
 }
 
@@ -1188,16 +1174,31 @@ i915_context_ppgtt_root_save(struct intel_vgpu_submission *s)
 int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	enum intel_engine_id i;
 	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	enum intel_engine_id i;
 	int ret;
 
-	s->shadow_ctx = i915_gem_context_create_gvt(
-			&vgpu->gvt->dev_priv->drm);
-	if (IS_ERR(s->shadow_ctx))
-		return PTR_ERR(s->shadow_ctx);
+	ctx = i915_gem_context_create_gvt(&vgpu->gvt->dev_priv->drm);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	i915_context_ppgtt_root_save(s, ctx->ppgtt);
+
+	for_each_engine(engine, vgpu->gvt->dev_priv, i) {
+		struct intel_context *ce;
+
+		INIT_LIST_HEAD(&s->workload_q_head[i]);
+		s->shadow[i] = ERR_PTR(-EINVAL);
 
-	i915_context_ppgtt_root_save(s);
+		ce = intel_context_pin(ctx, engine);
+		if (IS_ERR(ce)) {
+			ret = PTR_ERR(ce);
+			goto out_shadow_ctx;
+		}
+
+		s->shadow[i] = ce;
+	}
 
 	bitmap_zero(s->shadow_ctx_desc_updated, I915_NUM_ENGINES);
 
@@ -1213,16 +1214,21 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		goto out_shadow_ctx;
 	}
 
-	for_each_engine(engine, vgpu->gvt->dev_priv, i)
-		INIT_LIST_HEAD(&s->workload_q_head[i]);
-
 	atomic_set(&s->running_workload_num, 0);
 	bitmap_zero(s->tlb_handle_pending, I915_NUM_ENGINES);
 
+	i915_gem_context_put(ctx);
 	return 0;
 
 out_shadow_ctx:
-	i915_gem_context_put(s->shadow_ctx);
+	i915_context_ppgtt_root_restore(s, ctx->ppgtt);
+	for_each_engine(engine, vgpu->gvt->dev_priv, i) {
+		if (IS_ERR(s->shadow[i]))
+			break;
+
+		intel_context_unpin(s->shadow[i]);
+	}
+	i915_gem_context_put(ctx);
 	return ret;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (6 preceding siblings ...)
  2019-03-25  9:03 ` [PATCH 07/22] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
@ 2019-03-25  9:03 ` Chris Wilson
  2019-04-02 16:09   ` Tvrtko Ursulin
  2019-03-25  9:04 ` [PATCH 09/22] drm/i915: Export intel_context_instance() Chris Wilson
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:03 UTC (permalink / raw)
  To: intel-gfx

In order to separate the reservation phase of building a request from
its emission phase, we need to pull some of the request allocation
activities from deep inside i915_request up to the surface, into
GEM_EXECBUFFER.
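
For illustration, a minimal sketch of the resulting flow (the function
name example_submit is hypothetical; eb_pin_context(), eb_unpin_context()
and i915_request_create() are the helpers used below, with the actual
emission and error handling abbreviated):

static int example_submit(struct i915_execbuffer *eb,
                          struct intel_engine_cs *engine)
{
        int err;

        /* reservation phase: pin the logical context once, up front */
        err = eb_pin_context(eb, engine);
        if (err)
                return err;

        eb->request = i915_request_create(eb->context);
        if (IS_ERR(eb->request)) {
                err = PTR_ERR(eb->request);
                goto out_unpin;
        }

        /* ... emission phase: write the batch into eb->request ... */

out_unpin:
        eb_unpin_context(eb);
        return err;
}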

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 106 +++++++++++++--------
 drivers/gpu/drm/i915/i915_request.c        |   9 --
 2 files changed, 67 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3d672c9edb94..8754bb02c6ec 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -36,6 +36,7 @@
 
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
+#include "i915_gem_pm.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
@@ -236,7 +237,8 @@ struct i915_execbuffer {
 	unsigned int *flags;
 
 	struct intel_engine_cs *engine; /** engine to queue the request to */
-	struct i915_gem_context *ctx; /** context for building the request */
+	struct intel_context *context; /* logical state for the request */
+	struct i915_gem_context *gem_context; /** caller's context */
 	struct i915_address_space *vm; /** GTT and vma for the request */
 
 	struct i915_request *request; /** our request to build */
@@ -738,7 +740,7 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	if (unlikely(!ctx))
 		return -ENOENT;
 
-	eb->ctx = ctx;
+	eb->gem_context = ctx;
 	if (ctx->ppgtt) {
 		eb->vm = &ctx->ppgtt->vm;
 		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
@@ -761,7 +763,7 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
 	 * Completely unscientific finger-in-the-air estimates for suitable
 	 * maximum user request size (to avoid blocking) and then backoff.
 	 */
-	if (intel_ring_update_space(ring) >= PAGE_SIZE)
+	if (!ring || intel_ring_update_space(ring) >= PAGE_SIZE)
 		return NULL;
 
 	/*
@@ -784,7 +786,6 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
 
 static int eb_wait_for_ring(const struct i915_execbuffer *eb)
 {
-	const struct intel_context *ce;
 	struct i915_request *rq;
 	int ret = 0;
 
@@ -794,11 +795,7 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
 	 * keeping all of their resources pinned.
 	 */
 
-	ce = intel_context_lookup(eb->ctx, eb->engine);
-	if (!ce || !ce->ring) /* first use, assume empty! */
-		return 0;
-
-	rq = __eb_wait_for_ring(ce->ring);
+	rq = __eb_wait_for_ring(eb->context->ring);
 	if (rq) {
 		mutex_unlock(&eb->i915->drm.struct_mutex);
 
@@ -817,15 +814,15 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
 
 static int eb_lookup_vmas(struct i915_execbuffer *eb)
 {
-	struct radix_tree_root *handles_vma = &eb->ctx->handles_vma;
+	struct radix_tree_root *handles_vma = &eb->gem_context->handles_vma;
 	struct drm_i915_gem_object *obj;
 	unsigned int i, batch;
 	int err;
 
-	if (unlikely(i915_gem_context_is_closed(eb->ctx)))
+	if (unlikely(i915_gem_context_is_closed(eb->gem_context)))
 		return -ENOENT;
 
-	if (unlikely(i915_gem_context_is_banned(eb->ctx)))
+	if (unlikely(i915_gem_context_is_banned(eb->gem_context)))
 		return -EIO;
 
 	INIT_LIST_HEAD(&eb->relocs);
@@ -870,8 +867,8 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
 		if (!vma->open_count++)
 			i915_vma_reopen(vma);
 		list_add(&lut->obj_link, &obj->lut_list);
-		list_add(&lut->ctx_link, &eb->ctx->handles_list);
-		lut->ctx = eb->ctx;
+		list_add(&lut->ctx_link, &eb->gem_context->handles_list);
+		lut->ctx = eb->gem_context;
 		lut->handle = handle;
 
 add_vma:
@@ -1227,7 +1224,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	if (err)
 		goto err_unmap;
 
-	rq = i915_request_alloc(eb->engine, eb->ctx);
+	rq = i915_request_create(eb->context);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -2088,8 +2085,41 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static struct intel_engine_cs *
-eb_select_engine(struct drm_i915_private *dev_priv,
+static int eb_pin_context(struct i915_execbuffer *eb,
+			  struct intel_engine_cs *engine)
+{
+	struct intel_context *ce;
+	int err;
+
+	/*
+	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
+	 * EIO if the GPU is already wedged.
+	 */
+	err = i915_terminally_wedged(eb->i915);
+	if (err)
+		return err;
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	ce = intel_context_pin(eb->gem_context, engine);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	eb->engine = engine;
+	eb->context = ce;
+	return 0;
+}
+
+static void eb_unpin_context(struct i915_execbuffer *eb)
+{
+	intel_context_unpin(eb->context);
+}
+
+static int
+eb_select_engine(struct i915_execbuffer *eb,
 		 struct drm_file *file,
 		 struct drm_i915_gem_execbuffer2 *args)
 {
@@ -2098,21 +2128,21 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 
 	if (user_ring_id > I915_USER_RINGS) {
 		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
-		return NULL;
+		return -EINVAL;
 	}
 
 	if ((user_ring_id != I915_EXEC_BSD) &&
 	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
 		DRM_DEBUG("execbuf with non bsd ring but with invalid "
 			  "bsd dispatch flags: %d\n", (int)(args->flags));
-		return NULL;
+		return -EINVAL;
 	}
 
-	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(dev_priv, VCS1)) {
+	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
 		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
 
 		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
-			bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
+			bsd_idx = gen8_dispatch_bsd_engine(eb->i915, file);
 		} else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
 			   bsd_idx <= I915_EXEC_BSD_RING2) {
 			bsd_idx >>= I915_EXEC_BSD_SHIFT;
@@ -2120,20 +2150,20 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 		} else {
 			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
 				  bsd_idx);
-			return NULL;
+			return -EINVAL;
 		}
 
-		engine = dev_priv->engine[_VCS(bsd_idx)];
+		engine = eb->i915->engine[_VCS(bsd_idx)];
 	} else {
-		engine = dev_priv->engine[user_ring_map[user_ring_id]];
+		engine = eb->i915->engine[user_ring_map[user_ring_id]];
 	}
 
 	if (!engine) {
 		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
-		return NULL;
+		return -EINVAL;
 	}
 
-	return engine;
+	return eb_pin_context(eb, engine);
 }
 
 static void
@@ -2275,7 +2305,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
 	struct sync_file *out_fence = NULL;
-	intel_wakeref_t wakeref;
 	int out_fence_fd = -1;
 	int err;
 
@@ -2335,12 +2364,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_destroy;
 
-	eb.engine = eb_select_engine(eb.i915, file, args);
-	if (!eb.engine) {
-		err = -EINVAL;
-		goto err_engine;
-	}
-
 	/*
 	 * Take a local wakeref for preparing to dispatch the execbuf as
 	 * we expect to access the hardware fairly frequently in the
@@ -2348,16 +2371,20 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 * wakeref that we hold until the GPU has been idle for at least
 	 * 100ms.
 	 */
-	wakeref = intel_runtime_pm_get(eb.i915);
+	i915_gem_unpark(eb.i915);
 
 	err = i915_mutex_lock_interruptible(dev);
 	if (err)
 		goto err_rpm;
 
-	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
+	err = eb_select_engine(&eb, file, args);
 	if (unlikely(err))
 		goto err_unlock;
 
+	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
+	if (unlikely(err))
+		goto err_engine;
+
 	err = eb_relocate(&eb);
 	if (err) {
 		/*
@@ -2441,7 +2468,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	GEM_BUG_ON(eb.reloc_cache.rq);
 
 	/* Allocate a request for this batch buffer nice and early. */
-	eb.request = i915_request_alloc(eb.engine, eb.ctx);
+	eb.request = i915_request_create(eb.context);
 	if (IS_ERR(eb.request)) {
 		err = PTR_ERR(eb.request);
 		goto err_batch_unpin;
@@ -2502,12 +2529,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
+err_engine:
+	eb_unpin_context(&eb);
 err_unlock:
 	mutex_unlock(&dev->struct_mutex);
 err_rpm:
-	intel_runtime_pm_put(eb.i915, wakeref);
-err_engine:
-	i915_gem_context_put(eb.ctx);
+	i915_gem_park(eb.i915);
+	i915_gem_context_put(eb.gem_context);
 err_destroy:
 	eb_destroy(&eb);
 err_out_fence:
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index fd24f576ca61..10edeb285870 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -741,7 +741,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_context *ce;
 	struct i915_request *rq;
-	int ret;
 
 	/*
 	 * Preempt contexts are reserved for exclusive use to inject a
@@ -750,14 +749,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	GEM_BUG_ON(ctx == i915->preempt_context);
 
-	/*
-	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged.
-	 */
-	ret = i915_terminally_wedged(i915);
-	if (ret)
-		return ERR_PTR(ret);
-
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/22] drm/i915: Export intel_context_instance()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (7 preceding siblings ...)
  2019-03-25  9:03 ` [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 10/22] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

We want to pass an intel_context into intel_context_pin(), and that
requires us to first be able to look up the intel_context!
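
For reference, the new calling convention, mirroring the pin_context()
helper in intel_engine_cs.c below (example_pin is a hypothetical name;
the lookup returns its own reference, which is dropped once the pin
keeps the context alive):

static int example_pin(struct i915_gem_context *ctx,
                       struct intel_engine_cs *engine,
                       struct intel_context **out)
{
        struct intel_context *ce;
        int err;

        ce = intel_context_instance(ctx, engine); /* +1 reference */
        if (IS_ERR(ce))
                return PTR_ERR(ce);

        err = intel_context_pin(ce); /* the pin now keeps ce alive */
        intel_context_put(ce);       /* drop the lookup reference */
        if (err)
                return err;

        *out = ce; /* released later with intel_context_unpin() */
        return 0;
}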

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gvt/scheduler.c       |  7 +++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++--
 drivers/gpu/drm/i915/i915_perf.c           | 21 ++++++++----
 drivers/gpu/drm/i915/i915_request.c        |  8 ++++-
 drivers/gpu/drm/i915/intel_context.c       | 38 +++++++++++-----------
 drivers/gpu/drm/i915/intel_context.h       | 19 +++++++----
 drivers/gpu/drm/i915/intel_engine_cs.c     |  8 ++++-
 7 files changed, 74 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 2d2dafd22a18..ae1f09d2d4ae 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -1191,12 +1191,17 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		INIT_LIST_HEAD(&s->workload_q_head[i]);
 		s->shadow[i] = ERR_PTR(-EINVAL);
 
-		ce = intel_context_pin(ctx, engine);
+		ce = intel_context_instance(ctx, engine);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_shadow_ctx;
 		}
 
+		ret = intel_context_pin(ce);
+		intel_context_put(ce);
+		if (ret)
+			goto out_shadow_ctx;
+
 		s->shadow[i] = ce;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 8754bb02c6ec..cf7aa0e325d2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2099,14 +2099,19 @@ static int eb_pin_context(struct i915_execbuffer *eb,
 	if (err)
 		return err;
 
+	ce = intel_context_instance(eb->gem_context, engine);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_pin(eb->gem_context, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+	err = intel_context_pin(ce);
+	intel_context_put(ce);
+	if (err)
+		return err;
 
 	eb->engine = engine;
 	eb->context = ce;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index fe7267da52e5..28475cbbdcbb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1205,11 +1205,17 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 {
 	struct intel_engine_cs *engine = i915->engine[RCS0];
 	struct intel_context *ce;
-	int ret;
+	int err;
 
-	ret = i915_mutex_lock_interruptible(&i915->drm);
-	if (ret)
-		return ERR_PTR(ret);
+	ce = intel_context_instance(ctx, engine);
+	if (IS_ERR(ce))
+		return ce;
+
+	err = i915_mutex_lock_interruptible(&i915->drm);
+	if (err) {
+		intel_context_put(ce);
+		return ERR_PTR(err);
+	}
 
 	/*
 	 * As the ID is the gtt offset of the context's vma we
@@ -1217,10 +1223,11 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 	 *
 	 * NB: implied RCS engine...
 	 */
-	ce = intel_context_pin(ctx, engine);
+	err = intel_context_pin(ce);
 	mutex_unlock(&i915->drm.struct_mutex);
-	if (IS_ERR(ce))
-		return ce;
+	intel_context_put(ce);
+	if (err)
+		return ERR_PTR(err);
 
 	i915->perf.oa.pinned_ctx = ce;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 10edeb285870..fe8db5ef0ded 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -741,6 +741,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_context *ce;
 	struct i915_request *rq;
+	int err;
 
 	/*
 	 * Preempt contexts are reserved for exclusive use to inject a
@@ -754,10 +755,15 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_pin(ctx, engine);
+	ce = intel_context_instance(ctx, engine);
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
+	err = intel_context_pin(ce);
+	intel_context_put(ce);
+	if (err)
+		return ERR_PTR(err);
+
 	i915_gem_unpark(i915);
 
 	rq = i915_request_create(ce);
diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
index 8931e0fee873..ca81b4dc5364 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/intel_context.c
@@ -102,7 +102,7 @@ void __intel_context_remove(struct intel_context *ce)
 	spin_unlock(&ctx->hw_contexts_lock);
 }
 
-static struct intel_context *
+struct intel_context *
 intel_context_instance(struct i915_gem_context *ctx,
 		       struct intel_engine_cs *engine)
 {
@@ -110,7 +110,7 @@ intel_context_instance(struct i915_gem_context *ctx,
 
 	ce = intel_context_lookup(ctx, engine);
 	if (likely(ce))
-		return ce;
+		return intel_context_get(ce);
 
 	ce = intel_context_alloc();
 	if (!ce)
@@ -123,7 +123,7 @@ intel_context_instance(struct i915_gem_context *ctx,
 		intel_context_free(ce);
 
 	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
-	return pos;
+	return intel_context_get(pos);
 }
 
 struct intel_context *
@@ -137,36 +137,36 @@ intel_context_pin_lock(struct i915_gem_context *ctx,
 	if (IS_ERR(ce))
 		return ce;
 
-	if (mutex_lock_interruptible(&ce->pin_mutex))
+	if (mutex_lock_interruptible(&ce->pin_mutex)) {
+		intel_context_put(ce);
 		return ERR_PTR(-EINTR);
+	}
 
 	return ce;
 }
 
-struct intel_context *
-intel_context_pin(struct i915_gem_context *ctx,
-		  struct intel_engine_cs *engine)
+void intel_context_pin_unlock(struct intel_context *ce)
+	__releases(ce->pin_mutex)
 {
-	struct intel_context *ce;
-	int err;
-
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
+	mutex_unlock(&ce->pin_mutex);
+	intel_context_put(ce);
+}
 
-	if (likely(atomic_inc_not_zero(&ce->pin_count)))
-		return ce;
+int __intel_context_do_pin(struct intel_context *ce)
+{
+	int err;
 
 	if (mutex_lock_interruptible(&ce->pin_mutex))
-		return ERR_PTR(-EINTR);
+		return -EINTR;
 
 	if (likely(!atomic_read(&ce->pin_count))) {
+		struct i915_gem_context *ctx = ce->gem_context;
+
 		err = ce->ops->pin(ce);
 		if (err)
 			goto err;
 
 		i915_gem_context_get(ctx);
-		GEM_BUG_ON(ce->gem_context != ctx);
 
 		mutex_lock(&ctx->mutex);
 		list_add(&ce->active_link, &ctx->active_engines);
@@ -180,11 +180,11 @@ intel_context_pin(struct i915_gem_context *ctx,
 	GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
 
 	mutex_unlock(&ce->pin_mutex);
-	return ce;
+	return 0;
 
 err:
 	mutex_unlock(&ce->pin_mutex);
-	return ERR_PTR(err);
+	return err;
 }
 
 void intel_context_unpin(struct intel_context *ce)
diff --git a/drivers/gpu/drm/i915/intel_context.h b/drivers/gpu/drm/i915/intel_context.h
index ebc861b1a49e..2daf6a5217ae 100644
--- a/drivers/gpu/drm/i915/intel_context.h
+++ b/drivers/gpu/drm/i915/intel_context.h
@@ -49,11 +49,7 @@ intel_context_is_pinned(struct intel_context *ce)
 	return atomic_read(&ce->pin_count);
 }
 
-static inline void intel_context_pin_unlock(struct intel_context *ce)
-__releases(ce->pin_mutex)
-{
-	mutex_unlock(&ce->pin_mutex);
-}
+void intel_context_pin_unlock(struct intel_context *ce);
 
 struct intel_context *
 __intel_context_insert(struct i915_gem_context *ctx,
@@ -63,7 +59,18 @@ void
 __intel_context_remove(struct intel_context *ce);
 
 struct intel_context *
-intel_context_pin(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
+intel_context_instance(struct i915_gem_context *ctx,
+		       struct intel_engine_cs *engine);
+
+int __intel_context_do_pin(struct intel_context *ce);
+
+static inline int intel_context_pin(struct intel_context *ce)
+{
+	if (likely(atomic_inc_not_zero(&ce->pin_count)))
+		return 0;
+
+	return __intel_context_do_pin(ce);
+}
 
 static inline void __intel_context_pin(struct intel_context *ce)
 {
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index c5b417327132..a7413e74410f 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -693,11 +693,17 @@ static int pin_context(struct i915_gem_context *ctx,
 		       struct intel_context **out)
 {
 	struct intel_context *ce;
+	int err;
 
-	ce = intel_context_pin(ctx, engine);
+	ce = intel_context_instance(ctx, engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	err = intel_context_pin(ce);
+	intel_context_put(ce);
+	if (err)
+		return err;
+
 	*out = ce;
 	return 0;
 }
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/22] drm/i915/selftests: Use the real kernel context for sseu isolation tests
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (8 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 09/22] drm/i915: Export intel_context_instance() Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 11/22] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Simplify the setup slightly for the sseu selftests to use the actual
kernel_context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/selftests/i915_gem_context.c   | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index b4039df633ec..0e82303336ed 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -955,7 +955,6 @@ __sseu_finish(struct drm_i915_private *i915,
 	      const char *name,
 	      unsigned int flags,
 	      struct i915_gem_context *ctx,
-	      struct i915_gem_context *kctx,
 	      struct intel_engine_cs *engine,
 	      struct drm_i915_gem_object *obj,
 	      unsigned int expected,
@@ -978,7 +977,8 @@ __sseu_finish(struct drm_i915_private *i915,
 	if (ret)
 		goto out;
 
-	ret = __read_slice_count(i915, kctx, engine, obj, NULL, &rpcs);
+	ret = __read_slice_count(i915, i915->kernel_context, engine, obj,
+				 NULL, &rpcs);
 	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
 
 out:
@@ -1010,22 +1010,17 @@ __sseu_test(struct drm_i915_private *i915,
 	    struct intel_sseu sseu)
 {
 	struct igt_spinner *spin = NULL;
-	struct i915_gem_context *kctx;
 	int ret;
 
-	kctx = kernel_context(i915);
-	if (IS_ERR(kctx))
-		return PTR_ERR(kctx);
-
 	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
 	if (ret)
-		goto out_context;
+		return ret;
 
 	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
 	if (ret)
 		goto out_spin;
 
-	ret = __sseu_finish(i915, name, flags, ctx, kctx, engine, obj,
+	ret = __sseu_finish(i915, name, flags, ctx, engine, obj,
 			    hweight32(sseu.slice_mask), spin);
 
 out_spin:
@@ -1034,10 +1029,6 @@ __sseu_test(struct drm_i915_private *i915,
 		igt_spinner_fini(spin);
 		kfree(spin);
 	}
-
-out_context:
-	kernel_context_close(kctx);
-
 	return ret;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/22] drm/i915/selftests: Pass around intel_context for sseu
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (9 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 10/22] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 12/22] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Combine the (i915_gem_context, intel_engine) pair into a single
parameter, the intel_context, for convenience and later simplification.
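
The point of the combined parameter, sketched (example() is a
hypothetical stand-in; an intel_context still yields both halves of the
old pair on demand):

static void example(struct intel_context *ce)
{
        struct i915_gem_context *ctx = ce->gem_context;
        struct intel_engine_cs *engine = ce->engine;

        /*
         * e.g. emit_rpcs_query() below recovers the vm from ctx and
         * submits on engine without either being passed separately.
         */
        (void)ctx;
        (void)engine;
}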

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/selftests/i915_gem_context.c | 64 +++++++++++--------
 1 file changed, 36 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 0e82303336ed..da4b000377e0 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -753,8 +753,7 @@ static struct i915_vma *rpcs_query_batch(struct i915_vma *vma)
 
 static int
 emit_rpcs_query(struct drm_i915_gem_object *obj,
-		struct i915_gem_context *ctx,
-		struct intel_engine_cs *engine,
+		struct intel_context *ce,
 		struct i915_request **rq_out)
 {
 	struct i915_request *rq;
@@ -762,9 +761,9 @@ emit_rpcs_query(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 	int err;
 
-	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
+	GEM_BUG_ON(!intel_engine_can_store_dword(ce->engine));
 
-	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
+	vma = i915_vma_instance(obj, &ce->gem_context->ppgtt->vm, NULL);
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
@@ -782,13 +781,15 @@ emit_rpcs_query(struct drm_i915_gem_object *obj,
 		goto err_vma;
 	}
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = i915_request_create(ce);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_batch;
 	}
 
-	err = engine->emit_bb_start(rq, batch->node.start, batch->node.size, 0);
+	err = rq->engine->emit_bb_start(rq,
+					batch->node.start, batch->node.size,
+					0);
 	if (err)
 		goto err_request;
 
@@ -832,8 +833,7 @@ static int
 __sseu_prepare(struct drm_i915_private *i915,
 	       const char *name,
 	       unsigned int flags,
-	       struct i915_gem_context *ctx,
-	       struct intel_engine_cs *engine,
+	       struct intel_context *ce,
 	       struct igt_spinner **spin)
 {
 	struct i915_request *rq;
@@ -851,7 +851,10 @@ __sseu_prepare(struct drm_i915_private *i915,
 	if (ret)
 		goto err_free;
 
-	rq = igt_spinner_create_request(*spin, ctx, engine, MI_NOOP);
+	rq = igt_spinner_create_request(*spin,
+					ce->gem_context,
+					ce->engine,
+					MI_NOOP);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 		goto err_fini;
@@ -878,8 +881,7 @@ __sseu_prepare(struct drm_i915_private *i915,
 
 static int
 __read_slice_count(struct drm_i915_private *i915,
-		   struct i915_gem_context *ctx,
-		   struct intel_engine_cs *engine,
+		   struct intel_context *ce,
 		   struct drm_i915_gem_object *obj,
 		   struct igt_spinner *spin,
 		   u32 *rpcs)
@@ -890,7 +892,7 @@ __read_slice_count(struct drm_i915_private *i915,
 	u32 *buf, val;
 	long ret;
 
-	ret = emit_rpcs_query(obj, ctx, engine, &rq);
+	ret = emit_rpcs_query(obj, ce, &rq);
 	if (ret)
 		return ret;
 
@@ -954,8 +956,7 @@ static int
 __sseu_finish(struct drm_i915_private *i915,
 	      const char *name,
 	      unsigned int flags,
-	      struct i915_gem_context *ctx,
-	      struct intel_engine_cs *engine,
+	      struct intel_context *ce,
 	      struct drm_i915_gem_object *obj,
 	      unsigned int expected,
 	      struct igt_spinner *spin)
@@ -966,18 +967,18 @@ __sseu_finish(struct drm_i915_private *i915,
 	int ret = 0;
 
 	if (flags & TEST_RESET) {
-		ret = i915_reset_engine(engine, "sseu");
+		ret = i915_reset_engine(ce->engine, "sseu");
 		if (ret)
 			goto out;
 	}
 
-	ret = __read_slice_count(i915, ctx, engine, obj,
+	ret = __read_slice_count(i915, ce, obj,
 				 flags & TEST_RESET ? NULL : spin, &rpcs);
 	ret = __check_rpcs(name, rpcs, ret, expected, "Context", "!");
 	if (ret)
 		goto out;
 
-	ret = __read_slice_count(i915, i915->kernel_context, engine, obj,
+	ret = __read_slice_count(i915, ce->engine->kernel_context, obj,
 				 NULL, &rpcs);
 	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
 
@@ -992,7 +993,7 @@ __sseu_finish(struct drm_i915_private *i915,
 		if (ret)
 			return ret;
 
-		ret = __read_slice_count(i915, ctx, engine, obj, NULL, &rpcs);
+		ret = __read_slice_count(i915, ce, obj, NULL, &rpcs);
 		ret = __check_rpcs(name, rpcs, ret, expected,
 				   "Context", " after idle!");
 	}
@@ -1004,23 +1005,22 @@ static int
 __sseu_test(struct drm_i915_private *i915,
 	    const char *name,
 	    unsigned int flags,
-	    struct i915_gem_context *ctx,
-	    struct intel_engine_cs *engine,
+	    struct intel_context *ce,
 	    struct drm_i915_gem_object *obj,
 	    struct intel_sseu sseu)
 {
 	struct igt_spinner *spin = NULL;
 	int ret;
 
-	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
+	ret = __sseu_prepare(i915, name, flags, ce, &spin);
 	if (ret)
 		return ret;
 
-	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	ret = __i915_gem_context_reconfigure_sseu(ce->gem_context, ce->engine, sseu);
 	if (ret)
 		goto out_spin;
 
-	ret = __sseu_finish(i915, name, flags, ctx, engine, obj,
+	ret = __sseu_finish(i915, name, flags, ce, obj,
 			    hweight32(sseu.slice_mask), spin);
 
 out_spin:
@@ -1038,9 +1038,9 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 	       unsigned int flags)
 {
 	struct intel_sseu default_sseu = intel_device_default_sseu(i915);
-	struct intel_engine_cs *engine = i915->engine[RCS0];
 	struct drm_i915_gem_object *obj;
 	struct i915_gem_context *ctx;
+	struct intel_context *ce;
 	struct intel_sseu pg_sseu;
 	struct drm_file *file;
 	int ret;
@@ -1091,23 +1091,29 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	i915_gem_unpark(i915);
 
+	ce = intel_context_instance(ctx, i915->engine[RCS0]);
+	if (IS_ERR(ce)) {
+		ret = PTR_ERR(ce);
+		goto out_rpm;
+	}
+
 	/* First set the default mask. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, default_sseu);
 	if (ret)
 		goto out_fail;
 
 	/* Then set a power-gated configuration. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, pg_sseu);
 	if (ret)
 		goto out_fail;
 
 	/* Back to defaults. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, default_sseu);
 	if (ret)
 		goto out_fail;
 
 	/* One last power-gated configuration for the road. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, pg_sseu);
 	if (ret)
 		goto out_fail;
 
@@ -1116,7 +1122,9 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 		ret = -EIO;
 
 	i915_gem_object_put(obj);
+	intel_context_put(ce);
 
+out_rpm:
 	i915_gem_park(i915);
 
 out_unlock:
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/22] drm/i915: Pass intel_context to intel_context_pin_lock()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (10 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 11/22] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 13/22] drm/i915: Split engine setup/init into two phases Chris Wilson
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Move the intel_context_instance() lookup to the caller so that we can
decouple ourselves from the assumption of one context instance per
engine.
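
A minimal sketch of the resulting get-side pattern (example_get_sseu is
a hypothetical wrapper; lookup_user_engine() is the helper added below
and returns a referenced intel_context):

static int example_get_sseu(struct i915_gem_context *ctx,
                            u16 class, u16 instance,
                            struct intel_sseu *out)
{
        struct intel_context *ce;
        int err;

        ce = lookup_user_engine(ctx, class, instance);
        if (IS_ERR(ce))
                return PTR_ERR(ce);

        err = intel_context_pin_lock(ce); /* serialises with set_sseu */
        if (err) {
                intel_context_put(ce);
                return err;
        }

        *out = ce->sseu;

        intel_context_pin_unlock(ce);
        intel_context_put(ce);
        return 0;
}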

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c       | 88 +++++++++++--------
 drivers/gpu/drm/i915/intel_context.c          | 26 ------
 drivers/gpu/drm/i915/intel_context.h          | 14 ++-
 .../gpu/drm/i915/selftests/i915_gem_context.c |  3 +-
 4 files changed, 64 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6a452345ffdb..c6a15b958166 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -136,6 +136,18 @@ static void lut_close(struct i915_gem_context *ctx)
 	rcu_read_unlock();
 }
 
+static struct intel_context *
+lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
+{
+	struct intel_engine_cs *engine;
+
+	engine = intel_engine_lookup_user(ctx->i915, class, instance);
+	if (!engine)
+		return ERR_PTR(-EINVAL);
+
+	return intel_context_instance(ctx, engine);
+}
+
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
 {
 	unsigned int max;
@@ -1219,19 +1231,17 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 }
 
 static int
-__i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
-				    struct intel_engine_cs *engine,
-				    struct intel_sseu sseu)
+__intel_context_reconfigure_sseu(struct intel_context *ce,
+				 struct intel_sseu sseu)
 {
-	struct intel_context *ce;
-	int ret = 0;
+	int ret;
 
-	GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8);
-	GEM_BUG_ON(engine->id != RCS0);
+	GEM_BUG_ON(INTEL_GEN(ce->gem_context->i915) < 8);
+	GEM_BUG_ON(ce->engine->id != RCS0);
 
-	ce = intel_context_pin_lock(ctx, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+	ret = intel_context_pin_lock(ce);
+	if (ret)
+		return ret;
 
 	/* Nothing to do if unmodified. */
 	if (!memcmp(&ce->sseu, &sseu, sizeof(sseu)))
@@ -1247,19 +1257,18 @@ __i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
 }
 
 static int
-i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
-				  struct intel_engine_cs *engine,
-				  struct intel_sseu sseu)
+intel_context_reconfigure_sseu(struct intel_context *ce, struct intel_sseu sseu)
 {
+	struct drm_i915_private *i915 = ce->gem_context->i915;
 	int ret;
 
-	ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
+	ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
 	if (ret)
 		return ret;
 
-	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	ret = __intel_context_reconfigure_sseu(ce, sseu);
 
-	mutex_unlock(&ctx->i915->drm.struct_mutex);
+	mutex_unlock(&i915->drm.struct_mutex);
 
 	return ret;
 }
@@ -1367,7 +1376,7 @@ static int set_sseu(struct i915_gem_context *ctx,
 {
 	struct drm_i915_private *i915 = ctx->i915;
 	struct drm_i915_gem_context_param_sseu user_sseu;
-	struct intel_engine_cs *engine;
+	struct intel_context *ce;
 	struct intel_sseu sseu;
 	int ret;
 
@@ -1384,27 +1393,31 @@ static int set_sseu(struct i915_gem_context *ctx,
 	if (user_sseu.flags || user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = intel_engine_lookup_user(i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
-	if (!engine)
-		return -EINVAL;
+	ce = lookup_user_engine(ctx,
+				user_sseu.engine_class,
+				user_sseu.engine_instance);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
 
 	/* Only render engine supports RPCS configuration. */
-	if (engine->class != RENDER_CLASS)
-		return -ENODEV;
+	if (ce->engine->class != RENDER_CLASS) {
+		ret = -ENODEV;
+		goto out_ce;
+	}
 
 	ret = user_to_context_sseu(i915, &user_sseu, &sseu);
 	if (ret)
-		return ret;
+		goto out_ce;
 
-	ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	ret = intel_context_reconfigure_sseu(ce, sseu);
 	if (ret)
-		return ret;
+		goto out_ce;
 
 	args->size = sizeof(user_sseu);
 
-	return 0;
+out_ce:
+	intel_context_put(ce);
+	return ret;
 }
 
 static int ctx_setparam(struct i915_gem_context *ctx,
@@ -1608,8 +1621,8 @@ static int get_sseu(struct i915_gem_context *ctx,
 		    struct drm_i915_gem_context_param *args)
 {
 	struct drm_i915_gem_context_param_sseu user_sseu;
-	struct intel_engine_cs *engine;
 	struct intel_context *ce;
+	int err;
 
 	if (args->size == 0)
 		goto out;
@@ -1623,22 +1636,25 @@ static int get_sseu(struct i915_gem_context *ctx,
 	if (user_sseu.flags || user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = intel_engine_lookup_user(ctx->i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
-	if (!engine)
-		return -EINVAL;
-
-	ce = intel_context_pin_lock(ctx, engine); /* serialises with set_sseu */
+	ce = lookup_user_engine(ctx,
+				user_sseu.engine_class,
+				user_sseu.engine_instance);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	err = intel_context_pin_lock(ce); /* serialises with set_sseu */
+	if (err) {
+		intel_context_put(ce);
+		return err;
+	}
+
 	user_sseu.slice_mask = ce->sseu.slice_mask;
 	user_sseu.subslice_mask = ce->sseu.subslice_mask;
 	user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice;
 	user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
 
 	intel_context_pin_unlock(ce);
+	intel_context_put(ce);
 
 	if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu,
 			 sizeof(user_sseu)))
diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
index ca81b4dc5364..65031744c0a8 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/intel_context.c
@@ -126,32 +126,6 @@ intel_context_instance(struct i915_gem_context *ctx,
 	return intel_context_get(pos);
 }
 
-struct intel_context *
-intel_context_pin_lock(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine)
-	__acquires(ce->pin_mutex)
-{
-	struct intel_context *ce;
-
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
-
-	if (mutex_lock_interruptible(&ce->pin_mutex)) {
-		intel_context_put(ce);
-		return ERR_PTR(-EINTR);
-	}
-
-	return ce;
-}
-
-void intel_context_pin_unlock(struct intel_context *ce)
-	__releases(ce->pin_mutex)
-{
-	mutex_unlock(&ce->pin_mutex);
-	intel_context_put(ce);
-}
-
 int __intel_context_do_pin(struct intel_context *ce)
 {
 	int err;
diff --git a/drivers/gpu/drm/i915/intel_context.h b/drivers/gpu/drm/i915/intel_context.h
index 2daf6a5217ae..3b3b14190ce7 100644
--- a/drivers/gpu/drm/i915/intel_context.h
+++ b/drivers/gpu/drm/i915/intel_context.h
@@ -39,9 +39,17 @@ intel_context_lookup(struct i915_gem_context *ctx,
  * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
  * intel_context_is_pinned() remains stable.
  */
-struct intel_context *
-intel_context_pin_lock(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine);
+static inline int intel_context_pin_lock(struct intel_context *ce)
+	__acquires(ce->pin_mutex)
+{
+	return mutex_lock_interruptible(&ce->pin_mutex);
+}
+
+static inline void intel_context_pin_unlock(struct intel_context *ce)
+	__releases(ce->pin_mutex)
+{
+	mutex_unlock(&ce->pin_mutex);
+}
 
 static inline bool
 intel_context_is_pinned(struct intel_context *ce)
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index da4b000377e0..2a4f7348dcae 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1016,7 +1016,7 @@ __sseu_test(struct drm_i915_private *i915,
 	if (ret)
 		return ret;
 
-	ret = __i915_gem_context_reconfigure_sseu(ce->gem_context, ce->engine, sseu);
+	ret = __intel_context_reconfigure_sseu(ce, sseu);
 	if (ret)
 		goto out_spin;
 
@@ -1123,7 +1123,6 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	i915_gem_object_put(obj);
 	intel_context_put(ce);
-
 out_rpm:
 	i915_gem_park(i915);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/22] drm/i915: Split engine setup/init into two phases
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (11 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 12/22] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 14/22] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

The next patch requires the engine vfuncs to be set up prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.
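
An abbreviated sketch of the ordering this enables in i915_gem_init()
(example_gem_init is a hypothetical stand-in for the relevant excerpt):

static int example_gem_init(struct drm_i915_private *i915)
{
        int err;

        /* phase 1: software-only setup of engine->cops and vfuncs */
        err = intel_engines_setup(i915);
        if (err)
                return err;

        /* the pinned kernel contexts may now rely on the vfuncs */
        err = i915_gem_contexts_init(i915);
        if (err)
                return err;

        /* phase 2: hardware-dependent initialisation */
        return intel_engines_init(i915);
}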

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c          |   6 +
 drivers/gpu/drm/i915/intel_engine_cs.c   |  99 ++++++----
 drivers/gpu/drm/i915/intel_lrc.c         |  74 ++------
 drivers/gpu/drm/i915/intel_lrc.h         |   5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  | 232 +++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   8 +-
 drivers/gpu/drm/i915/intel_workarounds.c |   3 +-
 7 files changed, 207 insertions(+), 220 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 79919e0cf03d..f6cfaf033167 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4548,6 +4548,12 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		goto err_ggtt;
 	}
 
+	ret = intel_engines_setup(dev_priv);
+	if (ret) {
+		GEM_BUG_ON(ret == -EIO);
+		goto err_unlock;
+	}
+
 	ret = i915_gem_contexts_init(dev_priv);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index a7413e74410f..2302746c4ce0 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -48,35 +48,24 @@
 
 struct engine_class_info {
 	const char *name;
-	int (*init_legacy)(struct intel_engine_cs *engine);
-	int (*init_execlists)(struct intel_engine_cs *engine);
-
 	u8 uabi_class;
 };
 
 static const struct engine_class_info intel_engine_classes[] = {
 	[RENDER_CLASS] = {
 		.name = "rcs",
-		.init_execlists = logical_render_ring_init,
-		.init_legacy = intel_init_render_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_RENDER,
 	},
 	[COPY_ENGINE_CLASS] = {
 		.name = "bcs",
-		.init_execlists = logical_xcs_ring_init,
-		.init_legacy = intel_init_blt_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_COPY,
 	},
 	[VIDEO_DECODE_CLASS] = {
 		.name = "vcs",
-		.init_execlists = logical_xcs_ring_init,
-		.init_legacy = intel_init_bsd_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_VIDEO,
 	},
 	[VIDEO_ENHANCEMENT_CLASS] = {
 		.name = "vecs",
-		.init_execlists = logical_xcs_ring_init,
-		.init_legacy = intel_init_vebox_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_VIDEO_ENHANCE,
 	},
 };
@@ -401,48 +390,39 @@ int intel_engines_init_mmio(struct drm_i915_private *dev_priv)
 
 /**
  * intel_engines_init() - init the Engine Command Streamers
- * @dev_priv: i915 device private
+ * @i915: i915 device private
  *
  * Return: non-zero if the initialization failed.
  */
-int intel_engines_init(struct drm_i915_private *dev_priv)
+int intel_engines_init(struct drm_i915_private *i915)
 {
+	int (*init)(struct intel_engine_cs *engine);
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id, err_id;
 	int err;
 
-	for_each_engine(engine, dev_priv, id) {
-		const struct engine_class_info *class_info =
-			&intel_engine_classes[engine->class];
-		int (*init)(struct intel_engine_cs *engine);
-
-		if (HAS_EXECLISTS(dev_priv))
-			init = class_info->init_execlists;
-		else
-			init = class_info->init_legacy;
+	if (HAS_EXECLISTS(i915))
+		init = intel_execlists_submission_init;
+	else
+		init = intel_ring_submission_init;
 
-		err = -EINVAL;
+	for_each_engine(engine, i915, id) {
 		err_id = id;
 
-		if (GEM_DEBUG_WARN_ON(!init))
-			goto cleanup;
-
 		err = init(engine);
 		if (err)
 			goto cleanup;
-
-		GEM_BUG_ON(!engine->submit_request);
 	}
 
 	return 0;
 
 cleanup:
-	for_each_engine(engine, dev_priv, id) {
+	for_each_engine(engine, i915, id) {
 		if (id >= err_id) {
 			kfree(engine);
-			dev_priv->engine[id] = NULL;
+			i915->engine[id] = NULL;
 		} else {
-			dev_priv->gt.cleanup_engine(engine);
+			i915->gt.cleanup_engine(engine);
 		}
 	}
 	return err;
@@ -560,16 +540,7 @@ static int init_status_page(struct intel_engine_cs *engine)
 	return ret;
 }
 
-/**
- * intel_engines_setup_common - setup engine state not requiring hw access
- * @engine: Engine to setup.
- *
- * Initializes @engine@ structure members shared between legacy and execlists
- * submission modes which do not require hardware access.
- *
- * Typically done early in the submission mode specific engine setup stage.
- */
-int intel_engine_setup_common(struct intel_engine_cs *engine)
+static int intel_engine_setup_common(struct intel_engine_cs *engine)
 {
 	int err;
 
@@ -598,6 +569,52 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
 	return err;
 }
 
+/**
+ * intel_engines_setup - setup engine state not requiring hw access
+ * @i915: Device to setup.
+ *
+ * Initializes engine structure members shared between legacy and execlists
+ * submission modes which do not require hardware access.
+ *
+ * Typically done early in the submission mode specific engine setup stage.
+ */
+int intel_engines_setup(struct drm_i915_private *i915)
+{
+	int (*setup)(struct intel_engine_cs *engine);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err;
+
+	if (HAS_EXECLISTS(i915))
+		setup = intel_execlists_submission_setup;
+	else
+		setup = intel_ring_submission_setup;
+
+	for_each_engine(engine, i915, id) {
+		err = intel_engine_setup_common(engine);
+		if (err)
+			goto cleanup;
+
+		err = setup(engine);
+		if (err)
+			goto cleanup;
+
+		GEM_BUG_ON(!engine->cops);
+	}
+
+	return 0;
+
+cleanup:
+	for_each_engine(engine, i915, id) {
+		if (engine->cops)
+			i915->gt.cleanup_engine(engine);
+		else
+			kfree(engine);
+		i915->engine[id] = NULL;
+	}
+	return err;
+}
+
 void intel_engines_set_scheduler_caps(struct drm_i915_private *i915)
 {
 	static const struct {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 66bc3cd4e166..5804703c9d97 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1759,8 +1759,8 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 	unsigned int i;
 	int ret;
 
-	if (GEM_DEBUG_WARN_ON(engine->id != RCS0))
-		return -EINVAL;
+	if (engine->class != RENDER_CLASS)
+		return 0;
 
 	switch (INTEL_GEN(engine->i915)) {
 	case 11:
@@ -2375,15 +2375,8 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
 	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
 }
 
-static int
-logical_ring_setup(struct intel_engine_cs *engine)
+int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 {
-	int err;
-
-	err = intel_engine_setup_common(engine);
-	if (err)
-		return err;
-
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
 
@@ -2393,10 +2386,16 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	logical_ring_default_vfuncs(engine);
 	logical_ring_default_irqs(engine);
 
+	if (engine->class == RENDER_CLASS) {
+		engine->init_context = gen8_init_rcs_context;
+		engine->emit_flush = gen8_emit_flush_render;
+		engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
+	}
+
 	return 0;
 }
 
-static int logical_ring_init(struct intel_engine_cs *engine)
+int intel_execlists_submission_init(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -2407,6 +2406,15 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 		return ret;
 
 	intel_engine_init_workarounds(engine);
+	intel_engine_init_whitelist(engine);
+
+	if (intel_init_workaround_bb(engine))
+		/*
+		 * We continue even if we fail to initialize WA batch
+		 * because we only expect rare glitches but nothing
+		 * critical to prevent us from using GPU
+		 */
+		DRM_ERROR("WA batch buffer initialization failed\n");
 
 	if (HAS_LOGICAL_RING_ELSQ(i915)) {
 		execlists->submit_reg = i915->uncore.regs +
@@ -2434,50 +2442,6 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 	return 0;
 }
 
-int logical_render_ring_init(struct intel_engine_cs *engine)
-{
-	int ret;
-
-	ret = logical_ring_setup(engine);
-	if (ret)
-		return ret;
-
-	/* Override some for render ring. */
-	engine->init_context = gen8_init_rcs_context;
-	engine->emit_flush = gen8_emit_flush_render;
-	engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
-
-	ret = logical_ring_init(engine);
-	if (ret)
-		return ret;
-
-	ret = intel_init_workaround_bb(engine);
-	if (ret) {
-		/*
-		 * We continue even if we fail to initialize WA batch
-		 * because we only expect rare glitches but nothing
-		 * critical to prevent us from using GPU
-		 */
-		DRM_ERROR("WA batch buffer initialization failed: %d\n",
-			  ret);
-	}
-
-	intel_engine_init_whitelist(engine);
-
-	return 0;
-}
-
-int logical_xcs_ring_init(struct intel_engine_cs *engine)
-{
-	int err;
-
-	err = logical_ring_setup(engine);
-	if (err)
-		return err;
-
-	return logical_ring_init(engine);
-}
-
 u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu)
 {
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index f1aec8a6986f..c73fe820b05f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -68,8 +68,9 @@ enum {
 
 /* Logical Rings */
 void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
-int logical_render_ring_init(struct intel_engine_cs *engine);
-int logical_xcs_ring_init(struct intel_engine_cs *engine);
+
+int intel_execlists_submission_setup(struct intel_engine_cs *engine);
+int intel_execlists_submission_init(struct intel_engine_cs *engine);
 
 /* Logical Ring Contexts */
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 359d6af1a0b9..bafbf4735807 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1523,54 +1523,6 @@ static const struct intel_context_ops ring_context_ops = {
 	.destroy = ring_context_destroy,
 };
 
-static int intel_init_ring_buffer(struct intel_engine_cs *engine)
-{
-	struct i915_timeline *timeline;
-	struct intel_ring *ring;
-	int err;
-
-	err = intel_engine_setup_common(engine);
-	if (err)
-		return err;
-
-	timeline = i915_timeline_create(engine->i915, engine->status_page.vma);
-	if (IS_ERR(timeline)) {
-		err = PTR_ERR(timeline);
-		goto err;
-	}
-	GEM_BUG_ON(timeline->has_initial_breadcrumb);
-
-	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
-	i915_timeline_put(timeline);
-	if (IS_ERR(ring)) {
-		err = PTR_ERR(ring);
-		goto err;
-	}
-
-	err = intel_ring_pin(ring);
-	if (err)
-		goto err_ring;
-
-	GEM_BUG_ON(engine->buffer);
-	engine->buffer = ring;
-
-	err = intel_engine_init_common(engine);
-	if (err)
-		goto err_unpin;
-
-	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
-
-	return 0;
-
-err_unpin:
-	intel_ring_unpin(ring);
-err_ring:
-	intel_ring_put(ring);
-err:
-	intel_engine_cleanup_common(engine);
-	return err;
-}
-
 void intel_engine_cleanup(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
@@ -2179,24 +2131,6 @@ static int gen6_ring_flush(struct i915_request *rq, u32 mode)
 	return gen6_flush_dw(rq, mode, MI_INVALIDATE_TLB);
 }
 
-static void intel_ring_init_irq(struct drm_i915_private *dev_priv,
-				struct intel_engine_cs *engine)
-{
-	if (INTEL_GEN(dev_priv) >= 6) {
-		engine->irq_enable = gen6_irq_enable;
-		engine->irq_disable = gen6_irq_disable;
-	} else if (INTEL_GEN(dev_priv) >= 5) {
-		engine->irq_enable = gen5_irq_enable;
-		engine->irq_disable = gen5_irq_disable;
-	} else if (INTEL_GEN(dev_priv) >= 3) {
-		engine->irq_enable = i9xx_irq_enable;
-		engine->irq_disable = i9xx_irq_disable;
-	} else {
-		engine->irq_enable = i8xx_irq_enable;
-		engine->irq_disable = i8xx_irq_disable;
-	}
-}
-
 static void i9xx_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = i9xx_submit_request;
@@ -2212,13 +2146,33 @@ static void gen6_bsd_set_default_submission(struct intel_engine_cs *engine)
 	engine->submit_request = gen6_bsd_submit_request;
 }
 
-static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
-				      struct intel_engine_cs *engine)
+static void setup_irq(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *i915 = engine->i915;
+
+	if (INTEL_GEN(i915) >= 6) {
+		engine->irq_enable = gen6_irq_enable;
+		engine->irq_disable = gen6_irq_disable;
+	} else if (INTEL_GEN(i915) >= 5) {
+		engine->irq_enable = gen5_irq_enable;
+		engine->irq_disable = gen5_irq_disable;
+	} else if (INTEL_GEN(i915) >= 3) {
+		engine->irq_enable = i9xx_irq_enable;
+		engine->irq_disable = i9xx_irq_disable;
+	} else {
+		engine->irq_enable = i8xx_irq_enable;
+		engine->irq_disable = i8xx_irq_disable;
+	}
+}
+
+static void setup_xcs(struct intel_engine_cs *engine)
 {
+	struct drm_i915_private *i915 = engine->i915;
+
 	/* gen8+ are only supported with execlists */
-	GEM_BUG_ON(INTEL_GEN(dev_priv) >= 8);
+	GEM_BUG_ON(INTEL_GEN(i915) >= 8);
 
-	intel_ring_init_irq(dev_priv, engine);
+	setup_irq(engine);
 
 	engine->init_hw = init_ring_common;
 	engine->reset.prepare = reset_prepare;
@@ -2234,117 +2188,96 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 	 * engine->emit_init_breadcrumb().
 	 */
 	engine->emit_fini_breadcrumb = i9xx_emit_breadcrumb;
-	if (IS_GEN(dev_priv, 5))
+	if (IS_GEN(i915, 5))
 		engine->emit_fini_breadcrumb = gen5_emit_breadcrumb;
 
 	engine->set_default_submission = i9xx_set_default_submission;
 
-	if (INTEL_GEN(dev_priv) >= 6)
+	if (INTEL_GEN(i915) >= 6)
 		engine->emit_bb_start = gen6_emit_bb_start;
-	else if (INTEL_GEN(dev_priv) >= 4)
+	else if (INTEL_GEN(i915) >= 4)
 		engine->emit_bb_start = i965_emit_bb_start;
-	else if (IS_I830(dev_priv) || IS_I845G(dev_priv))
+	else if (IS_I830(i915) || IS_I845G(i915))
 		engine->emit_bb_start = i830_emit_bb_start;
 	else
 		engine->emit_bb_start = i915_emit_bb_start;
 }
 
-int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
+static void setup_rcs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	intel_ring_default_vfuncs(dev_priv, engine);
+	struct drm_i915_private *i915 = engine->i915;
 
-	if (HAS_L3_DPF(dev_priv))
+	if (HAS_L3_DPF(i915))
 		engine->irq_keep_mask = GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 
 	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 
-	if (INTEL_GEN(dev_priv) >= 7) {
+	if (INTEL_GEN(i915) >= 7) {
 		engine->init_context = intel_rcs_ctx_init;
 		engine->emit_flush = gen7_render_ring_flush;
 		engine->emit_fini_breadcrumb = gen7_rcs_emit_breadcrumb;
-	} else if (IS_GEN(dev_priv, 6)) {
+	} else if (IS_GEN(i915, 6)) {
 		engine->init_context = intel_rcs_ctx_init;
 		engine->emit_flush = gen6_render_ring_flush;
 		engine->emit_fini_breadcrumb = gen6_rcs_emit_breadcrumb;
-	} else if (IS_GEN(dev_priv, 5)) {
+	} else if (IS_GEN(i915, 5)) {
 		engine->emit_flush = gen4_render_ring_flush;
 	} else {
-		if (INTEL_GEN(dev_priv) < 4)
+		if (INTEL_GEN(i915) < 4)
 			engine->emit_flush = gen2_render_ring_flush;
 		else
 			engine->emit_flush = gen4_render_ring_flush;
 		engine->irq_enable_mask = I915_USER_INTERRUPT;
 	}
 
-	if (IS_HASWELL(dev_priv))
+	if (IS_HASWELL(i915))
 		engine->emit_bb_start = hsw_emit_bb_start;
 
 	engine->init_hw = init_render_ring;
-
-	ret = intel_init_ring_buffer(engine);
-	if (ret)
-		return ret;
-
-	return 0;
 }
 
-int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine)
+static void setup_vcs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	intel_ring_default_vfuncs(dev_priv, engine);
+	struct drm_i915_private *i915 = engine->i915;
 
-	if (INTEL_GEN(dev_priv) >= 6) {
+	if (INTEL_GEN(i915) >= 6) {
 		/* gen6 bsd needs a special wa for tail updates */
-		if (IS_GEN(dev_priv, 6))
+		if (IS_GEN(i915, 6))
 			engine->set_default_submission = gen6_bsd_set_default_submission;
 		engine->emit_flush = gen6_bsd_ring_flush;
 		engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
 
-		if (IS_GEN(dev_priv, 6))
+		if (IS_GEN(i915, 6))
 			engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
 		else
 			engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
 	} else {
 		engine->emit_flush = bsd_ring_flush;
-		if (IS_GEN(dev_priv, 5))
+		if (IS_GEN(i915, 5))
 			engine->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
 		else
 			engine->irq_enable_mask = I915_BSD_USER_INTERRUPT;
 	}
-
-	return intel_init_ring_buffer(engine);
 }
 
-int intel_init_blt_ring_buffer(struct intel_engine_cs *engine)
+static void setup_bcs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	GEM_BUG_ON(INTEL_GEN(dev_priv) < 6);
-
-	intel_ring_default_vfuncs(dev_priv, engine);
+	struct drm_i915_private *i915 = engine->i915;
 
 	engine->emit_flush = gen6_ring_flush;
 	engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 
-	if (IS_GEN(dev_priv, 6))
+	if (IS_GEN(i915, 6))
 		engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
 	else
 		engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
-
-	return intel_init_ring_buffer(engine);
 }
 
-int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
+static void setup_vecs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	GEM_BUG_ON(INTEL_GEN(dev_priv) < 7);
+	struct drm_i915_private *i915 = engine->i915;
 
-	intel_ring_default_vfuncs(dev_priv, engine);
+	GEM_BUG_ON(INTEL_GEN(i915) < 7);
 
 	engine->emit_flush = gen6_ring_flush;
 	engine->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
@@ -2352,6 +2285,73 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 	engine->irq_disable = hsw_vebox_irq_disable;
 
 	engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
+}
+
+int intel_ring_submission_setup(struct intel_engine_cs *engine)
+{
+	setup_xcs(engine);
+
+	switch (engine->class) {
+	case RENDER_CLASS:
+		setup_rcs(engine);
+		break;
+	case VIDEO_DECODE_CLASS:
+		setup_vcs(engine);
+		break;
+	case COPY_ENGINE_CLASS:
+		setup_bcs(engine);
+		break;
+	case VIDEO_ENHANCEMENT_CLASS:
+		setup_vecs(engine);
+		break;
+	default:
+		MISSING_CASE(engine->class);
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+int intel_ring_submission_init(struct intel_engine_cs *engine)
+{
+	struct i915_timeline *timeline;
+	struct intel_ring *ring;
+	int err;
+
+	timeline = i915_timeline_create(engine->i915, engine->status_page.vma);
+	if (IS_ERR(timeline)) {
+		err = PTR_ERR(timeline);
+		goto err;
+	}
+	GEM_BUG_ON(timeline->has_initial_breadcrumb);
+
+	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
+	i915_timeline_put(timeline);
+	if (IS_ERR(ring)) {
+		err = PTR_ERR(ring);
+		goto err;
+	}
+
+	err = intel_ring_pin(ring);
+	if (err)
+		goto err_ring;
 
-	return intel_init_ring_buffer(engine);
+	GEM_BUG_ON(engine->buffer);
+	engine->buffer = ring;
+
+	err = intel_engine_init_common(engine);
+	if (err)
+		goto err_unpin;
+
+	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
+
+	return 0;
+
+err_unpin:
+	intel_ring_unpin(ring);
+err_ring:
+	intel_ring_put(ring);
+err:
+	intel_engine_cleanup_common(engine);
+	return err;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a02c92dac5da..81a07cdcb369 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -343,14 +343,12 @@ __intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
 	return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
-int intel_engine_setup_common(struct intel_engine_cs *engine);
+int intel_engines_setup(struct drm_i915_private *i915);
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 
-int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
-int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
-int intel_init_blt_ring_buffer(struct intel_engine_cs *engine);
-int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine);
+int intel_ring_submission_setup(struct intel_engine_cs *engine);
+int intel_ring_submission_init(struct intel_engine_cs *engine);
 
 int intel_engine_stop_cs(struct intel_engine_cs *engine);
 void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index e758bbf50617..a47555931da8 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -1060,7 +1060,8 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
 	struct drm_i915_private *i915 = engine->i915;
 	struct i915_wa_list *w = &engine->whitelist;
 
-	GEM_BUG_ON(engine->id != RCS0);
+	if (engine->class != RENDER_CLASS)
+		return;
 
 	wa_init_start(w, "whitelist");
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/22] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (12 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 13/22] drm/i915: Split engine setup/init into two phases Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 15/22] drm/i915: Move i915_request_alloc into selftests/ Chris Wilson
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

We switched to a tree of per-engine HW contexts to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key for looking up the HW context. Just
allocate a logical per-engine instance and always use an index into
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a
user-specified map.
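
For illustration, a hypothetical sketch (not part of the patch; error
handling trimmed, and "rcs0" stands for the render engine's struct
intel_engine_cs) of what the array can express that the engine-keyed
rbtree could not:

	/*
	 * Hypothetical sketch: two slots holding logically distinct HW
	 * contexts for the same physical engine, i.e. [rcs0, rcs0]. An
	 * rbtree keyed on the engine pointer holds at most one node per
	 * engine, so it could not represent the second slot.
	 */
	static struct i915_gem_engines *
	make_dual_rcs(struct i915_gem_context *ctx,
		      struct intel_engine_cs *rcs0)
	{
		struct i915_gem_engines *e;

		e = kzalloc(struct_size(e, engines, 2), GFP_KERNEL);
		if (!e)
			return ERR_PTR(-ENOMEM);

		e->engines[0] = intel_context_create(ctx, rcs0); /* render */
		e->engines[1] = intel_context_create(ctx, rcs0); /* compute */
		e->num_engines = 2;

		return e;
	}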

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gvt/scheduler.c          |   2 +-
 drivers/gpu/drm/i915/i915_gem.c               |  41 ++++---
 drivers/gpu/drm/i915/i915_gem_context.c       |  82 +++++++++++---
 drivers/gpu/drm/i915/i915_gem_context.h       |  42 ++++++++
 drivers/gpu/drm/i915/i915_gem_context_types.h |  25 ++++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    |  63 +++++------
 drivers/gpu/drm/i915/i915_perf.c              |  85 ++++++++-------
 drivers/gpu/drm/i915/i915_request.c           |   2 +-
 drivers/gpu/drm/i915/intel_context.c          | 101 ++----------------
 drivers/gpu/drm/i915/intel_context.h          |  25 +----
 drivers/gpu/drm/i915/intel_context_types.h    |   2 -
 drivers/gpu/drm/i915/intel_engine_cs.c        |   2 +-
 drivers/gpu/drm/i915/intel_guc_submission.c   |  24 +++--
 .../gpu/drm/i915/selftests/i915_gem_context.c |   2 +-
 drivers/gpu/drm/i915/selftests/mock_context.c |  14 ++-
 15 files changed, 278 insertions(+), 234 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index ae1f09d2d4ae..822b606b3078 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -1191,7 +1191,7 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		INIT_LIST_HEAD(&s->workload_q_head[i]);
 		s->shadow[i] = ERR_PTR(-EINVAL);
 
-		ce = intel_context_instance(ctx, engine);
+		ce = i915_gem_context_get_engine(ctx, i);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_shadow_ctx;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f6cfaf033167..6f615be007d7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4334,8 +4334,9 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 
 static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 {
-	struct i915_gem_context *ctx;
 	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
 	enum intel_engine_id id;
 	int err = 0;
 
@@ -4352,40 +4353,45 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	e = i915_gem_context_engine_list_lock(ctx);
+
+	i915_gem_unpark(i915);
 	for_each_engine(engine, i915, id) {
+		struct intel_context *ce = e->engines[id];
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, ctx);
+		err = intel_context_pin(ce);
+		if (err)
+			goto err_active;
+
+		rq = i915_request_create(ce);
+		intel_context_unpin(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto out_ctx;
+			goto err_active;
 		}
 
 		err = 0;
-		if (engine->init_context)
-			err = engine->init_context(rq);
+		if (rq->engine->init_context)
+			err = rq->engine->init_context(rq);
 
 		i915_request_add(rq);
 		if (err)
 			goto err_active;
 	}
+	i915_gem_park(i915);
 
 	/* Flush the default context image to memory, and enable powersaving. */
 	if (!i915_gem_load_power_context(i915)) {
 		err = -EIO;
-		goto err_active;
+		goto err_parked;
 	}
 
 	for_each_engine(engine, i915, id) {
-		struct intel_context *ce;
-		struct i915_vma *state;
+		struct intel_context *ce = e->engines[id];
+		struct i915_vma *state = ce->state;
 		void *vaddr;
 
-		ce = intel_context_lookup(ctx, engine);
-		if (!ce)
-			continue;
-
-		state = ce->state;
 		if (!state)
 			continue;
 
@@ -4401,11 +4407,11 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 		 */
 		err = i915_vma_unbind(state);
 		if (err)
-			goto err_active;
+			goto err_parked;
 
 		err = i915_gem_object_set_to_cpu_domain(state->obj, false);
 		if (err)
-			goto err_active;
+			goto err_parked;
 
 		engine->default_state = i915_gem_object_get(state->obj);
 		i915_gem_object_set_cache_coherency(engine->default_state,
@@ -4416,7 +4422,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 						I915_MAP_FORCE_WB);
 		if (IS_ERR(vaddr)) {
 			err = PTR_ERR(vaddr);
-			goto err_active;
+			goto err_parked;
 		}
 
 		i915_gem_object_unpin_map(engine->default_state);
@@ -4441,11 +4447,14 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 out_ctx:
+	i915_gem_context_engine_list_unlock(ctx);
 	i915_gem_context_set_closed(ctx);
 	i915_gem_context_put(ctx);
 	return err;
 
 err_active:
+	i915_gem_park(i915);
+err_parked:
 	/*
 	 * If we have to abandon now, we expect the engines to be idle
 	 * and ready to be torn-down. The quickest way we can accomplish
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c6a15b958166..4372e244b005 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -145,7 +145,7 @@ lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
 	if (!engine)
 		return ERR_PTR(-EINVAL);
 
-	return intel_context_instance(ctx, engine);
+	return i915_gem_context_get_engine(ctx, engine->id);
 }
 
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
@@ -237,10 +237,54 @@ static void release_hw_id(struct i915_gem_context *ctx)
 	mutex_unlock(&i915->contexts.mutex);
 }
 
-static void i915_gem_context_free(struct i915_gem_context *ctx)
+static void free_engines(struct i915_gem_engines *e)
+{
+	unsigned int n;
+
+	for (n = 0; n < e->num_engines; n++) {
+		if (!e->engines[n])
+			continue;
+
+		intel_context_put(e->engines[n]);
+	}
+	kfree(e);
+}
+
+static void free_engines_n(struct i915_gem_engines *e, unsigned long n)
 {
-	struct intel_context *it, *n;
+	e->num_engines = n;
+	free_engines(e);
+}
 
+static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
+{
+	struct intel_engine_cs *engine;
+	struct i915_gem_engines *e;
+	enum intel_engine_id id;
+
+	e = kzalloc(struct_size(e, engines, I915_NUM_ENGINES), GFP_KERNEL);
+	if (!e)
+		return ERR_PTR(-ENOMEM);
+
+	init_rcu_head(&e->rcu);
+	for_each_engine(engine, ctx->i915, id) {
+		struct intel_context *ce;
+
+		ce = intel_context_create(ctx, engine);
+		if (IS_ERR(ce)) {
+			free_engines_n(e, id);
+			return ERR_CAST(ce);
+		}
+
+		e->engines[id] = ce;
+	}
+	e->num_engines = id;
+
+	return e;
+}
+
+static void i915_gem_context_free(struct i915_gem_context *ctx)
+{
 	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 	GEM_BUG_ON(!list_empty(&ctx->active_engines));
@@ -248,8 +292,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	release_hw_id(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
 
-	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
-		intel_context_put(it);
+	free_engines(ctx->engines);
+	mutex_destroy(&ctx->engines_mutex);
 
 	if (ctx->timeline)
 		i915_timeline_put(ctx->timeline);
@@ -358,6 +402,8 @@ static struct i915_gem_context *
 __create_context(struct drm_i915_private *dev_priv)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
+	int err;
 	int i;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -371,8 +417,13 @@ __create_context(struct drm_i915_private *dev_priv)
 	INIT_LIST_HEAD(&ctx->active_engines);
 	mutex_init(&ctx->mutex);
 
-	ctx->hw_contexts = RB_ROOT;
-	spin_lock_init(&ctx->hw_contexts_lock);
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e)) {
+		err = PTR_ERR(e);
+		goto err_free;
+	}
+	RCU_INIT_POINTER(ctx->engines, e);
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -394,6 +445,10 @@ __create_context(struct drm_i915_private *dev_priv)
 		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
 
 	return ctx;
+
+err_free:
+	kfree(ctx);
+	return ERR_PTR(err);
 }
 
 static struct i915_hw_ppgtt *
@@ -875,7 +930,8 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 {
 	struct drm_i915_private *i915 = ctx->i915;
 	struct context_barrier_task *cb;
-	struct intel_context *ce, *next;
+	struct i915_gem_engines *e;
+	unsigned int i;
 	int err = 0;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -889,15 +945,16 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	i915_active_acquire(&cb->base);
 
 	i915_gem_unpark(i915);
-	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
-		struct intel_engine_cs *engine = ce->engine;
+	e = i915_gem_context_engine_list_lock(ctx);
+	for (i = 0; i < e->num_engines; i++) {
+		struct intel_context *ce = e->engines[i];
 		struct i915_request *rq;
 
-		if (!(engine->mask & engines))
+		if (!ce || !(ce->engine->mask & engines))
 			continue;
 
 		if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-				       engine->mask)) {
+				       ce->engine->mask)) {
 			err = -ENXIO;
 			break;
 		}
@@ -918,6 +975,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 		if (err)
 			break;
 	}
+	i915_gem_context_engine_list_unlock(ctx);
 	i915_gem_park(i915);
 
 	cb->task = err ? NULL : task; /* caller needs to unwind instead */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index edc6ba3f0288..6a336a8a2324 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -179,6 +179,48 @@ static inline void i915_gem_context_put(struct i915_gem_context *ctx)
 	kref_put(&ctx->ref, i915_gem_context_release);
 }
 
+static inline struct i915_gem_engines *
+i915_gem_context_engine_list(struct i915_gem_context *ctx)
+{
+	return rcu_dereference_protected(ctx->engines,
+					 lockdep_is_held(&ctx->engines_mutex));
+}
+
+static inline struct i915_gem_engines *
+i915_gem_context_engine_list_lock(struct i915_gem_context *ctx)
+	__acquires(&ctx->engines_mutex)
+{
+	mutex_lock(&ctx->engines_mutex);
+	return i915_gem_context_engine_list(ctx);
+}
+
+static inline void
+i915_gem_context_engine_list_unlock(struct i915_gem_context *ctx)
+	__releases(&ctx->engines_mutex)
+{
+	mutex_unlock(&ctx->engines_mutex);
+}
+
+static inline struct intel_context *
+i915_gem_context_lookup_engine(struct i915_gem_context *ctx, unsigned long idx)
+{
+	return i915_gem_context_engine_list(ctx)->engines[idx];
+}
+
+static inline struct intel_context *
+i915_gem_context_get_engine(struct i915_gem_context *ctx, unsigned long idx)
+{
+	struct intel_context *ce = ERR_PTR(-EINVAL);
+
+	rcu_read_lock(); {
+		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
+		if (likely(idx < e->num_engines && e->engines[idx]))
+			ce = intel_context_get(e->engines[idx]);
+	} rcu_read_unlock();
+
+	return ce;
+}
+
 struct i915_lut_handle *i915_lut_handle_alloc(void);
 void i915_lut_handle_free(struct i915_lut_handle *lut);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index e2ec58b10fb2..9a4104505fa6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -28,6 +28,12 @@ struct i915_hw_ppgtt;
 struct i915_timeline;
 struct intel_ring;
 
+struct i915_gem_engines {
+	struct rcu_head rcu;
+	unsigned long num_engines;
+	struct intel_context *engines[];
+};
+
 /**
  * struct i915_gem_context - client state
  *
@@ -41,6 +47,21 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	/**
+	 * @engines: User defined engines for this context
+	 *
+	 * NULL means to use legacy definitions (including random meaning of
+	 * I915_EXEC_BSD with I915_EXEC_BSD_SELECTOR overrides).
+	 *
+	 * If defined, execbuf uses the I915_EXEC_MASK as an index into
+	 * this array, and various other uAPI gain the ability to look
+	 * up an index from this array to select an engine to operate on.
+	 *
+	 * User defined by I915_CONTEXT_PARAM_ENGINE.
+	 */
+	struct i915_gem_engines * __rcu engines;
+	struct mutex engines_mutex; /* guards writes to engines */
+
 	struct i915_timeline *timeline;
 
 	/**
@@ -133,10 +154,6 @@ struct i915_gem_context {
 
 	struct i915_sched_attr sched;
 
-	/** hw_contexts: per-engine logical HW state */
-	struct rb_root hw_contexts;
-	spinlock_t hw_contexts_lock;
-
 	/** ring_size: size for allocating the per-engine ring buffer */
 	u32 ring_size;
 	/** desc_template: invariant fields for the HW context descriptor */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index cf7aa0e325d2..45c086451397 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2085,10 +2085,8 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static int eb_pin_context(struct i915_execbuffer *eb,
-			  struct intel_engine_cs *engine)
+static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce)
 {
-	struct intel_context *ce;
 	int err;
 
 	/*
@@ -2099,21 +2097,16 @@ static int eb_pin_context(struct i915_execbuffer *eb,
 	if (err)
 		return err;
 
-	ce = intel_context_instance(eb->gem_context, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
-
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
 	err = intel_context_pin(ce);
-	intel_context_put(ce);
 	if (err)
 		return err;
 
-	eb->engine = engine;
+	eb->engine = ce->engine;
 	eb->context = ce;
 	return 0;
 }
@@ -2123,24 +2116,18 @@ static void eb_unpin_context(struct i915_execbuffer *eb)
 	intel_context_unpin(eb->context);
 }
 
-static int
-eb_select_engine(struct i915_execbuffer *eb,
-		 struct drm_file *file,
-		 struct drm_i915_gem_execbuffer2 *args)
+static unsigned int
+eb_select_legacy_ring(struct i915_execbuffer *eb,
+		      struct drm_file *file,
+		      struct drm_i915_gem_execbuffer2 *args)
 {
 	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
-	struct intel_engine_cs *engine;
-
-	if (user_ring_id > I915_USER_RINGS) {
-		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
-		return -EINVAL;
-	}
 
-	if ((user_ring_id != I915_EXEC_BSD) &&
-	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
+	if (user_ring_id != I915_EXEC_BSD &&
+	    (args->flags & I915_EXEC_BSD_MASK)) {
 		DRM_DEBUG("execbuf with non bsd ring but with invalid "
 			  "bsd dispatch flags: %d\n", (int)(args->flags));
-		return -EINVAL;
+		return -1;
 	}
 
 	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
@@ -2155,20 +2142,34 @@ eb_select_engine(struct i915_execbuffer *eb,
 		} else {
 			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
 				  bsd_idx);
-			return -EINVAL;
+			return -1;
 		}
 
-		engine = eb->i915->engine[_VCS(bsd_idx)];
-	} else {
-		engine = eb->i915->engine[user_ring_map[user_ring_id]];
+		return _VCS(bsd_idx);
 	}
 
-	if (!engine) {
-		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
-		return -EINVAL;
-	}
+	return user_ring_map[user_ring_id];
+}
 
-	return eb_pin_context(eb, engine);
+static int
+eb_select_engine(struct i915_execbuffer *eb,
+		 struct drm_file *file,
+		 struct drm_i915_gem_execbuffer2 *args)
+{
+	struct intel_context *ce;
+	unsigned int idx;
+	int err;
+
+	idx = eb_select_legacy_ring(eb, file, args);
+
+	ce = i915_gem_context_get_engine(eb->gem_context, idx);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	err = eb_pin_context(eb, ce);
+	intel_context_put(ce);
+
+	return err;
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 28475cbbdcbb..34979a017afb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1203,35 +1203,37 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 					    struct i915_gem_context *ctx)
 {
-	struct intel_engine_cs *engine = i915->engine[RCS0];
-	struct intel_context *ce;
+	struct i915_gem_engines *e;
+	unsigned int i;
 	int err;
 
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
-
 	err = i915_mutex_lock_interruptible(&i915->drm);
-	if (err) {
-		intel_context_put(ce);
+	if (err)
 		return ERR_PTR(err);
-	}
 
-	/*
-	 * As the ID is the gtt offset of the context's vma we
-	 * pin the vma to ensure the ID remains fixed.
-	 *
-	 * NB: implied RCS engine...
-	 */
-	err = intel_context_pin(ce);
+	e = i915_gem_context_engine_list_lock(ctx);
+	for (i = 0; i < e->num_engines; i++) {
+		struct intel_context *ce = e->engines[i];
+
+		if (ce->engine->class != RENDER_CLASS)
+			continue;
+
+		/*
+		 * As the ID is the gtt offset of the context's vma we
+		 * pin the vma to ensure the ID remains fixed.
+		 */
+		err = intel_context_pin(ce);
+		if (err == 0) {
+			i915->perf.oa.pinned_ctx = ce;
+			break;
+		}
+	}
+	i915_gem_context_engine_list_unlock(ctx);
 	mutex_unlock(&i915->drm.struct_mutex);
-	intel_context_put(ce);
 	if (err)
 		return ERR_PTR(err);
 
-	i915->perf.oa.pinned_ctx = ce;
-
-	return ce;
+	return i915->perf.oa.pinned_ctx;
 }
 
 /**
@@ -1717,10 +1719,10 @@ gen8_update_reg_state_unlocked(struct intel_context *ce,
 static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 				       const struct i915_oa_config *oa_config)
 {
-	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
 	unsigned int map_type = i915_coherent_map_type(dev_priv);
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
+	unsigned int i;
 	int ret;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
@@ -1747,32 +1749,43 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
-		struct intel_context *ce = intel_context_lookup(ctx, engine);
-		u32 *regs;
+		struct i915_gem_engines *e =
+			i915_gem_context_engine_list_lock(ctx);
 
-		/* OA settings will be set upon first use */
-		if (!ce || !ce->state)
-			continue;
+		for (i = 0; i < e->num_engines; i++) {
+			struct intel_context *ce = e->engines[i];
+			u32 *regs;
 
-		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
-		if (IS_ERR(regs)) {
-			ret = PTR_ERR(regs);
-			goto out_pm;
-		}
+			if (!ce || ce->engine->class != RENDER_CLASS)
+				continue;
+
+			/* OA settings will be set upon first use */
+			if (!ce->state)
+				continue;
+
+			regs = i915_gem_object_pin_map(ce->state->obj,
+						       map_type);
+			if (IS_ERR(regs)) {
+				ret = PTR_ERR(regs);
+				goto out_pm;
+			}
+
+			ce->state->obj->mm.dirty = true;
+			regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
 
-		ce->state->obj->mm.dirty = true;
-		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
+			gen8_update_reg_state_unlocked(ce, regs, oa_config);
 
-		gen8_update_reg_state_unlocked(ce, regs, oa_config);
+			i915_gem_object_unpin_map(ce->state->obj);
+		}
 
-		i915_gem_object_unpin_map(ce->state->obj);
+		i915_gem_context_engine_list_unlock(ctx);
 	}
 
 	/*
 	 * Apply the configuration by doing one context restore of the edited
 	 * context image.
 	 */
-	rq = i915_request_create(engine->kernel_context);
+	rq = i915_request_create(dev_priv->engine[RCS0]->kernel_context);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 		goto out_pm;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index fe8db5ef0ded..dda856f2a012 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -755,7 +755,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_instance(ctx, engine);
+	ce = i915_gem_context_get_engine(ctx, engine->id);
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
index 65031744c0a8..fd0ba616311d 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/intel_context.c
@@ -15,115 +15,28 @@ static struct i915_global_context {
 	struct kmem_cache *slab_ce;
 } global;
 
-struct intel_context *intel_context_alloc(void)
-{
-	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
-}
-
 void intel_context_free(struct intel_context *ce)
 {
 	kmem_cache_free(global.slab_ce, ce);
 }
 
-struct intel_context *
-intel_context_lookup(struct i915_gem_context *ctx,
-		     struct intel_engine_cs *engine)
-{
-	struct intel_context *ce = NULL;
-	struct rb_node *p;
-
-	spin_lock(&ctx->hw_contexts_lock);
-	p = ctx->hw_contexts.rb_node;
-	while (p) {
-		struct intel_context *this =
-			rb_entry(p, struct intel_context, node);
-
-		if (this->engine == engine) {
-			GEM_BUG_ON(this->gem_context != ctx);
-			ce = this;
-			break;
-		}
-
-		if (this->engine < engine)
-			p = p->rb_right;
-		else
-			p = p->rb_left;
-	}
-	spin_unlock(&ctx->hw_contexts_lock);
-
-	return ce;
-}
-
-struct intel_context *
-__intel_context_insert(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       struct intel_context *ce)
-{
-	struct rb_node **p, *parent;
-	int err = 0;
-
-	spin_lock(&ctx->hw_contexts_lock);
-
-	parent = NULL;
-	p = &ctx->hw_contexts.rb_node;
-	while (*p) {
-		struct intel_context *this;
-
-		parent = *p;
-		this = rb_entry(parent, struct intel_context, node);
-
-		if (this->engine == engine) {
-			err = -EEXIST;
-			ce = this;
-			break;
-		}
-
-		if (this->engine < engine)
-			p = &parent->rb_right;
-		else
-			p = &parent->rb_left;
-	}
-	if (!err) {
-		rb_link_node(&ce->node, parent, p);
-		rb_insert_color(&ce->node, &ctx->hw_contexts);
-	}
-
-	spin_unlock(&ctx->hw_contexts_lock);
-
-	return ce;
-}
-
-void __intel_context_remove(struct intel_context *ce)
+static struct intel_context *intel_context_alloc(void)
 {
-	struct i915_gem_context *ctx = ce->gem_context;
-
-	spin_lock(&ctx->hw_contexts_lock);
-	rb_erase(&ce->node, &ctx->hw_contexts);
-	spin_unlock(&ctx->hw_contexts_lock);
+	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
 }
 
 struct intel_context *
-intel_context_instance(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine)
+intel_context_create(struct i915_gem_context *ctx,
+		     struct intel_engine_cs *engine)
 {
-	struct intel_context *ce, *pos;
-
-	ce = intel_context_lookup(ctx, engine);
-	if (likely(ce))
-		return intel_context_get(ce);
+	struct intel_context *ce;
 
 	ce = intel_context_alloc();
 	if (!ce)
 		return ERR_PTR(-ENOMEM);
 
 	intel_context_init(ce, ctx, engine);
-
-	pos = __intel_context_insert(ctx, engine, ce);
-	if (unlikely(pos != ce)) /* Beaten! Use their HW context instead */
-		intel_context_free(ce);
-
-	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
-	return intel_context_get(pos);
+	return ce;
 }
 
 int __intel_context_do_pin(struct intel_context *ce)
@@ -199,6 +112,8 @@ intel_context_init(struct intel_context *ce,
 		   struct i915_gem_context *ctx,
 		   struct intel_engine_cs *engine)
 {
+	GEM_BUG_ON(!engine->cops);
+
 	kref_init(&ce->ref);
 
 	ce->gem_context = ctx;
diff --git a/drivers/gpu/drm/i915/intel_context.h b/drivers/gpu/drm/i915/intel_context.h
index 3b3b14190ce7..460b5c34cede 100644
--- a/drivers/gpu/drm/i915/intel_context.h
+++ b/drivers/gpu/drm/i915/intel_context.h
@@ -12,24 +12,16 @@
 #include "intel_context_types.h"
 #include "intel_engine_types.h"
 
-struct intel_context *intel_context_alloc(void);
-void intel_context_free(struct intel_context *ce);
-
 void intel_context_init(struct intel_context *ce,
 			struct i915_gem_context *ctx,
 			struct intel_engine_cs *engine);
 
-/**
- * intel_context_lookup - Find the matching HW context for this (ctx, engine)
- * @ctx - the parent GEM context
- * @engine - the target HW engine
- *
- * May return NULL if the HW context hasn't been instantiated (i.e. unused).
- */
 struct intel_context *
-intel_context_lookup(struct i915_gem_context *ctx,
+intel_context_create(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine);
 
+void intel_context_free(struct intel_context *ce);
+
 /**
  * intel_context_pin_lock - Stablises the 'pinned' status of the HW context
  * @ctx - the parent GEM context
@@ -59,17 +51,6 @@ intel_context_is_pinned(struct intel_context *ce)
 
 void intel_context_pin_unlock(struct intel_context *ce);
 
-struct intel_context *
-__intel_context_insert(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       struct intel_context *ce);
-void
-__intel_context_remove(struct intel_context *ce);
-
-struct intel_context *
-intel_context_instance(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine);
-
 int __intel_context_do_pin(struct intel_context *ce);
 
 static inline int intel_context_pin(struct intel_context *ce)
diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/intel_context_types.h
index 624729a35875..9cd7959529fb 100644
--- a/drivers/gpu/drm/i915/intel_context_types.h
+++ b/drivers/gpu/drm/i915/intel_context_types.h
@@ -10,7 +10,6 @@
 #include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
-#include <linux/rbtree.h>
 #include <linux/types.h>
 
 #include "i915_active_types.h"
@@ -64,7 +63,6 @@ struct intel_context {
 	struct i915_active_request active_tracker;
 
 	const struct intel_context_ops *ops;
-	struct rb_node node;
 
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 2302746c4ce0..a578403e3436 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -712,7 +712,7 @@ static int pin_context(struct i915_gem_context *ctx,
 	struct intel_context *ce;
 	int err;
 
-	ce = intel_context_instance(ctx, engine);
+	ce = i915_gem_context_get_engine(ctx, engine->id);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 30dd6706a1d2..87bb776c2c7c 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -363,11 +363,10 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
 static void guc_stage_desc_init(struct intel_guc_client *client)
 {
 	struct intel_guc *guc = client->guc;
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx = client->owner;
+	struct i915_gem_engines *e;
 	struct guc_stage_desc *desc;
-	unsigned int tmp;
+	unsigned int i;
 	u32 gfx_addr;
 
 	desc = __get_stage_desc(client);
@@ -381,10 +380,13 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
 
-	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-		struct intel_context *ce = intel_context_lookup(ctx, engine);
-		u32 guc_engine_id = engine->guc_id;
-		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
+	e = i915_gem_context_engine_list_lock(ctx);
+	for (i = 0; i < e->num_engines; i++) {
+		struct intel_context *ce = e->engines[i];
+		struct guc_execlist_context *lrc;
+
+		if (!ce || !(ce->engine->mask & client->engines))
+			continue;
 
 		/* TODO: We have a design issue to be solved here. Only when we
 		 * receive the first batch, we know which engine is used by the
@@ -393,7 +395,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * for now who owns a GuC client. But for future owner of GuC
 		 * client, need to make sure lrc is pinned prior to enter here.
 		 */
-		if (!ce || !ce->state)
+		if (!ce->state)
 			break;	/* XXX: continue? */
 
 		/*
@@ -403,6 +405,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * Instead, the GuC uses the LRCA of the user mode context (see
 		 * guc_add_request below).
 		 */
+		lrc = &desc->lrc[ce->engine->guc_id];
 		lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
 		/* The state page is after PPHWSP */
@@ -413,15 +416,16 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * here. In proxy submission, it wants the stage id
 		 */
 		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
-				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
+				(ce->engine->guc_id << GUC_ELC_ENGINE_OFFSET);
 
 		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
 		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
 		lrc->ring_next_free_location = lrc->ring_begin;
 		lrc->ring_current_tail_pointer_value = 0;
 
-		desc->engines_used |= (1 << guc_engine_id);
+		desc->engines_used |= BIT(ce->engine->guc_id);
 	}
+	i915_gem_context_engine_list_unlock(ctx);
 
 	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
 			 client->engines, desc->engines_used);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 2a4f7348dcae..0f3018113129 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1091,7 +1091,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	i915_gem_unpark(i915);
 
-	ce = intel_context_instance(ctx, i915->engine[RCS0]);
+	ce = i915_gem_context_get_engine(ctx, RCS0);
 	if (IS_ERR(ce)) {
 		ret = PTR_ERR(ce);
 		goto out_rpm;
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index 0426093bf1d9..89a3c00384cd 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -30,6 +30,7 @@ mock_context(struct drm_i915_private *i915,
 	     const char *name)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
 	int ret;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -40,8 +41,11 @@ mock_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->link);
 	ctx->i915 = i915;
 
-	ctx->hw_contexts = RB_ROOT;
-	spin_lock_init(&ctx->hw_contexts_lock);
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e))
+		goto err_free;
+	ctx->engines = e;
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -51,7 +55,7 @@ mock_context(struct drm_i915_private *i915,
 
 	ret = i915_gem_context_pin_hw_id(ctx);
 	if (ret < 0)
-		goto err_handles;
+		goto err_engines;
 
 	if (name) {
 		struct i915_hw_ppgtt *ppgtt;
@@ -69,7 +73,9 @@ mock_context(struct drm_i915_private *i915,
 
 	return ctx;
 
-err_handles:
+err_engines:
+	free_engines(ctx->engines);
+err_free:
 	kfree(ctx);
 	return NULL;
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 15/22] drm/i915: Move i915_request_alloc into selftests/
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (13 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 14/22] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 16/22] drm/i915: Allow a context to define its set of engines Chris Wilson
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Having transitioned GEM over to using intel_context as its primary
means of tracking the combined GEM context and engine, and to using
i915_request_create(), we can move the older i915_request_alloc()
helper function into selftests/.
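
As a usage sketch, a hypothetical selftest fragment; note the argument
order is (ctx, engine), swapped from the old i915_request_alloc(engine,
ctx):

	/*
	 * Hypothetical selftest fragment: igt_request_alloc() is a
	 * selftest-only convenience that pins the intel_context and
	 * calls i915_request_create() underneath.
	 */
	static int submit_nop(struct i915_gem_context *ctx,
			      struct intel_engine_cs *engine)
	{
		struct i915_request *rq;

		rq = igt_request_alloc(ctx, engine);
		if (IS_ERR(rq))
			return PTR_ERR(rq);

		i915_request_add(rq);
		return 0;
	}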

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                 |  1 +
 drivers/gpu/drm/i915/i915_request.c           | 48 -------------------
 drivers/gpu/drm/i915/i915_request.h           |  3 --
 drivers/gpu/drm/i915/selftests/huge_pages.c   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c | 15 +++---
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |  4 +-
 .../gpu/drm/i915/selftests/igt_gem_utils.c    | 39 +++++++++++++++
 .../gpu/drm/i915/selftests/igt_gem_utils.h    | 17 +++++++
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  3 +-
 .../gpu/drm/i915/selftests/intel_hangcheck.c  |  9 ++--
 drivers/gpu/drm/i915/selftests/intel_lrc.c    |  9 ++--
 .../drm/i915/selftests/intel_workarounds.c    |  8 ++--
 drivers/gpu/drm/i915/selftests/mock_request.c |  3 +-
 15 files changed, 91 insertions(+), 79 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/selftests/igt_gem_utils.c
 create mode 100644 drivers/gpu/drm/i915/selftests/igt_gem_utils.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bd1657c3d395..688558f935be 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -184,6 +184,7 @@ i915-$(CONFIG_DRM_I915_SELFTEST) += \
 	selftests/i915_random.o \
 	selftests/i915_selftest.o \
 	selftests/igt_flush_test.o \
+	selftests/igt_gem_utils.o \
 	selftests/igt_live_test.o \
 	selftests/igt_reset.o \
 	selftests/igt_spinner.o
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index dda856f2a012..4594b946f4c3 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -726,54 +726,6 @@ struct i915_request *i915_request_create(struct intel_context *ce)
 	return ERR_PTR(ret);
 }
 
-/**
- * i915_request_alloc - allocate a request structure
- *
- * @engine: engine that we wish to issue the request on.
- * @ctx: context that the request will be associated with.
- *
- * Returns a pointer to the allocated request if successful,
- * or an error code if not.
- */
-struct i915_request *
-i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
-{
-	struct drm_i915_private *i915 = engine->i915;
-	struct intel_context *ce;
-	struct i915_request *rq;
-	int err;
-
-	/*
-	 * Preempt contexts are reserved for exclusive use to inject a
-	 * preemption context switch. They are never to be used for any trivial
-	 * request!
-	 */
-	GEM_BUG_ON(ctx == i915->preempt_context);
-
-	/*
-	 * Pinning the contexts may generate requests in order to acquire
-	 * GGTT space, so do this first before we reserve a seqno for
-	 * ourselves.
-	 */
-	ce = i915_gem_context_get_engine(ctx, engine->id);
-	if (IS_ERR(ce))
-		return ERR_CAST(ce);
-
-	err = intel_context_pin(ce);
-	intel_context_put(ce);
-	if (err)
-		return ERR_PTR(err);
-
-	i915_gem_unpark(i915);
-
-	rq = i915_request_create(ce);
-
-	i915_gem_park(i915);
-	intel_context_unpin(ce);
-
-	return rq;
-}
-
 static int
 emit_semaphore_wait(struct i915_request *to,
 		    struct i915_request *from,
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 37f84ad067da..7390e1f9a8cb 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -231,9 +231,6 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence)
 struct i915_request * __must_check
 i915_request_create(struct intel_context *ce);
 
-struct i915_request * __must_check
-i915_request_alloc(struct intel_engine_cs *engine,
-		   struct i915_gem_context *ctx);
 void i915_request_retire_upto(struct i915_request *rq);
 
 static inline struct i915_request *
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 1597a6e1f364..752b552f5c50 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -26,6 +26,7 @@
 
 #include <linux/prime_numbers.h>
 
+#include "igt_gem_utils.h"
 #include "mock_drm.h"
 #include "i915_random.h"
 
@@ -980,7 +981,7 @@ static int gpu_write(struct i915_vma *vma,
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_batch;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index 7d79f1fe6bbd..72597aae3314 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -8,8 +8,9 @@
 
 #include "../i915_selftest.h"
 
-#include "mock_context.h"
+#include "igt_gem_utils.h"
 #include "igt_flush_test.h"
+#include "mock_context.h"
 
 static int switch_to_context(struct drm_i915_private *i915,
 			     struct i915_gem_context *ctx)
@@ -23,7 +24,7 @@ static int switch_to_context(struct drm_i915_private *i915,
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = igt_request_alloc(ctx, engine);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			break;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 0f3018113129..e01d2a2855a0 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -29,6 +29,7 @@
 #include "../i915_selftest.h"
 #include "i915_random.h"
 #include "igt_flush_test.h"
+#include "igt_gem_utils.h"
 #include "igt_live_test.h"
 #include "igt_reset.h"
 #include "igt_spinner.h"
@@ -90,7 +91,7 @@ static int live_nop_switch(void *arg)
 
 		times[0] = ktime_get_raw();
 		for (n = 0; n < nctx; n++) {
-			rq = i915_request_alloc(engine, ctx[n]);
+			rq = igt_request_alloc(ctx[n], engine);
 			if (IS_ERR(rq)) {
 				err = PTR_ERR(rq);
 				goto out_unlock;
@@ -120,7 +121,7 @@ static int live_nop_switch(void *arg)
 			times[1] = ktime_get_raw();
 
 			for (n = 0; n < prime; n++) {
-				rq = i915_request_alloc(engine, ctx[n % nctx]);
+				rq = igt_request_alloc(ctx[n % nctx], engine);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					goto out_unlock;
@@ -300,7 +301,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 		goto err_vma;
 	}
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_batch;
@@ -1339,7 +1340,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
 	if (err)
 		goto err_unpin;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -1434,7 +1435,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 	if (err)
 		goto err_unpin;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -1619,7 +1620,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 			for_each_engine_masked(engine, i915, engines, tmp) {
 				struct i915_request *rq;
 
-				rq = i915_request_alloc(engine, ctx);
+				rq = igt_request_alloc(ctx, engine);
 				if (IS_ERR(rq))
 					return PTR_ERR(rq);
 
@@ -1764,7 +1765,7 @@ static int mock_context_barrier(void *arg)
 		goto out;
 	}
 
-	rq = i915_request_alloc(i915->engine[RCS0], ctx);
+	rq = igt_request_alloc(ctx, i915->engine[RCS0]);
 	if (IS_ERR(rq)) {
 		pr_err("Request allocation failed!\n");
 		goto out;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index c0cf26507915..e4db66e0e8d6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -25,6 +25,7 @@
 #include "../i915_selftest.h"
 #include "../i915_gem_pm.h"
 
+#include "igt_gem_utils.h"
 #include "lib_sw_fence.h"
 #include "mock_context.h"
 #include "mock_drm.h"
@@ -460,7 +461,7 @@ static int igt_evict_contexts(void *arg)
 
 			/* We will need some GGTT space for the rq's context */
 			igt_evict_ctl.fail_if_busy = true;
-			rq = i915_request_alloc(engine, ctx);
+			rq = igt_request_alloc(ctx, engine);
 			igt_evict_ctl.fail_if_busy = false;
 
 			if (IS_ERR(rq)) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 03bea3caaafe..6ef04fa775d5 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -267,7 +267,7 @@ static struct i915_request *
 __live_request_alloc(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine)
 {
-	return i915_request_alloc(engine, ctx);
+	return igt_request_alloc(ctx, engine);
 }
 
 static int __igt_breadcrumbs_smoketest(void *arg)
@@ -1070,7 +1070,7 @@ max_batches(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 	if (HAS_EXECLISTS(ctx->i915))
 		return INT_MAX;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 	} else {
diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
new file mode 100644
index 000000000000..f9899adfe12a
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
@@ -0,0 +1,39 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#include "igt_gem_utils.h"
+
+#include "../i915_gem_context.h"
+#include "../i915_gem_pm.h"
+#include "../intel_context.h"
+
+struct i915_request *
+igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
+{
+	struct intel_context *ce;
+	struct i915_request *rq;
+	int err;
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	ce = i915_gem_context_get_engine(ctx, engine->id);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
+
+	err = intel_context_pin(ce);
+	intel_context_put(ce);
+	if (err)
+		return ERR_PTR(err);
+
+	rq = i915_request_create(ce);
+	intel_context_unpin(ce);
+
+	return rq;
+}
+
diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/selftests/igt_gem_utils.h
new file mode 100644
index 000000000000..0f17251cf75d
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.h
@@ -0,0 +1,17 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#ifndef __IGT_GEM_UTILS_H__
+#define __IGT_GEM_UTILS_H__
+
+struct i915_request;
+struct i915_gem_context;
+struct intel_engine_cs;
+
+struct i915_request *
+igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
+
+#endif /* __IGT_GEM_UTILS_H__ */
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 16890dfe74c0..ece8a8a0d3b0 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -4,6 +4,7 @@
  * Copyright © 2018 Intel Corporation
  */
 
+#include "igt_gem_utils.h"
 #include "igt_spinner.h"
 
 int igt_spinner_init(struct igt_spinner *spin, struct drm_i915_private *i915)
@@ -114,7 +115,7 @@ igt_spinner_create_request(struct igt_spinner *spin,
 	if (err)
 		goto unpin_vma;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto unpin_hws;
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index f6f417386b9f..3c843edc2fe4 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -27,6 +27,7 @@
 #include "../i915_gem_pm.h"
 #include "../i915_selftest.h"
 #include "i915_random.h"
+#include "igt_gem_utils.h"
 #include "igt_flush_test.h"
 #include "igt_reset.h"
 #include "igt_wedge_me.h"
@@ -175,7 +176,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 	if (err)
 		goto unpin_vma;
 
-	rq = i915_request_alloc(engine, h->ctx);
+	rq = igt_request_alloc(h->ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto unpin_hws;
@@ -455,7 +456,7 @@ static int igt_reset_nop(void *arg)
 			for (i = 0; i < 16; i++) {
 				struct i915_request *rq;
 
-				rq = i915_request_alloc(engine, ctx);
+				rq = igt_request_alloc(ctx, engine);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					break;
@@ -566,7 +567,7 @@ static int igt_reset_nop_engine(void *arg)
 			for (i = 0; i < 16; i++) {
 				struct i915_request *rq;
 
-				rq = i915_request_alloc(engine, ctx);
+				rq = igt_request_alloc(ctx, engine);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					break;
@@ -839,7 +840,7 @@ static int active_engine(void *data)
 		struct i915_request *new;
 
 		mutex_lock(&engine->i915->drm.struct_mutex);
-		new = i915_request_alloc(engine, ctx[idx]);
+		new = igt_request_alloc(ctx[idx], engine);
 		if (IS_ERR(new)) {
 			mutex_unlock(&engine->i915->drm.struct_mutex);
 			err = PTR_ERR(new);
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 45370922d965..2deefc5845e4 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -11,6 +11,7 @@
 
 #include "../i915_selftest.h"
 #include "igt_flush_test.h"
+#include "igt_gem_utils.h"
 #include "igt_live_test.h"
 #include "igt_spinner.h"
 #include "i915_random.h"
@@ -675,13 +676,13 @@ static int live_chain_preempt(void *arg)
 			i915_request_add(rq);
 
 			for (i = 0; i < count; i++) {
-				rq = i915_request_alloc(engine, lo.ctx);
+				rq = igt_request_alloc(lo.ctx, engine);
 				if (IS_ERR(rq))
 					goto err_wedged;
 				i915_request_add(rq);
 			}
 
-			rq = i915_request_alloc(engine, hi.ctx);
+			rq = igt_request_alloc(hi.ctx, engine);
 			if (IS_ERR(rq))
 				goto err_wedged;
 			i915_request_add(rq);
@@ -700,7 +701,7 @@ static int live_chain_preempt(void *arg)
 			}
 			igt_spinner_end(&lo.spin);
 
-			rq = i915_request_alloc(engine, lo.ctx);
+			rq = igt_request_alloc(lo.ctx, engine);
 			if (IS_ERR(rq))
 				goto err_wedged;
 			i915_request_add(rq);
@@ -904,7 +905,7 @@ static int smoke_submit(struct preempt_smoke *smoke,
 
 	ctx->sched.priority = prio;
 
-	rq = i915_request_alloc(smoke->engine, ctx);
+	rq = igt_request_alloc(ctx, smoke->engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto unpin;
diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
index 0e42e1a0b46c..e1cac43b7362 100644
--- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
@@ -9,6 +9,7 @@
 #include "../i915_reset.h"
 
 #include "igt_flush_test.h"
+#include "igt_gem_utils.h"
 #include "igt_reset.h"
 #include "igt_spinner.h"
 #include "igt_wedge_me.h"
@@ -72,7 +73,6 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 {
 	const u32 base = engine->mmio_base;
 	struct drm_i915_gem_object *result;
-	intel_wakeref_t wakeref;
 	struct i915_request *rq;
 	struct i915_vma *vma;
 	u32 srm, *cs;
@@ -104,9 +104,7 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 	if (err)
 		goto err_obj;
 
-	rq = ERR_PTR(-ENODEV);
-	with_intel_runtime_pm(engine->i915, wakeref)
-		rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_pin;
@@ -557,7 +555,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 		i915_gem_object_unpin_map(batch->obj);
 		i915_gem_chipset_flush(ctx->i915);
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = igt_request_alloc(ctx, engine);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			goto out_batch;
diff --git a/drivers/gpu/drm/i915/selftests/mock_request.c b/drivers/gpu/drm/i915/selftests/mock_request.c
index d1a7c9608712..98516d58b2d8 100644
--- a/drivers/gpu/drm/i915/selftests/mock_request.c
+++ b/drivers/gpu/drm/i915/selftests/mock_request.c
@@ -22,6 +22,7 @@
  *
  */
 
+#include "igt_gem_utils.h"
 #include "mock_engine.h"
 #include "mock_request.h"
 
@@ -33,7 +34,7 @@ mock_request(struct intel_engine_cs *engine,
 	struct i915_request *request;
 
 	/* NB the i915->requests slab cache is enlarged to fit mock_request */
-	request = i915_request_alloc(engine, context);
+	request = igt_request_alloc(context, engine);
 	if (IS_ERR(request))
 		return NULL;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 16/22] drm/i915: Allow a context to define its set of engines
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (14 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 15/22] drm/i915: Move i915_request_alloc into selftests/ Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 17/22] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines, which may be sparse and
even heterogeneous within a class (not all video decoders are created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add the challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, with the order
fully controlled by the user via (class, instance). With such precise
control in context setup, we can continue to use the existing execbuf
uABI of specifying a single index; only now it doesn't automagically map
onto the engines, it uses the user-defined engine map from the context.

The I915_EXEC_DEFAULT slot is left empty, and is invalid for use by
execbuf. Its use will be revealed in the next patch.
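
For example, a minimal userspace sketch of binding a context to a
two-slot map, [rcs0, bcs0], using the uAPI added below (fd/ctx and the
helper name are placeholders, error handling elided):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>

    /* Sketch: install a user engine map of [rcs0, bcs0] on ctx */
    static int set_engine_map(int fd, uint32_t ctx)
    {
        I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 2) = {
            .class_instance = {
                { I915_ENGINE_CLASS_RENDER, 0 }, /* slot 0: rcs0 */
                { I915_ENGINE_CLASS_COPY, 0 },   /* slot 1: bcs0 */
            },
        };
        struct drm_i915_gem_context_param p = {
            .ctx_id = ctx,
            .param = I915_CONTEXT_PARAM_ENGINES,
            .size = sizeof(engines),
            .value = (uintptr_t)&engines,
        };

        return ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
    }

Execbuf then treats the low flags bits as an index into this map, so an
index of 1 here selects bcs0.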

v2: Fixup freeing of local on success of get_engines()
v3: Allow empty engines[]
v4: s/nengine/num_engines/

Testcase: igt/gem_ctx_engines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v4
---
 drivers/gpu/drm/i915/i915_gem_context.c       | 210 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h       |  18 ++
 drivers/gpu/drm/i915/i915_gem_context_types.h |   8 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    |   5 +-
 drivers/gpu/drm/i915/i915_utils.h             |  36 +++
 include/uapi/drm/i915_drm.h                   |  39 ++++
 6 files changed, 310 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4372e244b005..9d539e6de697 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,7 +86,9 @@
  */
 
 #include <linux/log2.h>
+
 #include <drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_gem_pm.h"
 #include "i915_globals.h"
@@ -139,13 +141,17 @@ static void lut_close(struct i915_gem_context *ctx)
 static struct intel_context *
 lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
 {
-	struct intel_engine_cs *engine;
+	if (!i915_gem_context_user_engines(ctx)) {
+		struct intel_engine_cs *engine;
 
-	engine = intel_engine_lookup_user(ctx->i915, class, instance);
-	if (!engine)
-		return ERR_PTR(-EINVAL);
+		engine = intel_engine_lookup_user(ctx->i915, class, instance);
+		if (!engine)
+			return ERR_PTR(-EINVAL);
+
+		instance = engine->id;
+	}
 
-	return i915_gem_context_get_engine(ctx, engine->id);
+	return i915_gem_context_get_engine(ctx, instance);
 }
 
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
@@ -256,6 +262,11 @@ static void free_engines_n(struct i915_gem_engines *e, unsigned long n)
 	free_engines(e);
 }
 
+static void free_engines_rcu(struct rcu_head *rcu)
+{
+	free_engines(container_of(rcu, struct i915_gem_engines, rcu));
+}
+
 static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 {
 	struct intel_engine_cs *engine;
@@ -1478,6 +1489,187 @@ static int set_sseu(struct i915_gem_context *ctx,
 	return ret;
 }
 
+struct set_engines {
+	struct i915_gem_context *ctx;
+	struct i915_gem_engines *engines;
+};
+
+static const i915_user_extension_fn set_engines__extensions[] = {
+};
+
+static int
+set_engines(struct i915_gem_context *ctx,
+	    const struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines __user *user =
+		u64_to_user_ptr(args->value);
+	struct set_engines set = { .ctx = ctx };
+	unsigned long num_engines, n;
+	u64 extensions;
+	int err;
+
+	if (!args->size) { /* switch back to legacy user_ring_map */
+		if (!i915_gem_context_user_engines(ctx))
+			return 0;
+
+		set.engines = default_engines(ctx);
+		if (IS_ERR(set.engines))
+			return PTR_ERR(set.engines);
+
+		goto replace;
+	}
+
+	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->class_instance)));
+	if (args->size < sizeof(*user) ||
+	    !IS_ALIGNED(args->size, sizeof(*user->class_instance)))
+		return -EINVAL;
+
+	/* Internal limitation of u64 bitmaps + a few bits of u64 in the uABI */
+	num_engines =
+		(args->size - sizeof(*user)) / sizeof(*user->class_instance);
+	if (num_engines > I915_EXEC_RING_MASK + 1)
+		return -EINVAL;
+
+	set.engines = kmalloc(struct_size(set.engines, engines, num_engines),
+			      GFP_KERNEL);
+	if (!set.engines)
+		return -ENOMEM;
+
+	init_rcu_head(&set.engines->rcu);
+	for (n = 0; n < num_engines; n++) {
+		struct intel_engine_cs *engine;
+		u16 class, inst;
+
+		if (get_user(class, &user->class_instance[n].engine_class) ||
+		    get_user(inst, &user->class_instance[n].engine_instance)) {
+			free_engines_n(set.engines, n);
+			return -EFAULT;
+		}
+
+		if (class == (u16)I915_ENGINE_CLASS_INVALID &&
+		    inst == (u16)I915_ENGINE_CLASS_INVALID_NONE) {
+			set.engines->engines[n] = NULL;
+			continue;
+		}
+
+		engine = intel_engine_lookup_user(ctx->i915, class, inst);
+		if (!engine) {
+			free_engines_n(set.engines, n);
+			return -ENOENT;
+		}
+
+		set.engines->engines[n] = intel_context_create(ctx, engine);
+		if (!set.engines->engines[n]) {
+			free_engines_n(set.engines, n);
+			return -ENOMEM;
+		}
+	}
+	set.engines->num_engines = num_engines;
+
+	err = -EFAULT;
+	if (!get_user(extensions, &user->extensions))
+		err = i915_user_extensions(u64_to_user_ptr(extensions),
+					   set_engines__extensions,
+					   ARRAY_SIZE(set_engines__extensions),
+					   &set);
+	if (err) {
+		free_engines(set.engines);
+		return err;
+	}
+
+replace:
+	mutex_lock(&ctx->i915->drm.struct_mutex);
+	if (args->size)
+		i915_gem_context_set_user_engines(ctx);
+	else
+		i915_gem_context_clear_user_engines(ctx);
+	set.engines = xchg(&ctx->engines, set.engines);
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+	call_rcu(&set.engines->rcu, free_engines_rcu);
+
+	return 0;
+}
+
+static int
+get_engines(struct i915_gem_context *ctx,
+	    struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines __user *user;
+	struct i915_gem_engines *e;
+	size_t n, count, size;
+	int err = 0;
+
+	err = mutex_lock_interruptible(&ctx->engines_mutex);
+	if (err)
+		return err;
+
+	if (!i915_gem_context_user_engines(ctx)) {
+		args->size = 0;
+		goto unlock;
+	}
+
+	e = i915_gem_context_engine_list(ctx);
+	count = e->num_engines;
+
+	/* Be paranoid in case we have an impedance mismatch */
+	if (!check_struct_size(user, class_instance, count, &size)) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+	if (unlikely(overflows_type(size, args->size))) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	if (!args->size) {
+		args->size = size;
+		goto unlock;
+	}
+
+	if (args->size < size) {
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	user = u64_to_user_ptr(args->value);
+	if (!access_ok(user, size)) {
+		err = -EFAULT;
+		goto unlock;
+	}
+
+	if (put_user(0, &user->extensions)) {
+		err = -EFAULT;
+		goto unlock;
+	}
+
+	for (n = 0; n < count; n++) {
+		struct {
+			u16 class;
+			u16 instance;
+		} ci = {
+			.class = I915_ENGINE_CLASS_INVALID,
+			.instance = I915_ENGINE_CLASS_INVALID_NONE,
+		};
+
+		if (e->engines[n]) {
+			ci.class = e->engines[n]->engine->uabi_class;
+			ci.instance = e->engines[n]->engine->instance;
+		}
+
+		if (copy_to_user(&user->class_instance[n], &ci, sizeof(ci))) {
+			err = -EFAULT;
+			goto unlock;
+		}
+	}
+
+	args->size = size;
+
+unlock:
+	mutex_unlock(&ctx->engines_mutex);
+	return err;
+}
+
 static int ctx_setparam(struct i915_gem_context *ctx,
 			struct drm_i915_gem_context_param *args)
 {
@@ -1550,6 +1742,10 @@ static int ctx_setparam(struct i915_gem_context *ctx,
 		ret = set_ppgtt(ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = set_engines(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
@@ -1780,6 +1976,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		ret = get_ppgtt(ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = get_engines(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 6a336a8a2324..f8d49d402371 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -111,6 +111,24 @@ static inline void i915_gem_context_set_force_single_submission(struct i915_gem_
 	__set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ctx->flags);
 }
 
+static inline bool
+i915_gem_context_user_engines(const struct i915_gem_context *ctx)
+{
+	return test_bit(CONTEXT_USER_ENGINES, &ctx->flags);
+}
+
+static inline void
+i915_gem_context_set_user_engines(struct i915_gem_context *ctx)
+{
+	set_bit(CONTEXT_USER_ENGINES, &ctx->flags);
+}
+
+static inline void
+i915_gem_context_clear_user_engines(struct i915_gem_context *ctx)
+{
+	clear_bit(CONTEXT_USER_ENGINES, &ctx->flags);
+}
+
 int __i915_gem_context_pin_hw_id(struct i915_gem_context *ctx);
 static inline int i915_gem_context_pin_hw_id(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index 9a4104505fa6..11c4c5172d5d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -130,6 +130,14 @@ struct i915_gem_context {
 #define CONTEXT_BANNED			0
 #define CONTEXT_CLOSED			1
 #define CONTEXT_FORCE_SINGLE_SUBMISSION	2
+#define CONTEXT_USER_ENGINES		3
+
+	/**
+	 * @num_engines: Number of user defined engines for this context
+	 *
+	 * See @engines for the elements.
+	 */
+	unsigned int num_engines;
 
 	/**
 	 * @hw_id: - unique identifier for the context
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 45c086451397..c2b95cc0f41d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2160,7 +2160,10 @@ eb_select_engine(struct i915_execbuffer *eb,
 	unsigned int idx;
 	int err;
 
-	idx = eb_select_legacy_ring(eb, file, args);
+	if (i915_gem_context_user_engines(eb->gem_context))
+		idx = args->flags & I915_EXEC_RING_MASK;
+	else
+		idx = eb_select_legacy_ring(eb, file, args);
 
 	ce = i915_gem_context_get_engine(eb->gem_context, idx);
 	if (IS_ERR(ce))
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 2dbe8933b50a..1436fe2fb5f8 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -25,6 +25,9 @@
 #ifndef __I915_UTILS_H
 #define __I915_UTILS_H
 
+#include <linux/kernel.h>
+#include <linux/overflow.h>
+
 #undef WARN_ON
 /* Many gcc seem to no see through this and fall over :( */
 #if 0
@@ -73,6 +76,39 @@
 #define overflows_type(x, T) \
 	(sizeof(x) > sizeof(T) && (x) >> BITS_PER_TYPE(T))
 
+static inline bool
+__check_struct_size(size_t base, size_t arr, size_t count, size_t *size)
+{
+	size_t sz;
+
+	if (check_mul_overflow(count, arr, &sz))
+		return false;
+
+	if (check_add_overflow(sz, base, &sz))
+		return false;
+
+	*size = sz;
+	return true;
+}
+
+/**
+ * check_struct_size() - Calculate size of structure with trailing array.
+ * @p: Pointer to the structure.
+ * @member: Name of the array member.
+ * @n: Number of elements in the array.
+ * @sz: Total size of structure and array
+ *
+ * Calculates size of memory needed for structure @p followed by an
+ * array of @n @member elements, like struct_size() but reports
+ * whether it overflowed, and the resultant size in @sz
+ *
+ * Return: false if the calculation overflowed.
+ */
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))
+
 #define ptr_mask_bits(ptr, n) ({					\
 	unsigned long __v = (unsigned long)(ptr);			\
 	(typeof(ptr))(__v & -BIT(n));					\
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9999f7d6a5a9..c963cd5232c8 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -126,6 +126,8 @@ enum drm_i915_gem_engine_class {
 	I915_ENGINE_CLASS_INVALID	= -1
 };
 
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+
 /**
  * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
  *
@@ -1511,6 +1513,26 @@ struct drm_i915_gem_context_param {
 	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
 	 */
 #define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a zero
+ * sized argument, will revert back to default settings.
+ *
+ * See struct i915_context_param_engines.
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1575,6 +1597,23 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+
+	struct {
+		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
+		__u16 engine_instance;
+	} class_instance[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct { \
+		__u16 engine_class; \
+		__u16 engine_instance; \
+	} class_instance[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 17/22] drm/i915: Allow userspace to clone contexts on creation
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (15 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 16/22] drm/i915: Allow a context to define its set of engines Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 18/22] drm/i915: Load balancing across a virtual engine Chris Wilson
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

A use case arose out of handling context recovery in mesa, whereby they
wish to recreate a context with fresh logical state but preserve all
other details of the original. Currently, they create a new context and
iterate over which bits they want to copy across, but it would be much
more convenient if they were able to just pass in a target context to
clone during creation. This essentially extends the setparam during
creation to pull the details from a target context instead of the
user-supplied parameters.
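
A hedged sketch of the resulting flow (fd/src_ctx are placeholders;
assumes the CREATE_EXT ioctl and extension chaining from earlier in the
series, in the same setting as the sketch above):

    struct drm_i915_gem_context_create_ext_clone clone = {
        .base = { .name = I915_CONTEXT_CREATE_EXT_CLONE },
        .clone_id = src_ctx, /* the context to copy details from */
        .flags = I915_CONTEXT_CLONE_ENGINES |
                 I915_CONTEXT_CLONE_VM |
                 I915_CONTEXT_CLONE_SCHEDATTR,
    };
    struct drm_i915_gem_context_create_ext create = {
        .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
        .extensions = (uintptr_t)&clone,
    };

    ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
    /* create.ctx_id: fresh logical state, cloned engines/vm/sched */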

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 206 ++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             |  15 ++
 2 files changed, 221 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9d539e6de697..9d27e79e1f52 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1774,8 +1774,214 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->ctx, &local.param);
 }
 
+static int clone_engines(struct i915_gem_context *dst,
+			 struct i915_gem_context *src)
+{
+	struct i915_gem_engines *e, *clone;
+	bool user_engines;
+	unsigned long n;
+
+	e = i915_gem_context_engine_list_lock(src);
+
+	clone = kmalloc(struct_size(e, engines, e->num_engines), GFP_KERNEL);
+	if (!clone)
+		goto err_unlock;
+
+	init_rcu_head(&clone->rcu);
+	for (n = 0; n < e->num_engines; n++) {
+		if (!e->engines[n]) {
+			clone->engines[n] = NULL;
+			continue;
+		}
+
+		clone->engines[n] =
+			intel_context_create(dst, e->engines[n]->engine);
+		if (!clone->engines[n]) {
+			free_engines_n(clone, n);
+			goto err_unlock;
+		}
+	}
+	clone->num_engines = n;
+
+	user_engines = i915_gem_context_user_engines(src);
+	i915_gem_context_engine_list_unlock(src);
+
+	free_engines(dst->engines);
+	RCU_INIT_POINTER(dst->engines, clone);
+	if (user_engines)
+		i915_gem_context_set_user_engines(dst);
+	else
+		i915_gem_context_clear_user_engines(dst);
+	return 0;
+
+err_unlock:
+	i915_gem_context_engine_list_unlock(src);
+	return -ENOMEM;
+}
+
+static int clone_flags(struct i915_gem_context *dst,
+		       struct i915_gem_context *src)
+{
+	dst->user_flags = src->user_flags;
+	return 0;
+}
+
+static int clone_schedattr(struct i915_gem_context *dst,
+			   struct i915_gem_context *src)
+{
+	dst->sched = src->sched;
+	return 0;
+}
+
+static int clone_sseu(struct i915_gem_context *dst,
+		      struct i915_gem_context *src)
+{
+	struct i915_gem_engines *e, *clone;
+	unsigned long n;
+	int err;
+
+	clone = dst->engines; /* no locking required; sole access */
+	e = i915_gem_context_engine_list_lock(src);
+	if (e->num_engines != clone->num_engines) {
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	for (n = 0; n < e->num_engines; n++) {
+		struct intel_context *ce = e->engines[n];
+
+		if (clone->engines[n]->engine->class != ce->engine->class) {
+			/* Must have compatible engine maps! */
+			err = -EINVAL;
+			goto unlock;
+		}
+
+		err = intel_context_pin_lock(ce); /* serialises with set_sseu */
+		if (err)
+			goto unlock;
+
+		clone->engines[n]->sseu = ce->sseu;
+		intel_context_pin_unlock(ce);
+	}
+
+	err = 0;
+unlock:
+	i915_gem_context_engine_list_unlock(src);
+	return err;
+}
+
+static int clone_timeline(struct i915_gem_context *dst,
+			  struct i915_gem_context *src)
+{
+	if (src->timeline) {
+		GEM_BUG_ON(src->timeline == dst->timeline);
+
+		if (dst->timeline)
+			i915_timeline_put(dst->timeline);
+		dst->timeline = i915_timeline_get(src->timeline);
+	}
+
+	return 0;
+}
+
+static int clone_vm(struct i915_gem_context *dst,
+		    struct i915_gem_context *src)
+{
+	struct i915_hw_ppgtt *ppgtt;
+
+	rcu_read_lock();
+	do {
+		ppgtt = READ_ONCE(src->ppgtt);
+		if (!ppgtt)
+			break;
+
+		if (!kref_get_unless_zero(&ppgtt->ref))
+			continue;
+
+		/*
+		 * This ppgtt may have been reallocated between
+		 * the read and the kref, and reassigned to a third
+		 * context. In order to avoid inadvertent sharing
+		 * of this ppgtt with that third context (and not
+		 * src), we have to confirm that we have the same
+		 * ppgtt after passing through the strong memory
+		 * barrier implied by a successful
+		 * kref_get_unless_zero().
+		 *
+		 * Once we have acquired the current ppgtt of src,
+		 * we no longer care if it is released from src, as
+		 * it cannot be reallocated elsewhere.
+		 */
+
+		if (ppgtt == READ_ONCE(src->ppgtt))
+			break;
+
+		i915_ppgtt_put(ppgtt);
+	} while (1);
+	rcu_read_unlock();
+
+	if (ppgtt) {
+		__assign_ppgtt(dst, ppgtt);
+		i915_ppgtt_put(ppgtt);
+	}
+
+	return 0;
+}
+
+static int create_clone(struct i915_user_extension __user *ext, void *data)
+{
+	static int (* const fn[])(struct i915_gem_context *dst,
+				  struct i915_gem_context *src) = {
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
+		MAP(ENGINES, clone_engines),
+		MAP(FLAGS, clone_flags),
+		MAP(SCHEDATTR, clone_schedattr),
+		MAP(SSEU, clone_sseu),
+		MAP(TIMELINE, clone_timeline),
+		MAP(VM, clone_vm),
+#undef MAP
+	};
+	struct drm_i915_gem_context_create_ext_clone local;
+	const struct create_ext *arg = data;
+	struct i915_gem_context *dst = arg->ctx;
+	struct i915_gem_context *src;
+	int err, bit;
+
+	if (copy_from_user(&local, ext, sizeof(local)))
+		return -EFAULT;
+
+	BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
+		     I915_CONTEXT_CLONE_UNKNOWN);
+
+	if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
+		return -EINVAL;
+
+	if (local.rsvd)
+		return -EINVAL;
+
+	rcu_read_lock();
+	src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
+	rcu_read_unlock();
+	if (!src)
+		return -ENOENT;
+
+	GEM_BUG_ON(src == dst);
+
+	for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
+		if (!(local.flags & BIT(bit)))
+			continue;
+
+		err = fn[bit](dst, src);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
+	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
 };
 
 static bool client_is_banned(struct drm_i915_file_private *file_priv)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index c963cd5232c8..303d3dd980e4 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1620,6 +1620,21 @@ struct drm_i915_gem_context_create_ext_setparam {
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
+#define I915_CONTEXT_CLONE_VM		(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 18/22] drm/i915: Load balancing across a virtual engine
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (16 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 17/22] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 19/22] drm/i915: Extend execution fence to support a callback Chris Wilson
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Having allowed the user to define the set of engines that a context may
use, we go one step further and allow them to bind those engines into a
single virtual instance. Submitting a batch to the virtual engine will
then forward it to any one of the set in whatever manner best
distributes load. The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to
concurrently run batches across multiple engines by itself; that is left
to the user, who may submit multiple concurrent batches to multiple
queues. Multiple users will be load balanced across the system.
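
For illustration, a hedged userspace sketch of building one virtual
engine over two video decoders (struct i915_context_engines_load_balance
lives in the uapi chunk of this patch, not quoted in full here; field
names are taken from set_engines__load_balance() below, and designated
initializers avoid assuming field order; fd/ctx are placeholders):

    struct i915_context_engines_load_balance balance = {
        .base = { .name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE },
        .engine_index = 0, /* fill the INVALID_NONE gap at slot 0 */
        .engines_mask = (1ull << 1) | (1ull << 2), /* slots 1 and 2 */
    };
    I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 3) = {
        .extensions = (uintptr_t)&balance,
        .class_instance = {
            { I915_ENGINE_CLASS_INVALID,
              I915_ENGINE_CLASS_INVALID_NONE }, /* slot 0: virtual */
            { I915_ENGINE_CLASS_VIDEO, 0 },     /* slot 1: vcs0 */
            { I915_ENGINE_CLASS_VIDEO, 1 },     /* slot 2: vcs1 */
        },
    };
    struct drm_i915_gem_context_param p = {
        .ctx_id = ctx,
        .param = I915_CONTEXT_PARAM_ENGINES,
        .size = sizeof(engines),
        .value = (uintptr_t)&engines,
    };

    ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);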

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so, wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.
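
Continuing the hedged sketch above, the execbuf ring selector picks the
slot, so the same queue can feed either the virtual engine or a specific
sibling (exec_objects/nobj are placeholders):

    struct drm_i915_gem_execbuffer2 eb = {
        .buffers_ptr = (uintptr_t)exec_objects,
        .buffer_count = nobj,
        .rsvd1 = ctx, /* the context carrying the engine map */
        .flags = 0,   /* slot 0: the virtual engine */
    };
    ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &eb);

    eb.flags = 1;     /* slot 1: pin this batch to vcs0 */
    ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &eb);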

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
This is mostly broken up by applying FQ_CODEL rules for prioritising new
clients, and hopefully the virtual and real engines are not then
congested (i.e. all work is via virtual engines, or all work is to the
real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.h            |   5 +
 drivers/gpu/drm/i915/i915_gem_context.c    |  94 +++-
 drivers/gpu/drm/i915/i915_scheduler.c      |  18 +-
 drivers/gpu/drm/i915/i915_timeline_types.h |   1 +
 drivers/gpu/drm/i915/intel_engine_types.h  |   8 +
 drivers/gpu/drm/i915/intel_lrc.c           | 592 ++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h           |   9 +
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 182 +++++++
 include/uapi/drm/i915_drm.h                |  30 ++
 9 files changed, 923 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index bd13198a9058..79937b682c93 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -93,4 +93,9 @@ static inline bool __tasklet_enable(struct tasklet_struct *t)
 	return atomic_dec_and_test(&t->count);
 }
 
+static inline bool __tasklet_is_scheduled(struct tasklet_struct *t)
+{
+	return test_bit(TASKLET_STATE_SCHED, &t->state);
+}
+
 #endif /* __I915_GEM_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9d27e79e1f52..35b2c24a7e30 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,6 +86,7 @@
  */
 
 #include <linux/log2.h>
+#include <linux/nospec.h>
 
 #include <drm/i915_drm.h>
 
@@ -95,6 +96,7 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 #include "intel_lrc_reg.h"
+#include "intel_lrc.h"
 #include "intel_workarounds.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
@@ -1306,7 +1308,6 @@ __intel_context_reconfigure_sseu(struct intel_context *ce,
 	int ret;
 
 	GEM_BUG_ON(INTEL_GEN(ce->gem_context->i915) < 8);
-	GEM_BUG_ON(ce->engine->id != RCS0);
 
 	ret = intel_context_pin_lock(ce);
 	if (ret)
@@ -1494,7 +1495,76 @@ struct set_engines {
 	struct i915_gem_engines *engines;
 };
 
+static int
+set_engines__load_balance(struct i915_user_extension __user *base, void *data)
+{
+	struct i915_context_engines_load_balance __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_engines *set = data;
+	struct intel_engine_cs *stack[64];
+	struct intel_context *ce;
+	unsigned int n, bit;
+	u64 mask;
+	u16 idx;
+	int err;
+
+	if (!HAS_EXECLISTS(set->ctx->i915))
+		return -ENODEV;
+
+	if (USES_GUC_SUBMISSION(set->ctx->i915))
+		return -ENODEV; /* not implemented yet */
+
+	if (get_user(idx, &ext->engine_index))
+		return -EFAULT;
+
+	if (idx >= set->engines->num_engines)
+		return -EINVAL;
+
+	idx = array_index_nospec(idx, set->engines->num_engines);
+	if (set->engines->engines[idx])
+		return -EEXIST;
+
+	err = check_user_mbz(&ext->mbz16);
+	if (err)
+		return err;
+
+	err = check_user_mbz(&ext->flags);
+	if (err)
+		return err;
+
+	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+		err = check_user_mbz(&ext->mbz64[n]);
+		if (err)
+			return err;
+	}
+
+	if (get_user(mask, &ext->engines_mask))
+		return -EFAULT;
+
+	mask &= GENMASK_ULL(set->engines->num_engines - 1, 0) & ~BIT_ULL(idx);
+	if (!mask)
+		return -EINVAL;
+
+	n = 0;
+	for_each_set_bit(bit,
+			 (unsigned long *)&mask,
+			 set->engines->num_engines)
+		stack[n++] = set->engines->engines[bit]->engine;
+
+	ce = intel_execlists_create_virtual(set->ctx, stack, n);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
+		intel_context_put(ce);
+		return -EEXIST;
+	}
+
+	return 0;
+}
+
 static const i915_user_extension_fn set_engines__extensions[] = {
+	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
 };
 
 static int
@@ -1789,17 +1859,33 @@ static int clone_engines(struct i915_gem_context *dst,
 
 	init_rcu_head(&clone->rcu);
 	for (n = 0; n < e->num_engines; n++) {
+		struct intel_engine_cs *engine;
+
 		if (!e->engines[n]) {
 			clone->engines[n] = NULL;
 			continue;
 		}
+		engine = e->engines[n]->engine;
 
-		clone->engines[n] =
-			intel_context_create(dst, e->engines[n]->engine);
-		if (!clone->engines[n]) {
+		/*
+		 * Virtual engines are singletons; they can only exist
+		 * inside a single context, because they embed their
+		 * HW context... As each virtual context implies a single
+		 * timeline (each engine can only dequeue a single request
+		 * at any time), it would be surprising for two contexts
+		 * to use the same engine. So let's create a copy of
+		 * the virtual engine instead.
+		 */
+		if (intel_engine_is_virtual(engine))
+			clone->engines[n] =
+				intel_execlists_clone_virtual(dst, engine);
+		else
+			clone->engines[n] = intel_context_create(dst, engine);
+		if (IS_ERR_OR_NULL(clone->engines[n])) {
 			free_engines_n(clone, n);
 			goto err_unlock;
 		}
+
 	}
 	clone->num_engines = n;
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index e0f609d01564..8cff4f6d6158 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -247,17 +247,26 @@ sched_lock_engine(const struct i915_sched_node *node,
 		  struct intel_engine_cs *locked,
 		  struct sched_cache *cache)
 {
-	struct intel_engine_cs *engine = node_to_request(node)->engine;
+	const struct i915_request *rq = node_to_request(node);
+	struct intel_engine_cs *engine;
 
 	GEM_BUG_ON(!locked);
 
-	if (engine != locked) {
+	/*
+	 * Virtual engines complicate acquiring the engine timeline lock,
+	 * as their rq->engine pointer is not stable until under that
+	 * engine lock. The simple ploy we use is to take the lock then
+	 * check that the rq still belongs to the newly locked engine.
+	 */
+	while (locked != (engine = READ_ONCE(rq->engine))) {
 		spin_unlock(&locked->timeline.lock);
 		memset(cache, 0, sizeof(*cache));
 		spin_lock(&engine->timeline.lock);
+		locked = engine;
 	}
 
-	return engine;
+	GEM_BUG_ON(locked != engine);
+	return locked;
 }
 
 static bool inflight(const struct i915_request *rq,
@@ -370,8 +379,11 @@ static void __i915_schedule(struct i915_request *rq,
 		if (prio <= node->attr.priority || node_signaled(node))
 			continue;
 
+		GEM_BUG_ON(node_to_request(node)->engine != engine);
+
 		node->attr.priority = prio;
 		if (!list_empty(&node->link)) {
+			GEM_BUG_ON(intel_engine_is_virtual(engine));
 			if (!cache.priolist)
 				cache.priolist =
 					i915_sched_lookup_priolist(engine,
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index 12ba3c573aa0..ccd7bea4a342 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -25,6 +25,7 @@ struct i915_timeline {
 	spinlock_t lock;
 #define TIMELINE_CLIENT 0 /* default subclass */
 #define TIMELINE_ENGINE 1
+#define TIMELINE_VIRTUAL 2
 	struct mutex mutex; /* protects the flow of requests */
 
 	unsigned int pin_count;
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 88ed7ba8886f..322fbda65190 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -218,6 +218,7 @@ struct intel_engine_execlists {
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root_cached queue;
+	struct rb_root_cached virtual;
 
 	/**
 	 * @csb_write: control register for Context Switch buffer
@@ -423,6 +424,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_SUPPORTS_STATS   BIT(1)
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
+#define I915_ENGINE_IS_VIRTUAL       BIT(4)
 	unsigned int flags;
 
 	/*
@@ -506,6 +508,12 @@ intel_engine_has_semaphores(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_HAS_SEMAPHORES;
 }
 
+static inline bool
+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
 #define instdone_slice_mask(dev_priv__) \
 	(IS_GEN(dev_priv__, 7) ? \
 	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5804703c9d97..9ea94524567f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -166,6 +166,41 @@
 
 #define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT | I915_PRIORITY_NOSEMAPHORE)
 
+struct virtual_engine {
+	struct intel_engine_cs base;
+	struct intel_context context;
+
+	/*
+	 * We allow only a single request through the virtual engine at a time
+	 * (each request in the timeline waits for the completion fence of
+	 * the previous before being submitted). By restricting ourselves to
+	 * only submitting a single request, each request is placed on to a
+	 * physical to maximise load spreading (by virtue of the late greedy
+	 * scheduling -- each real engine takes the next available request
+	 * upon idling).
+	 */
+	struct i915_request *request;
+
+	/*
+	 * We keep an rbtree of available virtual engines inside each physical
+	 * engine, sorted by priority. Here we preallocate the nodes we need
+	 * for the virtual engine, indexed by physical_engine->id.
+	 */
+	struct ve_node {
+		struct rb_node rb;
+		int prio;
+	} nodes[I915_NUM_ENGINES];
+
+	/* And finally, which physical engines this virtual engine maps onto. */
+	unsigned int count;
+	struct intel_engine_cs *siblings[0];
+};
+
+static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
+{
+	return container_of(engine, struct virtual_engine, base);
+}
+
 static int execlists_context_deferred_alloc(struct intel_context *ce,
 					    struct intel_engine_cs *engine);
 static void execlists_init_reg_state(u32 *reg_state,
@@ -229,7 +264,8 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 }
 
 static inline bool need_preempt(const struct intel_engine_cs *engine,
-				const struct i915_request *rq)
+				const struct i915_request *rq,
+				struct rb_node *rb)
 {
 	int last_prio;
 
@@ -264,6 +300,22 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	    rq_prio(list_next_entry(rq, link)) > last_prio)
 		return true;
 
+	if (rb) { /* XXX virtual precedence */
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		bool preempt = false;
+
+		if (engine == ve->siblings[0]) { /* only preempt one sibling */
+			spin_lock(&ve->base.timeline.lock);
+			if (ve->request)
+				preempt = rq_prio(ve->request) > last_prio;
+			spin_unlock(&ve->base.timeline.lock);
+		}
+
+		if (preempt)
+			return preempt;
+	}
+
 	/*
 	 * If the inflight context did not trigger the preemption, then maybe
 	 * it was the set of queued requests? Pick the highest priority in
@@ -382,6 +434,8 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	list_for_each_entry_safe_reverse(rq, rn,
 					 &engine->timeline.requests,
 					 link) {
+		struct intel_engine_cs *owner;
+
 		if (i915_request_completed(rq))
 			break;
 
@@ -390,14 +444,30 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 
 		GEM_BUG_ON(rq->hw_context->active);
 
-		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-		if (rq_prio(rq) != prio) {
-			prio = rq_prio(rq);
-			pl = i915_sched_lookup_priolist(engine, prio);
-		}
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+		/*
+		 * Push the request back into the queue for later resubmission.
+		 * If this request is not native to this physical engine (i.e.
+		 * it came from a virtual source), push it back onto the virtual
+		 * engine so that it can be moved across onto another physical
+		 * engine as load dictates.
+		 */
+		owner = rq->hw_context->engine;
+		if (likely(owner == engine)) {
+			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+			if (rq_prio(rq) != prio) {
+				prio = rq_prio(rq);
+				pl = i915_sched_lookup_priolist(engine, prio);
+			}
+			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-		list_add(&rq->sched.link, pl);
+			list_add(&rq->sched.link, pl);
+		} else {
+			if (__i915_request_has_started(rq))
+				rq->sched.attr.priority |= ACTIVE_PRIORITY;
+
+			rq->engine = owner;
+			owner->submit_request(rq);
+		}
 
 		active = rq;
 	}
@@ -659,6 +729,72 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
 						  execlists));
 }
 
+static void virtual_update_register_offsets(u32 *regs,
+					    struct intel_engine_cs *engine)
+{
+	u32 base = engine->mmio_base;
+
+	regs[CTX_CONTEXT_CONTROL] =
+		i915_mmio_reg_offset(RING_CONTEXT_CONTROL(engine));
+	regs[CTX_RING_HEAD] = i915_mmio_reg_offset(RING_HEAD(base));
+	regs[CTX_RING_TAIL] = i915_mmio_reg_offset(RING_TAIL(base));
+	regs[CTX_RING_BUFFER_START] = i915_mmio_reg_offset(RING_START(base));
+	regs[CTX_RING_BUFFER_CONTROL] = i915_mmio_reg_offset(RING_CTL(base));
+
+	regs[CTX_BB_HEAD_U] = i915_mmio_reg_offset(RING_BBADDR_UDW(base));
+	regs[CTX_BB_HEAD_L] = i915_mmio_reg_offset(RING_BBADDR(base));
+	regs[CTX_BB_STATE] = i915_mmio_reg_offset(RING_BBSTATE(base));
+	regs[CTX_SECOND_BB_HEAD_U] =
+		i915_mmio_reg_offset(RING_SBBADDR_UDW(base));
+	regs[CTX_SECOND_BB_HEAD_L] = i915_mmio_reg_offset(RING_SBBADDR(base));
+	regs[CTX_SECOND_BB_STATE] = i915_mmio_reg_offset(RING_SBBSTATE(base));
+
+	regs[CTX_CTX_TIMESTAMP] =
+		i915_mmio_reg_offset(RING_CTX_TIMESTAMP(base));
+	regs[CTX_PDP3_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 3));
+	regs[CTX_PDP3_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 3));
+	regs[CTX_PDP2_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 2));
+	regs[CTX_PDP2_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 2));
+	regs[CTX_PDP1_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 1));
+	regs[CTX_PDP1_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 1));
+	regs[CTX_PDP0_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 0));
+	regs[CTX_PDP0_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 0));
+
+	if (engine->class == RENDER_CLASS) {
+		regs[CTX_RCS_INDIRECT_CTX] =
+			i915_mmio_reg_offset(RING_INDIRECT_CTX(base));
+		regs[CTX_RCS_INDIRECT_CTX_OFFSET] =
+			i915_mmio_reg_offset(RING_INDIRECT_CTX_OFFSET(base));
+		regs[CTX_BB_PER_CTX_PTR] =
+			i915_mmio_reg_offset(RING_BB_PER_CTX_PTR(base));
+
+		regs[CTX_R_PWR_CLK_STATE] =
+			i915_mmio_reg_offset(GEN8_R_PWR_CLK_STATE);
+	}
+}
+
+static bool virtual_matches(const struct virtual_engine *ve,
+			    const struct i915_request *rq,
+			    const struct intel_engine_cs *engine)
+{
+	const struct intel_engine_cs *active;
+
+	/*
+	 * We track when the HW has completed saving the context image
+	 * (i.e. when we have seen the final CS event switching out of
+	 * the context) and must not overwrite the context image before
+	 * then. This restricts us to only using the active engine
+	 * while the previous virtualized request is inflight (so
+	 * we reuse the register offsets). This is a very small
+	 * hysteresis on the greedy selection algorithm.
+	 */
+	active = READ_ONCE(ve->context.active);
+	if (active && active != engine)
+		return false;
+
+	return true;
+}
+
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -691,6 +827,26 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * and context switches) submission.
 	 */
 
+	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (!rq) { /* lazily cleanup after another engine handled rq */
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		if (!virtual_matches(ve, rq, engine)) {
+			rb = rb_next(rb);
+			continue;
+		}
+
+		break;
+	}
+
 	if (last) {
 		/*
 		 * Don't resubmit or switch until all outstanding
@@ -712,7 +868,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
 			return;
 
-		if (need_preempt(engine, last)) {
+		if (need_preempt(engine, last, rb)) {
 			inject_preempt_context(engine);
 			return;
 		}
@@ -752,6 +908,89 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		last->tail = last->wa_tail;
 	}
 
+	while (rb) { /* XXX virtual is always taking precedence */
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq;
+
+		spin_lock(&ve->base.timeline.lock);
+
+		rq = ve->request;
+		if (unlikely(!rq)) { /* lost the race to a sibling */
+			spin_unlock(&ve->base.timeline.lock);
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		GEM_BUG_ON(rq != ve->request);
+		GEM_BUG_ON(rq->engine != &ve->base);
+		GEM_BUG_ON(rq->hw_context != &ve->context);
+
+		if (rq_prio(rq) >= queue_prio(execlists)) {
+			if (!virtual_matches(ve, rq, engine)) {
+				spin_unlock(&ve->base.timeline.lock);
+				rb = rb_next(rb);
+				continue;
+			}
+
+			if (last && !can_merge_rq(last, rq)) {
+				spin_unlock(&ve->base.timeline.lock);
+				return; /* leave this rq for another engine */
+			}
+
+			GEM_TRACE("%s: virtual rq=%llx:%lld%s, new engine? %s\n",
+				  engine->name,
+				  rq->fence.context,
+				  rq->fence.seqno,
+				  i915_request_completed(rq) ? "!" :
+				  i915_request_started(rq) ? "*" :
+				  "",
+				  yesno(engine != ve->siblings[0]));
+
+			ve->request = NULL;
+			ve->base.execlists.queue_priority_hint = INT_MIN;
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+
+			rq->engine = engine;
+
+			if (engine != ve->siblings[0]) {
+				u32 *regs = ve->context.lrc_reg_state;
+				unsigned int n;
+
+				GEM_BUG_ON(READ_ONCE(ve->context.active));
+				virtual_update_register_offsets(regs, engine);
+
+				/*
+				 * Move the bound engine to the top of the list
+				 * for future execution. We then kick this
+				 * tasklet first before checking others, so that
+				 * we preferentially reuse this set of bound
+				 * registers.
+				 */
+				for (n = 1; n < ve->count; n++) {
+					if (ve->siblings[n] == engine) {
+						swap(ve->siblings[n],
+						     ve->siblings[0]);
+						break;
+					}
+				}
+
+				GEM_BUG_ON(ve->siblings[0] != engine);
+			}
+
+			__i915_request_submit(rq);
+			trace_i915_request_in(rq, port_index(port, execlists));
+			submit = true;
+			last = rq;
+		}
+
+		spin_unlock(&ve->base.timeline.lock);
+		break;
+	}
+
 	while ((rb = rb_first_cached(&execlists->queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 		struct i915_request *rq, *rn;
@@ -971,6 +1210,24 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 		i915_priolist_free(p);
 	}
 
+	/* Cancel all attached virtual engines */
+	while ((rb = rb_first_cached(&execlists->virtual))) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+
+		rb_erase_cached(rb, &execlists->virtual);
+		RB_CLEAR_NODE(rb);
+
+		spin_lock(&ve->base.timeline.lock);
+		if (ve->request) {
+			__i915_request_submit(ve->request);
+			dma_fence_set_error(&ve->request->fence, -EIO);
+			i915_request_mark_complete(ve->request);
+			ve->request = NULL;
+		}
+		spin_unlock(&ve->base.timeline.lock);
+	}
+
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
@@ -2861,6 +3118,300 @@ void intel_lr_context_resume(struct drm_i915_private *i915)
 	}
 }
 
+static void virtual_context_destroy(struct kref *kref)
+{
+	struct virtual_engine *ve =
+		container_of(kref, typeof(*ve), context.ref);
+	unsigned int n;
+
+	GEM_BUG_ON(ve->request);
+	GEM_BUG_ON(ve->context.active);
+
+	for (n = 0; n < ve->count; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct rb_node *node = &ve->nodes[sibling->id].rb;
+
+		if (RB_EMPTY_NODE(node))
+			continue;
+
+		spin_lock_irq(&sibling->timeline.lock);
+
+		if (!RB_EMPTY_NODE(node))
+			rb_erase_cached(node, &sibling->execlists.virtual);
+
+		spin_unlock_irq(&sibling->timeline.lock);
+	}
+	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
+
+	if (ve->context.state)
+		__execlists_context_fini(&ve->context);
+
+	i915_timeline_fini(&ve->base.timeline);
+	kfree(ve);
+}
+
+static void virtual_engine_initial_hint(struct virtual_engine *ve)
+{
+	int swp;
+
+	/*
+	 * Pick a random sibling on starting to help spread the load around.
+	 *
+	 * New contexts are typically created with exactly the same order
+	 * of siblings, and often started in batches. Due to the way we iterate
+	 * the array of sibling when submitting requests, sibling[0] is
+	 * prioritised for dequeuing. If we make sure that sibling[0] is fairly
+	 * randomised across the system, we also help spread the load by the
+	 * first engine we inspect being different each time.
+	 *
+	 * NB This does not force us to execute on this engine, it will just
+	 * typically be the first we inspect for submission.
+	 */
+	swp = prandom_u32_max(ve->count);
+	if (!swp)
+		return;
+
+	swap(ve->siblings[swp], ve->siblings[0]);
+	virtual_update_register_offsets(ve->context.lrc_reg_state,
+					ve->siblings[0]);
+}
+
+static int virtual_context_pin(struct intel_context *ce)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	int err;
+
+	/* Note: we must use a real engine class for setting up reg state */
+	err = __execlists_context_pin(ce, ve->siblings[0]);
+	if (err)
+		return err;
+
+	virtual_engine_initial_hint(ve);
+	return 0;
+}
+
+static const struct intel_context_ops virtual_context_ops = {
+	.pin = virtual_context_pin,
+	.unpin = execlists_context_unpin,
+
+	.destroy = virtual_context_destroy,
+};
+
+static void virtual_submission_tasklet(unsigned long data)
+{
+	struct virtual_engine * const ve = (struct virtual_engine *)data;
+	const int prio = ve->base.execlists.queue_priority_hint;
+	unsigned int n;
+
+	GEM_TRACE("%s: rq=%llx:%lld, prio=%d\n",
+		  ve->base.name,
+		  ve->request->fence.context,
+		  ve->request->fence.seqno,
+		  prio);
+
+	local_irq_disable();
+	for (n = 0; READ_ONCE(ve->request) && n < ve->count; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct ve_node * const node = &ve->nodes[sibling->id];
+		struct rb_node **parent, *rb;
+		bool first;
+
+		spin_lock(&sibling->timeline.lock);
+
+		if (!RB_EMPTY_NODE(&node->rb)) {
+			/*
+			 * Cheat and avoid rebalancing the tree if we can
+			 * reuse this node in situ.
+			 */
+			first = rb_first_cached(&sibling->execlists.virtual) ==
+				&node->rb;
+			if (prio == node->prio || (prio > node->prio && first))
+				goto submit_engine;
+
+			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
+		}
+
+		rb = NULL;
+		first = true;
+		parent = &sibling->execlists.virtual.rb_root.rb_node;
+		while (*parent) {
+			struct ve_node *other;
+
+			rb = *parent;
+			other = rb_entry(rb, typeof(*other), rb);
+			if (prio > other->prio) {
+				parent = &rb->rb_left;
+			} else {
+				parent = &rb->rb_right;
+				first = false;
+			}
+		}
+
+		rb_link_node(&node->rb, rb, parent);
+		rb_insert_color_cached(&node->rb,
+				       &sibling->execlists.virtual,
+				       first);
+
+submit_engine:
+		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
+		node->prio = prio;
+		if (first && prio > sibling->execlists.queue_priority_hint) {
+			sibling->execlists.queue_priority_hint = prio;
+			tasklet_hi_schedule(&sibling->execlists.tasklet);
+		}
+
+		spin_unlock(&sibling->timeline.lock);
+	}
+	local_irq_enable();
+}
+
+static void virtual_submit_request(struct i915_request *rq)
+{
+	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+
+	GEM_TRACE("%s: rq=%llx:%lld\n",
+		  ve->base.name,
+		  rq->fence.context,
+		  rq->fence.seqno);
+
+	GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
+
+	GEM_BUG_ON(ve->request);
+	ve->base.execlists.queue_priority_hint = rq_prio(rq);
+	WRITE_ONCE(ve->request, rq);
+
+	tasklet_schedule(&ve->base.execlists.tasklet);
+}
+
+struct intel_context *
+intel_execlists_create_virtual(struct i915_gem_context *ctx,
+			       struct intel_engine_cs **siblings,
+			       unsigned int count)
+{
+	struct virtual_engine *ve;
+	unsigned int n;
+	int err;
+
+	if (count == 0)
+		return ERR_PTR(-EINVAL);
+
+	if (count == 1)
+		return intel_context_create(ctx, siblings[0]);
+
+	ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
+	if (!ve)
+		return ERR_PTR(-ENOMEM);
+
+	ve->base.i915 = ctx->i915;
+	ve->base.id = -1;
+	ve->base.class = OTHER_CLASS;
+	ve->base.uabi_class = I915_ENGINE_CLASS_INVALID;
+	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
+	ve->base.flags = I915_ENGINE_IS_VIRTUAL;
+
+	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
+
+	err = i915_timeline_init(ctx->i915, &ve->base.timeline, NULL);
+	if (err)
+		goto err_put;
+	i915_timeline_set_subclass(&ve->base.timeline, TIMELINE_VIRTUAL);
+
+	ve->base.cops = &virtual_context_ops;
+	ve->base.request_alloc = execlists_request_alloc;
+
+	ve->base.schedule = i915_schedule;
+	ve->base.submit_request = virtual_submit_request;
+
+	ve->base.execlists.queue_priority_hint = INT_MIN;
+	tasklet_init(&ve->base.execlists.tasklet,
+		     virtual_submission_tasklet,
+		     (unsigned long)ve);
+
+	intel_context_init(&ve->context, ctx, &ve->base);
+
+	for (n = 0; n < count; n++) {
+		struct intel_engine_cs *sibling = siblings[n];
+
+		GEM_BUG_ON(!is_power_of_2(sibling->mask));
+		if (sibling->mask & ve->base.mask) {
+			DRM_DEBUG("duplicate %s entry in load balancer\n",
+				  sibling->name);
+			err = -EINVAL;
+			goto err_put;
+		}
+
+		/*
+		 * The virtual engine implementation is tightly coupled to
+		 * the execlists backend -- we push requests directly
+		 * into a tree inside each physical engine. We could support
+		 * layering if we handle cloning of the requests and
+		 * submitting a copy into each backend.
+		 */
+		if (sibling->execlists.tasklet.func !=
+		    execlists_submission_tasklet) {
+			err = -ENODEV;
+			goto err_put;
+		}
+
+		GEM_BUG_ON(RB_EMPTY_NODE(&ve->nodes[sibling->id].rb));
+		RB_CLEAR_NODE(&ve->nodes[sibling->id].rb);
+
+		ve->siblings[ve->count++] = sibling;
+		ve->base.mask |= sibling->mask;
+
+		/*
+		 * All physical engines must be compatible for their emission
+		 * functions (as we build the instructions during request
+		 * construction and do not alter them before submission
+		 * on the physical engine). We use the engine class as a guide
+		 * here, although that could be refined.
+		 */
+		if (ve->base.class != OTHER_CLASS) {
+			if (ve->base.class != sibling->class) {
+				DRM_DEBUG("invalid mixing of engine class, sibling %d, already %d\n",
+					  sibling->class, ve->base.class);
+				err = -EINVAL;
+				goto err_put;
+			}
+			continue;
+		}
+
+		ve->base.class = sibling->class;
+		snprintf(ve->base.name, sizeof(ve->base.name),
+			 "v%dx%d", ve->base.class, count);
+		ve->base.context_size = sibling->context_size;
+
+		ve->base.emit_bb_start = sibling->emit_bb_start;
+		ve->base.emit_flush = sibling->emit_flush;
+		ve->base.emit_init_breadcrumb = sibling->emit_init_breadcrumb;
+		ve->base.emit_fini_breadcrumb = sibling->emit_fini_breadcrumb;
+		ve->base.emit_fini_breadcrumb_dw =
+			sibling->emit_fini_breadcrumb_dw;
+	}
+
+	return &ve->context;
+
+err_put:
+	intel_context_put(&ve->context);
+	return ERR_PTR(err);
+}
+
+struct intel_context *
+intel_execlists_clone_virtual(struct i915_gem_context *ctx,
+			      struct intel_engine_cs *src)
+{
+	struct virtual_engine *se = to_virtual_engine(src);
+	struct intel_context *dst;
+
+	dst = intel_execlists_create_virtual(ctx,
+					     se->siblings,
+					     se->count);
+	if (IS_ERR(dst))
+		return dst;
+
+	return dst;
+}
+
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
@@ -2918,6 +3469,29 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 		show_request(m, last, "\t\tQ ");
 	}
 
+	last = NULL;
+	count = 0;
+	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (rq) {
+			if (count++ < max - 1)
+				show_request(m, rq, "\t\tV ");
+			else
+				last = rq;
+		}
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d virtual requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tV ");
+	}
+
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index c73fe820b05f..0a7d1ebbec2a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -113,6 +113,15 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 							const char *prefix),
 				   unsigned int max);
 
+struct intel_context *
+intel_execlists_create_virtual(struct i915_gem_context *ctx,
+			       struct intel_engine_cs **siblings,
+			       unsigned int count);
+
+struct intel_context *
+intel_execlists_clone_virtual(struct i915_gem_context *ctx,
+			      struct intel_engine_cs *src);
+
 u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 2deefc5845e4..83bcb3ca6501 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -1117,6 +1117,187 @@ static int live_preempt_smoke(void *arg)
 	return err;
 }
 
+static int nop_virtual_engine(struct drm_i915_private *i915,
+			      struct intel_engine_cs **siblings,
+			      unsigned int nsibling,
+			      unsigned int nctx,
+			      unsigned int flags)
+#define CHAIN BIT(0)
+{
+	IGT_TIMEOUT(end_time);
+	struct i915_request *request[16];
+	struct i915_gem_context *ctx[16];
+	struct intel_context *ve[16];
+	unsigned long n, prime, nc;
+	struct igt_live_test t;
+	ktime_t times[2] = {};
+	int err;
+
+	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx));
+
+	for (n = 0; n < nctx; n++) {
+		ctx[n] = kernel_context(i915);
+		if (!ctx[n]) {
+			err = -ENOMEM;
+			nctx = n;
+			goto out;
+		}
+
+		ve[n] = intel_execlists_create_virtual(ctx[n],
+						       siblings, nsibling);
+		if (IS_ERR(ve[n])) {
+			kernel_context_close(ctx[n]);
+			err = PTR_ERR(ve[n]);
+			nctx = n;
+			goto out;
+		}
+
+		err = intel_context_pin(ve[n]);
+		if (err) {
+			intel_context_put(ve[n]);
+			kernel_context_close(ctx[n]);
+			nctx = n;
+			goto out;
+		}
+	}
+
+	err = igt_live_test_begin(&t, i915, __func__, ve[0]->engine->name);
+	if (err)
+		goto out;
+
+	for_each_prime_number_from(prime, 1, 8192) {
+		times[1] = ktime_get_raw();
+
+		if (flags & CHAIN) {
+			for (nc = 0; nc < nctx; nc++) {
+				for (n = 0; n < prime; n++) {
+					request[nc] =
+						i915_request_create(ve[nc]);
+					if (IS_ERR(request[nc])) {
+						err = PTR_ERR(request[nc]);
+						goto out;
+					}
+
+					i915_request_add(request[nc]);
+				}
+			}
+		} else {
+			for (n = 0; n < prime; n++) {
+				for (nc = 0; nc < nctx; nc++) {
+					request[nc] =
+						i915_request_create(ve[nc]);
+					if (IS_ERR(request[nc])) {
+						err = PTR_ERR(request[nc]);
+						goto out;
+					}
+
+					i915_request_add(request[nc]);
+				}
+			}
+		}
+
+		for (nc = 0; nc < nctx; nc++) {
+			if (i915_request_wait(request[nc],
+					      I915_WAIT_LOCKED,
+					      HZ / 10) < 0) {
+				pr_err("%s(%s): wait for %llx:%lld timed out\n",
+				       __func__, ve[0]->engine->name,
+				       request[nc]->fence.context,
+				       request[nc]->fence.seqno);
+
+				GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
+					  __func__, ve[0]->engine->name,
+					  request[nc]->fence.context,
+					  request[nc]->fence.seqno);
+				GEM_TRACE_DUMP();
+				i915_gem_set_wedged(i915);
+				break;
+			}
+		}
+
+		times[1] = ktime_sub(ktime_get_raw(), times[1]);
+		if (prime == 1)
+			times[0] = times[1];
+
+		if (__igt_timeout(end_time, NULL))
+			break;
+	}
+
+	err = igt_live_test_end(&t);
+	if (err)
+		goto out;
+
+	pr_info("Requestx%d latencies on %s: 1 = %lluns, %lu = %lluns\n",
+		nctx, ve[0]->engine->name, ktime_to_ns(times[0]),
+		prime, div64_u64(ktime_to_ns(times[1]), prime));
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	for (nc = 0; nc < nctx; nc++) {
+		intel_context_unpin(ve[nc]);
+		intel_context_put(ve[nc]);
+		kernel_context_close(ctx[nc]);
+	}
+	return err;
+}
+
+static int live_virtual_engine(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned int class, inst;
+	int err = -ENODEV;
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0;
+
+	i915_gem_unpark(i915);
+	mutex_lock(&i915->drm.struct_mutex);
+
+	for_each_engine(engine, i915, id) {
+		err = nop_virtual_engine(i915, &engine, 1, 1, 0);
+		if (err) {
+			pr_err("Failed to wrap engine %s: err=%d\n",
+			       engine->name, err);
+			goto out_unlock;
+		}
+	}
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		int nsibling, n;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!i915->engine_class[class][inst])
+				break;
+
+			siblings[nsibling++] = i915->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (n = 1; n <= nsibling + 1; n++) {
+			err = nop_virtual_engine(i915, siblings, nsibling,
+						 n, 0);
+			if (err)
+				goto out_unlock;
+		}
+
+		err = nop_virtual_engine(i915, siblings, nsibling, n, CHAIN);
+		if (err)
+			goto out_unlock;
+	}
+
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -1128,6 +1309,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_chain_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
+		SUBTEST(live_virtual_engine),
 	};
 
 	if (!HAS_EXECLISTS(i915))
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 303d3dd980e4..4f49a0c07ab7 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -127,6 +127,7 @@ enum drm_i915_gem_engine_class {
 };
 
 #define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL 0
 
 /**
  * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
@@ -1597,8 +1598,37 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * A virtual engine is created in the I915_EXEC_DEFAULT slot [0]; when
+ * used, it will proxy the execbuffer request onto one of the set of
+ * engines in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 mbz16; /* reserved for future use; must be zero */
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 engines_mask; /* selection mask of engines[] */
+
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+};
+
 struct i915_context_param_engines {
 	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0
 
 	struct {
 		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
-- 
2.20.1
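
A hedged sketch of how userspace might chain the load-balance extension
above into its engine map; the ioctl plumbing is elided, "engines" is an
assumed pre-populated i915_context_param_engines, and the engine_index
and engines_mask values are illustrative, not taken from the patch:

	struct i915_context_engines_load_balance balance = {
		.base = { .name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE },
		.engine_index = 0,   /* place the virtual engine in slot [0] */
		.engines_mask = 0x3, /* choose siblings out of engines[] by bit */
	};

	/* chained off i915_context_param_engines.extensions */
	engines.extensions = (__u64)(uintptr_t)&balance;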


* [PATCH 19/22] drm/i915: Extend execution fence to support a callback
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (17 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 18/22] drm/i915: Load balancing across a virtual engine Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 20/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h Chris Wilson
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

In the next patch, we will want to configure the slave request
depending on which physical engine the master request is executed on.
For this, we introduce a callback from the execute fence to convey this
information.
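
A rough sketch of the hook's shape (bond_cb and master are hypothetical
names here; the real consumer is the bonding patch later in the series):

	static void bond_cb(struct i915_request *rq, struct dma_fence *signal)
	{
		/*
		 * Invoked once @signal has started executing, at which
		 * point to_request(signal)->engine identifies the physical
		 * engine the signaler actually landed on.
		 */
	}

	err = i915_request_await_execution(rq, &master->fence, bond_cb);
	if (err)
		return err;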

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 84 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_request.h |  4 ++
 2 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 4594b946f4c3..6ed4f5efc8bf 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -39,6 +39,8 @@ struct execute_cb {
 	struct list_head link;
 	struct irq_work work;
 	struct i915_sw_fence *fence;
+	void (*hook)(struct i915_request *rq, struct dma_fence *signal);
+	struct i915_request *signal;
 };
 
 static struct i915_global_request {
@@ -330,6 +332,17 @@ static void irq_execute_cb(struct irq_work *wrk)
 	kmem_cache_free(global.slab_execute_cbs, cb);
 }
 
+static void irq_execute_cb_hook(struct irq_work *wrk)
+{
+	struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
+
+	cb->hook(container_of(cb->fence, struct i915_request, submit),
+		 &cb->signal->fence);
+	i915_request_put(cb->signal);
+
+	irq_execute_cb(wrk);
+}
+
 static void __notify_execute_cb(struct i915_request *rq)
 {
 	struct execute_cb *cb;
@@ -356,14 +369,19 @@ static void __notify_execute_cb(struct i915_request *rq)
 }
 
 static int
-i915_request_await_execution(struct i915_request *rq,
-			     struct i915_request *signal,
-			     gfp_t gfp)
+__i915_request_await_execution(struct i915_request *rq,
+			       struct i915_request *signal,
+			       void (*hook)(struct i915_request *rq,
+					    struct dma_fence *signal),
+			       gfp_t gfp)
 {
 	struct execute_cb *cb;
 
-	if (i915_request_is_active(signal))
+	if (i915_request_is_active(signal)) {
+		if (hook)
+			hook(rq, &signal->fence);
 		return 0;
+	}
 
 	cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
 	if (!cb)
@@ -373,8 +391,18 @@ i915_request_await_execution(struct i915_request *rq,
 	i915_sw_fence_await(cb->fence);
 	init_irq_work(&cb->work, irq_execute_cb);
 
+	if (hook) {
+		cb->hook = hook;
+		cb->signal = i915_request_get(signal);
+		cb->work.func = irq_execute_cb_hook;
+	}
+
 	spin_lock_irq(&signal->lock);
 	if (i915_request_is_active(signal)) {
+		if (hook) {
+			hook(rq, &signal->fence);
+			i915_request_put(signal);
+		}
 		i915_sw_fence_complete(cb->fence);
 		kmem_cache_free(global.slab_execute_cbs, cb);
 	} else {
@@ -744,7 +772,7 @@ emit_semaphore_wait(struct i915_request *to,
 		return err;
 
 	/* Only submit our spinner after the signaler is running! */
-	err = i915_request_await_execution(to, from, gfp);
+	err = __i915_request_await_execution(to, from, NULL, gfp);
 	if (err)
 		return err;
 
@@ -864,6 +892,52 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 	return 0;
 }
 
+int
+i915_request_await_execution(struct i915_request *rq,
+			     struct dma_fence *fence,
+			     void (*hook)(struct i915_request *rq,
+					  struct dma_fence *signal))
+{
+	struct dma_fence **child = &fence;
+	unsigned int nchild = 1;
+	int ret;
+
+	if (dma_fence_is_array(fence)) {
+		struct dma_fence_array *array = to_dma_fence_array(fence);
+
+		/* XXX Error for signal-on-any fence arrays */
+
+		child = array->fences;
+		nchild = array->num_fences;
+		GEM_BUG_ON(!nchild);
+	}
+
+	do {
+		fence = *child++;
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+			continue;
+
+		/*
+		 * We don't squash repeated fence dependencies here as we
+		 * want to run our callback in all cases.
+		 */
+
+		if (dma_fence_is_i915(fence))
+			ret = __i915_request_await_execution(rq,
+							     to_request(fence),
+							     hook,
+							     I915_FENCE_GFP);
+		else
+			ret = i915_sw_fence_await_dma_fence(&rq->submit, fence,
+							    I915_FENCE_TIMEOUT,
+							    GFP_KERNEL);
+		if (ret < 0)
+			return ret;
+	} while (--nchild);
+
+	return 0;
+}
+
 /**
  * i915_request_await_object - set this request to (async) wait upon a bo
  * @to: request we are wishing to use
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 7390e1f9a8cb..adaabfcf97da 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -265,6 +265,10 @@ int i915_request_await_object(struct i915_request *to,
 			      bool write);
 int i915_request_await_dma_fence(struct i915_request *rq,
 				 struct dma_fence *fence);
+int i915_request_await_execution(struct i915_request *rq,
+				 struct dma_fence *fence,
+				 void (*hook)(struct i915_request *rq,
+					      struct dma_fence *signal));
 
 void i915_request_add(struct i915_request *rq);
 
-- 
2.20.1


* [PATCH 20/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (18 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 19/22] drm/i915: Extend execution fence to support a callback Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 21/22] drm/i915/execlists: Virtual engine bonding Chris Wilson
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

We want to use intel_engine_mask_t inside i915_request.h, which means
extracting it from the general header file mess and placing it inside a
types.h. A knock-on effect is that the compiler wants to warn about
type contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
for the worst.
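
A minimal illustration of the contraction in question, using the
definitions this patch adds to intel_engine_types.h; the explicit cast
inside ALL_ENGINES is what keeps the truncation warning quiet:

	typedef u8 intel_engine_mask_t;
	#define ALL_ENGINES ((intel_engine_mask_t)~0ul)

	intel_engine_mask_t mask = ALL_ENGINES; /* ~0ul truncated to 0xff */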

v2: Use intel_engine_mask_t consistently

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |  1 +
 drivers/gpu/drm/i915/gvt/execlist.c           | 11 ++-
 drivers/gpu/drm/i915/gvt/execlist.h           |  2 +-
 drivers/gpu/drm/i915/gvt/gvt.h                |  8 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  2 +-
 drivers/gpu/drm/i915/gvt/scheduler.c          |  8 +-
 drivers/gpu/drm/i915/gvt/scheduler.h          |  6 +-
 drivers/gpu/drm/i915/gvt/vgpu.c               |  4 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  2 +-
 drivers/gpu/drm/i915/i915_drv.h               |  1 -
 drivers/gpu/drm/i915/i915_gem_context.c       |  6 +-
 drivers/gpu/drm/i915/i915_gem_context.h       |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  9 +-
 drivers/gpu/drm/i915/i915_gpu_error.h         |  2 +-
 drivers/gpu/drm/i915/i915_reset.c             | 43 ++++----
 drivers/gpu/drm/i915/i915_reset.h             |  9 +-
 drivers/gpu/drm/i915/i915_scheduler.h         | 86 +---------------
 drivers/gpu/drm/i915/i915_scheduler_types.h   | 98 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_timeline.h          |  1 +
 drivers/gpu/drm/i915/i915_timeline_types.h    |  3 +-
 drivers/gpu/drm/i915/intel_device_info.h      |  3 +-
 drivers/gpu/drm/i915/intel_engine_types.h     |  8 +-
 drivers/gpu/drm/i915/intel_hangcheck.c        |  2 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c |  8 +-
 .../gpu/drm/i915/selftests/igt_gem_utils.c    |  1 +
 .../gpu/drm/i915/selftests/intel_hangcheck.c  |  3 +-
 .../test_i915_scheduler_types_standalone.c    |  7 ++
 28 files changed, 189 insertions(+), 149 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_scheduler_types.h
 create mode 100644 drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 688558f935be..161f4ee38c77 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -62,6 +62,7 @@ i915-$(CONFIG_DRM_I915_WERROR) += \
 	test_i915_active_types_standalone.o \
 	test_i915_gem_context_types_standalone.o \
 	test_i915_gem_pm_standalone.o \
+	test_i915_scheduler_types_standalone.o \
 	test_i915_timeline_types_standalone.o \
 	test_intel_context_types_standalone.o \
 	test_intel_engine_types_standalone.o \
diff --git a/drivers/gpu/drm/i915/gvt/execlist.c b/drivers/gpu/drm/i915/gvt/execlist.c
index 1a93472cb34e..f21b8fb5b37e 100644
--- a/drivers/gpu/drm/i915/gvt/execlist.c
+++ b/drivers/gpu/drm/i915/gvt/execlist.c
@@ -526,12 +526,13 @@ static void init_vgpu_execlist(struct intel_vgpu *vgpu, int ring_id)
 	vgpu_vreg(vgpu, ctx_status_ptr_reg) = ctx_status_ptr.dw;
 }
 
-static void clean_execlist(struct intel_vgpu *vgpu, unsigned long engine_mask)
+static void clean_execlist(struct intel_vgpu *vgpu,
+			   intel_engine_mask_t engine_mask)
 {
-	unsigned int tmp;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	struct intel_engine_cs *engine;
 	struct intel_vgpu_submission *s = &vgpu->submission;
+	intel_engine_mask_t tmp;
 
 	for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
 		kfree(s->ring_scan_buffer[engine->id]);
@@ -541,18 +542,18 @@ static void clean_execlist(struct intel_vgpu *vgpu, unsigned long engine_mask)
 }
 
 static void reset_execlist(struct intel_vgpu *vgpu,
-		unsigned long engine_mask)
+			   intel_engine_mask_t engine_mask)
 {
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	struct intel_engine_cs *engine;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 
 	for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
 		init_vgpu_execlist(vgpu, engine->id);
 }
 
 static int init_execlist(struct intel_vgpu *vgpu,
-			 unsigned long engine_mask)
+			 intel_engine_mask_t engine_mask)
 {
 	reset_execlist(vgpu, engine_mask);
 	return 0;
diff --git a/drivers/gpu/drm/i915/gvt/execlist.h b/drivers/gpu/drm/i915/gvt/execlist.h
index 714d709829a2..5ccc2c695848 100644
--- a/drivers/gpu/drm/i915/gvt/execlist.h
+++ b/drivers/gpu/drm/i915/gvt/execlist.h
@@ -180,6 +180,6 @@ int intel_vgpu_init_execlist(struct intel_vgpu *vgpu);
 int intel_vgpu_submit_execlist(struct intel_vgpu *vgpu, int ring_id);
 
 void intel_vgpu_reset_execlist(struct intel_vgpu *vgpu,
-		unsigned long engine_mask);
+			       intel_engine_mask_t engine_mask);
 
 #endif /*_GVT_EXECLIST_H_*/
diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 1077f525d91d..f45e5cd57ec4 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -144,9 +144,9 @@ enum {
 
 struct intel_vgpu_submission_ops {
 	const char *name;
-	int (*init)(struct intel_vgpu *vgpu, unsigned long engine_mask);
-	void (*clean)(struct intel_vgpu *vgpu, unsigned long engine_mask);
-	void (*reset)(struct intel_vgpu *vgpu, unsigned long engine_mask);
+	int (*init)(struct intel_vgpu *vgpu, intel_engine_mask_t engine_mask);
+	void (*clean)(struct intel_vgpu *vgpu, intel_engine_mask_t engine_mask);
+	void (*reset)(struct intel_vgpu *vgpu, intel_engine_mask_t engine_mask);
 };
 
 struct intel_vgpu_submission {
@@ -488,7 +488,7 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
 void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_release_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
-				 unsigned int engine_mask);
+				 intel_engine_mask_t engine_mask);
 void intel_gvt_reset_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_activate_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_deactivate_vgpu(struct intel_vgpu *vgpu);
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index b596cb42e24e..5d44db21acc4 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -311,7 +311,7 @@ static int mul_force_wake_write(struct intel_vgpu *vgpu,
 static int gdrst_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 			    void *p_data, unsigned int bytes)
 {
-	unsigned int engine_mask = 0;
+	intel_engine_mask_t engine_mask = 0;
 	u32 data;
 
 	write_vreg(vgpu, offset, p_data, bytes);
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 822b606b3078..1b5b6670b268 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -831,13 +831,13 @@ static void update_guest_context(struct intel_vgpu_workload *workload)
 }
 
 void intel_vgpu_clean_workloads(struct intel_vgpu *vgpu,
-				unsigned long engine_mask)
+				intel_engine_mask_t engine_mask)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	struct intel_engine_cs *engine;
 	struct intel_vgpu_workload *pos, *n;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 
 	/* free the unsubmited workloads in the queues. */
 	for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
@@ -1135,7 +1135,7 @@ void intel_vgpu_clean_submission(struct intel_vgpu *vgpu)
  *
  */
 void intel_vgpu_reset_submission(struct intel_vgpu *vgpu,
-		unsigned long engine_mask)
+				 intel_engine_mask_t engine_mask)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
 
@@ -1250,7 +1250,7 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
  *
  */
 int intel_vgpu_select_submission_ops(struct intel_vgpu *vgpu,
-				     unsigned long engine_mask,
+				     intel_engine_mask_t engine_mask,
 				     unsigned int interface)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.h b/drivers/gpu/drm/i915/gvt/scheduler.h
index 0635b2c4bed7..90c6756f5453 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.h
+++ b/drivers/gpu/drm/i915/gvt/scheduler.h
@@ -142,12 +142,12 @@ void intel_gvt_wait_vgpu_idle(struct intel_vgpu *vgpu);
 int intel_vgpu_setup_submission(struct intel_vgpu *vgpu);
 
 void intel_vgpu_reset_submission(struct intel_vgpu *vgpu,
-				 unsigned long engine_mask);
+				 intel_engine_mask_t engine_mask);
 
 void intel_vgpu_clean_submission(struct intel_vgpu *vgpu);
 
 int intel_vgpu_select_submission_ops(struct intel_vgpu *vgpu,
-				     unsigned long engine_mask,
+				     intel_engine_mask_t engine_mask,
 				     unsigned int interface);
 
 extern const struct intel_vgpu_submission_ops
@@ -160,6 +160,6 @@ intel_vgpu_create_workload(struct intel_vgpu *vgpu, int ring_id,
 void intel_vgpu_destroy_workload(struct intel_vgpu_workload *workload);
 
 void intel_vgpu_clean_workloads(struct intel_vgpu *vgpu,
-				unsigned long engine_mask);
+				intel_engine_mask_t engine_mask);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 314e40121e47..44ce3c2b9ac1 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -526,11 +526,11 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
  * GPU engines. For FLR, engine_mask is ignored.
  */
 void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
-				 unsigned int engine_mask)
+				 intel_engine_mask_t engine_mask)
 {
 	struct intel_gvt *gvt = vgpu->gvt;
 	struct intel_gvt_workload_scheduler *scheduler = &gvt->scheduler;
-	unsigned int resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
+	intel_engine_mask_t resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
 
 	gvt_dbg_core("------------------------------------------\n");
 	gvt_dbg_core("resseting vgpu%d, dmlr %d, engine_mask %08x\n",
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 98ff1a14ccf3..d3eb71982160 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2246,7 +2246,7 @@ static int i915_guc_stage_pool(struct seq_file *m, void *data)
 	const struct intel_guc *guc = &dev_priv->guc;
 	struct guc_stage_desc *desc = guc->stage_desc_pool_vaddr;
 	struct intel_guc_client *client = guc->execbuf_client;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 	int index;
 
 	if (!USES_GUC_SUBMISSION(dev_priv))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7c7afe99986c..a228bf69d3f6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2451,7 +2451,6 @@ static inline unsigned int i915_sg_segment_size(void)
 #define IS_GEN9_LP(dev_priv)	(IS_GEN(dev_priv, 9) && IS_LP(dev_priv))
 #define IS_GEN9_BC(dev_priv)	(IS_GEN(dev_priv, 9) && !IS_LP(dev_priv))
 
-#define ALL_ENGINES	(~0u)
 #define HAS_ENGINE(dev_priv, id) (INTEL_INFO(dev_priv)->engine_mask & BIT(id))
 
 #define HAS_LLC(dev_priv)	(INTEL_INFO(dev_priv)->has_llc)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 35b2c24a7e30..43b2aa2c1bb9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -934,9 +934,9 @@ static void cb_retire(struct i915_active *base)
 	kfree(cb);
 }
 
-I915_SELFTEST_DECLARE(static unsigned long context_barrier_inject_fault);
+I915_SELFTEST_DECLARE(static intel_engine_mask_t context_barrier_inject_fault);
 static int context_barrier_task(struct i915_gem_context *ctx,
-				unsigned long engines,
+				intel_engine_mask_t engines,
 				int (*emit)(struct i915_request *rq, void *data),
 				void (*task)(void *data),
 				void *data)
@@ -1000,7 +1000,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 }
 
 int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
-				      unsigned long mask)
+				      intel_engine_mask_t mask)
 {
 	struct intel_engine_cs *engine;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index f8d49d402371..d4f9e50be7e3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -160,7 +160,7 @@ void i915_gem_context_close(struct drm_file *file);
 
 int i915_switch_context(struct i915_request *rq);
 int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
-				      unsigned long engine_mask);
+				      intel_engine_mask_t engine_mask);
 
 void i915_gem_context_release(struct kref *ctx_ref);
 struct i915_gem_context *
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 83ded9fc761a..f597f35b109b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -390,7 +390,7 @@ struct i915_hw_ppgtt {
 	struct i915_address_space vm;
 	struct kref ref;
 
-	unsigned long pd_dirty_engines;
+	intel_engine_mask_t pd_dirty_engines;
 	union {
 		struct i915_pml4 pml4;		/* GEN8+ & 48b PPGTT */
 		struct i915_page_directory_pointer pdp;	/* GEN8+ */
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a9557f92756f..b101f037b61f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1093,7 +1093,7 @@ static u32 capture_error_bo(struct drm_i915_error_buffer *err,
  * It's only a small step better than a random number in its current form.
  */
 static u32 i915_error_generate_code(struct i915_gpu_state *error,
-				    unsigned long engine_mask)
+				    intel_engine_mask_t engine_mask)
 {
 	/*
 	 * IPEHR would be an ideal way to detect errors, as it's the gross
@@ -1638,7 +1638,8 @@ static void capture_reg_state(struct i915_gpu_state *error)
 }
 
 static const char *
-error_msg(struct i915_gpu_state *error, unsigned long engines, const char *msg)
+error_msg(struct i915_gpu_state *error,
+	  intel_engine_mask_t engines, const char *msg)
 {
 	int len;
 	int i;
@@ -1648,7 +1649,7 @@ error_msg(struct i915_gpu_state *error, unsigned long engines, const char *msg)
 			engines &= ~BIT(i);
 
 	len = scnprintf(error->error_msg, sizeof(error->error_msg),
-			"GPU HANG: ecode %d:%lx:0x%08x",
+			"GPU HANG: ecode %d:%x:0x%08x",
 			INTEL_GEN(error->i915), engines,
 			i915_error_generate_code(error, engines));
 	if (engines) {
@@ -1787,7 +1788,7 @@ i915_capture_gpu_state(struct drm_i915_private *i915)
  * to pick up.
  */
 void i915_capture_error_state(struct drm_i915_private *i915,
-			      unsigned long engine_mask,
+			      intel_engine_mask_t engine_mask,
 			      const char *msg)
 {
 	static bool warned;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 302a14240b45..5dc761e85d9d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -263,7 +263,7 @@ void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...);
 
 struct i915_gpu_state *i915_capture_gpu_state(struct drm_i915_private *i915);
 void i915_capture_error_state(struct drm_i915_private *dev_priv,
-			      unsigned long engine_mask,
+			      intel_engine_mask_t engine_mask,
 			      const char *error_msg);
 
 static inline struct i915_gpu_state *
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index e1f2cf64639a..476c5f3aaf0e 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -145,15 +145,15 @@ static void gen3_stop_engine(struct intel_engine_cs *engine)
 }
 
 static void i915_stop_engines(struct drm_i915_private *i915,
-			      unsigned int engine_mask)
+			      intel_engine_mask_t engine_mask)
 {
 	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
+	intel_engine_mask_t tmp;
 
 	if (INTEL_GEN(i915) < 3)
 		return;
 
-	for_each_engine_masked(engine, i915, engine_mask, id)
+	for_each_engine_masked(engine, i915, engine_mask, tmp)
 		gen3_stop_engine(engine);
 }
 
@@ -166,7 +166,7 @@ static bool i915_in_reset(struct pci_dev *pdev)
 }
 
 static int i915_do_reset(struct drm_i915_private *i915,
-			 unsigned int engine_mask,
+			 intel_engine_mask_t engine_mask,
 			 unsigned int retry)
 {
 	struct pci_dev *pdev = i915->drm.pdev;
@@ -195,7 +195,7 @@ static bool g4x_reset_complete(struct pci_dev *pdev)
 }
 
 static int g33_do_reset(struct drm_i915_private *i915,
-			unsigned int engine_mask,
+			intel_engine_mask_t engine_mask,
 			unsigned int retry)
 {
 	struct pci_dev *pdev = i915->drm.pdev;
@@ -205,7 +205,7 @@ static int g33_do_reset(struct drm_i915_private *i915,
 }
 
 static int g4x_do_reset(struct drm_i915_private *dev_priv,
-			unsigned int engine_mask,
+			intel_engine_mask_t engine_mask,
 			unsigned int retry)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
@@ -243,7 +243,7 @@ static int g4x_do_reset(struct drm_i915_private *dev_priv,
 }
 
 static int ironlake_do_reset(struct drm_i915_private *dev_priv,
-			     unsigned int engine_mask,
+			     intel_engine_mask_t engine_mask,
 			     unsigned int retry)
 {
 	int ret;
@@ -300,7 +300,7 @@ static int gen6_hw_domain_reset(struct drm_i915_private *dev_priv,
 }
 
 static int gen6_reset_engines(struct drm_i915_private *i915,
-			      unsigned int engine_mask,
+			      intel_engine_mask_t engine_mask,
 			      unsigned int retry)
 {
 	struct intel_engine_cs *engine;
@@ -316,7 +316,7 @@ static int gen6_reset_engines(struct drm_i915_private *i915,
 	if (engine_mask == ALL_ENGINES) {
 		hw_mask = GEN6_GRDOM_FULL;
 	} else {
-		unsigned int tmp;
+		intel_engine_mask_t tmp;
 
 		hw_mask = 0;
 		for_each_engine_masked(engine, i915, engine_mask, tmp) {
@@ -426,7 +426,7 @@ static void gen11_unlock_sfc(struct drm_i915_private *dev_priv,
 }
 
 static int gen11_reset_engines(struct drm_i915_private *i915,
-			       unsigned int engine_mask,
+			       intel_engine_mask_t engine_mask,
 			       unsigned int retry)
 {
 	const u32 hw_engine_mask[] = {
@@ -440,7 +440,7 @@ static int gen11_reset_engines(struct drm_i915_private *i915,
 		[VECS1] = GEN11_GRDOM_VECS2,
 	};
 	struct intel_engine_cs *engine;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 	u32 hw_mask;
 	int ret;
 
@@ -493,12 +493,12 @@ static void gen8_engine_reset_cancel(struct intel_engine_cs *engine)
 }
 
 static int gen8_reset_engines(struct drm_i915_private *i915,
-			      unsigned int engine_mask,
+			      intel_engine_mask_t engine_mask,
 			      unsigned int retry)
 {
 	struct intel_engine_cs *engine;
 	const bool reset_non_ready = retry >= 1;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 	int ret;
 
 	for_each_engine_masked(engine, i915, engine_mask, tmp) {
@@ -534,7 +534,7 @@ static int gen8_reset_engines(struct drm_i915_private *i915,
 }
 
 typedef int (*reset_func)(struct drm_i915_private *,
-			  unsigned int engine_mask,
+			  intel_engine_mask_t engine_mask,
 			  unsigned int retry);
 
 static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
@@ -555,7 +555,8 @@ static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
 		return NULL;
 }
 
-int intel_gpu_reset(struct drm_i915_private *i915, unsigned int engine_mask)
+int intel_gpu_reset(struct drm_i915_private *i915,
+		    intel_engine_mask_t engine_mask)
 {
 	const int retries = engine_mask == ALL_ENGINES ? RESET_MAX_RETRIES : 1;
 	reset_func reset;
@@ -689,7 +690,8 @@ static void gt_revoke(struct drm_i915_private *i915)
 	revoke_mmaps(i915);
 }
 
-static int gt_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
+static int gt_reset(struct drm_i915_private *i915,
+		    intel_engine_mask_t stalled_mask)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
@@ -947,7 +949,8 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	return result;
 }
 
-static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
+static int do_reset(struct drm_i915_private *i915,
+		    intel_engine_mask_t stalled_mask)
 {
 	int err, i;
 
@@ -982,7 +985,7 @@ static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
  *   - re-init display
  */
 void i915_reset(struct drm_i915_private *i915,
-		unsigned int stalled_mask,
+		intel_engine_mask_t stalled_mask,
 		const char *reason)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
@@ -1224,14 +1227,14 @@ void i915_clear_error_registers(struct drm_i915_private *dev_priv)
  * of a ring dump etc.).
  */
 void i915_handle_error(struct drm_i915_private *i915,
-		       u32 engine_mask,
+		       intel_engine_mask_t engine_mask,
 		       unsigned long flags,
 		       const char *fmt, ...)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
 	struct intel_engine_cs *engine;
 	intel_wakeref_t wakeref;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 	char error_msg[80];
 	char *msg = NULL;
 
diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/i915_reset.h
index 16f2389f656f..86b1ac8116ce 100644
--- a/drivers/gpu/drm/i915/i915_reset.h
+++ b/drivers/gpu/drm/i915/i915_reset.h
@@ -11,13 +11,15 @@
 #include <linux/types.h>
 #include <linux/srcu.h>
 
+#include "intel_engine_types.h"
+
 struct drm_i915_private;
 struct intel_engine_cs;
 struct intel_guc;
 
 __printf(4, 5)
 void i915_handle_error(struct drm_i915_private *i915,
-		       u32 engine_mask,
+		       intel_engine_mask_t engine_mask,
 		       unsigned long flags,
 		       const char *fmt, ...);
 #define I915_ERROR_CAPTURE BIT(0)
@@ -25,7 +27,7 @@ void i915_handle_error(struct drm_i915_private *i915,
 void i915_clear_error_registers(struct drm_i915_private *i915);
 
 void i915_reset(struct drm_i915_private *i915,
-		unsigned int stalled_mask,
+		intel_engine_mask_t stalled_mask,
 		const char *reason);
 int i915_reset_engine(struct intel_engine_cs *engine,
 		      const char *reason);
@@ -41,7 +43,8 @@ int i915_terminally_wedged(struct drm_i915_private *i915);
 bool intel_has_gpu_reset(struct drm_i915_private *i915);
 bool intel_has_reset_engine(struct drm_i915_private *i915);
 
-int intel_gpu_reset(struct drm_i915_private *i915, u32 engine_mask);
+int intel_gpu_reset(struct drm_i915_private *i915,
+		    intel_engine_mask_t engine_mask);
 
 int intel_reset_guc(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 9a1d257f3d6e..07d243acf553 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -8,92 +8,10 @@
 #define _I915_SCHEDULER_H_
 
 #include <linux/bitops.h>
+#include <linux/list.h>
 #include <linux/kernel.h>
 
-#include <uapi/drm/i915_drm.h>
-
-struct drm_i915_private;
-struct i915_request;
-struct intel_engine_cs;
-
-enum {
-	I915_PRIORITY_MIN = I915_CONTEXT_MIN_USER_PRIORITY - 1,
-	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
-	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
-
-	I915_PRIORITY_INVALID = INT_MIN
-};
-
-#define I915_USER_PRIORITY_SHIFT 3
-#define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
-
-#define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
-#define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
-
-#define I915_PRIORITY_WAIT		((u8)BIT(0))
-#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
-#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
-
-#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
-
-struct i915_sched_attr {
-	/**
-	 * @priority: execution and service priority
-	 *
-	 * All clients are equal, but some are more equal than others!
-	 *
-	 * Requests from a context with a greater (more positive) value of
-	 * @priority will be executed before those with a lower @priority
-	 * value, forming a simple QoS.
-	 *
-	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
-	 */
-	int priority;
-};
-
-/*
- * "People assume that time is a strict progression of cause to effect, but
- * actually, from a nonlinear, non-subjective viewpoint, it's more like a big
- * ball of wibbly-wobbly, timey-wimey ... stuff." -The Doctor, 2015
- *
- * Requests exist in a complex web of interdependencies. Each request
- * has to wait for some other request to complete before it is ready to be run
- * (e.g. we have to wait until the pixels have been rendering into a texture
- * before we can copy from it). We track the readiness of a request in terms
- * of fences, but we also need to keep the dependency tree for the lifetime
- * of the request (beyond the life of an individual fence). We use the tree
- * at various points to reorder the requests whilst keeping the requests
- * in order with respect to their various dependencies.
- *
- * There is no active component to the "scheduler". As we know the dependency
- * DAG of each request, we are able to insert it into a sorted queue when it
- * is ready, and are able to reorder its portion of the graph to accommodate
- * dynamic priority changes.
- */
-struct i915_sched_node {
-	struct list_head signalers_list; /* those before us, we depend upon */
-	struct list_head waiters_list; /* those after us, they depend upon us */
-	struct list_head link;
-	struct i915_sched_attr attr;
-	unsigned int flags;
-#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
-};
-
-struct i915_dependency {
-	struct i915_sched_node *signaler;
-	struct list_head signal_link;
-	struct list_head wait_link;
-	struct list_head dfs_link;
-	unsigned long flags;
-#define I915_DEPENDENCY_ALLOC BIT(0)
-};
-
-struct i915_priolist {
-	struct list_head requests[I915_PRIORITY_COUNT];
-	struct rb_node node;
-	unsigned long used;
-	int priority;
-};
+#include "i915_scheduler_types.h"
 
 #define priolist_for_each_request(it, plist, idx) \
 	for (idx = 0; idx < ARRAY_SIZE((plist)->requests); idx++) \
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
new file mode 100644
index 000000000000..5c94b3eb5c81
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -0,0 +1,98 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#ifndef _I915_SCHEDULER_TYPES_H_
+#define _I915_SCHEDULER_TYPES_H_
+
+#include <linux/list.h>
+#include <linux/rbtree.h>
+
+#include <uapi/drm/i915_drm.h>
+
+struct drm_i915_private;
+struct i915_request;
+struct intel_engine_cs;
+
+enum {
+	I915_PRIORITY_MIN = I915_CONTEXT_MIN_USER_PRIORITY - 1,
+	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
+	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
+
+	I915_PRIORITY_INVALID = INT_MIN
+};
+
+#define I915_USER_PRIORITY_SHIFT 3
+#define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
+
+#define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
+#define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
+
+#define I915_PRIORITY_WAIT		((u8)BIT(0))
+#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
+#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
+
+#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
+
+struct i915_sched_attr {
+	/**
+	 * @priority: execution and service priority
+	 *
+	 * All clients are equal, but some are more equal than others!
+	 *
+	 * Requests from a context with a greater (more positive) value of
+	 * @priority will be executed before those with a lower @priority
+	 * value, forming a simple QoS.
+	 *
+	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
+	 */
+	int priority;
+};
+
+/*
+ * "People assume that time is a strict progression of cause to effect, but
+ * actually, from a nonlinear, non-subjective viewpoint, it's more like a big
+ * ball of wibbly-wobbly, timey-wimey ... stuff." -The Doctor, 2015
+ *
+ * Requests exist in a complex web of interdependencies. Each request
+ * has to wait for some other request to complete before it is ready to be run
+ * (e.g. we have to wait until the pixels have been rendered into a texture
+ * before we can copy from it). We track the readiness of a request in terms
+ * of fences, but we also need to keep the dependency tree for the lifetime
+ * of the request (beyond the life of an individual fence). We use the tree
+ * at various points to reorder the requests whilst keeping the requests
+ * in order with respect to their various dependencies.
+ *
+ * There is no active component to the "scheduler". As we know the dependency
+ * DAG of each request, we are able to insert it into a sorted queue when it
+ * is ready, and are able to reorder its portion of the graph to accommodate
+ * dynamic priority changes.
+ */
+struct i915_sched_node {
+	struct list_head signalers_list; /* those before us, we depend upon */
+	struct list_head waiters_list; /* those after us, they depend upon us */
+	struct list_head link;
+	struct i915_sched_attr attr;
+	unsigned int flags;
+#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
+};
+
+struct i915_dependency {
+	struct i915_sched_node *signaler;
+	struct list_head signal_link;
+	struct list_head wait_link;
+	struct list_head dfs_link;
+	unsigned long flags;
+#define I915_DEPENDENCY_ALLOC BIT(0)
+};
+
+struct i915_priolist {
+	struct list_head requests[I915_PRIORITY_COUNT];
+	struct rb_node node;
+	unsigned long used;
+	int priority;
+};
+
+#endif /* _I915_SCHEDULER_TYPES_H_ */
diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
index c1e47a423d85..4ca7f80bdf6d 100644
--- a/drivers/gpu/drm/i915/i915_timeline.h
+++ b/drivers/gpu/drm/i915/i915_timeline.h
@@ -27,6 +27,7 @@
 
 #include <linux/lockdep.h>
 
+#include "i915_active.h"
 #include "i915_syncmap.h"
 #include "i915_timeline_types.h"
 
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index ccd7bea4a342..57c79830bb8c 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -9,9 +9,10 @@
 
 #include <linux/list.h>
 #include <linux/kref.h>
+#include <linux/mutex.h>
 #include <linux/types.h>
 
-#include "i915_active.h"
+#include "i915_active_types.h"
 
 struct drm_i915_private;
 struct i915_vma;
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 98acefaacec9..f77eae67c2cc 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -27,6 +27,7 @@
 
 #include <uapi/drm/i915_drm.h>
 
+#include "intel_engine_types.h"
 #include "intel_display.h"
 
 struct drm_printer;
@@ -150,8 +151,6 @@ struct sseu_dev_info {
 	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
 };
 
-typedef u8 intel_engine_mask_t;
-
 struct intel_device_info {
 	u16 gen_mask;
 
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 322fbda65190..4d526767c371 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -13,8 +13,10 @@
 #include <linux/list.h>
 #include <linux/types.h>
 
+#include "i915_gem.h"
+#include "i915_scheduler_types.h"
+#include "i915_selftest.h"
 #include "i915_timeline_types.h"
-#include "intel_device_info.h"
 #include "intel_workarounds_types.h"
 
 #include "i915_gem_batch_pool.h"
@@ -25,11 +27,15 @@
 
 #define I915_CMD_HASH_ORDER 9
 
+struct dma_fence;
 struct drm_i915_reg_table;
 struct i915_gem_context;
 struct i915_request;
 struct i915_sched_attr;
 
+typedef u8 intel_engine_mask_t;
+#define ALL_ENGINES ((intel_engine_mask_t)~0ul)
+
 struct intel_hw_status_page {
 	struct i915_vma *vma;
 	u32 *addr;
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index 57ed49dc19c4..4e3a7afa7540 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -221,8 +221,8 @@ static void hangcheck_declare_hang(struct drm_i915_private *i915,
 				   unsigned int stuck)
 {
 	struct intel_engine_cs *engine;
+	intel_engine_mask_t tmp;
 	char msg[80];
-	unsigned int tmp;
 	int len;
 
 	/* If some rings hung but others were still busy, only
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index e01d2a2855a0..28d19aa6a64e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1588,10 +1588,10 @@ static int igt_vm_isolation(void *arg)
 }
 
 static __maybe_unused const char *
-__engine_name(struct drm_i915_private *i915, unsigned int engines)
+__engine_name(struct drm_i915_private *i915, intel_engine_mask_t engines)
 {
 	struct intel_engine_cs *engine;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 
 	if (engines == ALL_ENGINES)
 		return "all";
@@ -1604,10 +1604,10 @@ __engine_name(struct drm_i915_private *i915, unsigned int engines)
 
 static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 					  struct i915_gem_context *ctx,
-					  unsigned int engines)
+					  intel_engine_mask_t engines)
 {
 	struct intel_engine_cs *engine;
-	unsigned int tmp;
+	intel_engine_mask_t tmp;
 	int pass;
 
 	GEM_TRACE("Testing %s\n", __engine_name(i915, engines));
diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
index f9899adfe12a..ab4b7eaae193 100644
--- a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
+++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
@@ -8,6 +8,7 @@
 
 #include "../i915_gem_context.h"
 #include "../i915_gem_pm.h"
+#include "../i915_request.h"
 #include "../intel_context.h"
 
 struct i915_request *
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 3c843edc2fe4..0d994d72661b 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -1132,7 +1132,8 @@ static int igt_reset_engines(void *arg)
 	return 0;
 }
 
-static u32 fake_hangcheck(struct drm_i915_private *i915, u32 mask)
+static u32 fake_hangcheck(struct drm_i915_private *i915,
+			  intel_engine_mask_t mask)
 {
 	u32 count = i915_reset_count(&i915->gpu_error);
 
diff --git a/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c b/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
new file mode 100644
index 000000000000..8afa2c3719fb
--- /dev/null
+++ b/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
@@ -0,0 +1,7 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_scheduler_types.h"
-- 
2.20.1


* [PATCH 21/22] drm/i915/execlists: Virtual engine bonding
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (19 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 20/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:04 ` [PATCH 22/22] drm/i915: Allow specification of parallel execbuf Chris Wilson
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

Some users require that when a master batch is executed on one particular
engine, a companion batch is run simultaneously on a specific slave
engine. For this purpose, we introduce virtual engine bonding, allowing
maps of master:slaves to be constructed to constrain which physical
engines a virtual engine may select given a fence on a master engine.

For the moment, we continue to ignore the issue of preemption deferring
the master request for later. Ideally, we would then also like to remove
the slave and run something else rather than have it stall the pipeline.
With load balancing, we should be able to move the workload around it, but
there is a similar stall on the master pipeline while it waits for the
slave to be executed. At the cost of more latency for the bonded request,
it may be interesting to launch both on their engines in lockstep.
(Bubbles abound.)

Opens: Also what about bonding an engine as its own master? It doesn't
break anything internally, so allow the silliness.
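
A hedged sketch of attaching a bond from userspace; the field names are
inferred from the get_user() calls in set_engines__bond() below, and the
index and mask values are purely illustrative:

	struct i915_context_engines_bond bond = {
		.base = { .name = I915_CONTEXT_ENGINES_EXT_BOND },
		.virtual_index = 0,  /* slot of the virtual engine in the map */
		.master_class = I915_ENGINE_CLASS_VIDEO,
		.master_instance = 0,
		.sibling_mask = 0x1, /* restrict execution to siblings[0] */
	};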

v2: Emancipate the bonds
v3: Couple in delayed scheduling for the selftests
v4: Handle invalid mutually exclusive bonding
v5: Mention what the uapi does
v6: s/nbond/num_bonds/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c       |  51 +++++
 drivers/gpu/drm/i915/i915_request.c           |   1 +
 drivers/gpu/drm/i915/i915_request.h           |   3 +
 drivers/gpu/drm/i915/intel_engine_types.h     |   7 +
 drivers/gpu/drm/i915/intel_lrc.c              | 153 +++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h              |   4 +
 drivers/gpu/drm/i915/selftests/intel_lrc.c    | 193 ++++++++++++++++++
 drivers/gpu/drm/i915/selftests/lib_sw_fence.c |   3 +
 include/uapi/drm/i915_drm.h                   |  33 +++
 9 files changed, 446 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 43b2aa2c1bb9..d3fa3bb5dab6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1563,8 +1563,59 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data)
 	return 0;
 }
 
+static int
+set_engines__bond(struct i915_user_extension __user *base, void *data)
+{
+	struct i915_context_engines_bond __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_engines *set = data;
+	unsigned int idx, class, instance;
+	struct intel_engine_cs *virtual;
+	struct intel_engine_cs *master;
+	u64 siblings;
+	int err;
+
+	if (get_user(idx, &ext->virtual_index))
+		return -EFAULT;
+
+	if (idx >= set->engines->num_engines)
+		return -EINVAL;
+
+	idx = array_index_nospec(idx, set->engines->num_engines);
+	if (!set->engines->engines[idx])
+		return -EINVAL;
+
+	/*
+	 * A non-virtual engine has 0 siblings to choose between, and the
+	 * submit fence will always be directed to the one engine.
+	 */
+	virtual = set->engines->engines[idx]->engine;
+	if (!intel_engine_is_virtual(virtual))
+		return 0;
+
+	err = check_user_mbz(&ext->mbz);
+	if (err)
+		return err;
+
+	if (get_user(class, &ext->master_class))
+		return -EFAULT;
+
+	if (get_user(instance, &ext->master_instance))
+		return -EFAULT;
+
+	master = intel_engine_lookup_user(set->ctx->i915, class, instance);
+	if (!master)
+		return -EINVAL;
+
+	if (get_user(siblings, &ext->sibling_mask))
+		return -EFAULT;
+
+	return intel_virtual_engine_attach_bond(virtual, master, siblings);
+}
+
 static const i915_user_extension_fn set_engines__extensions[] = {
 	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
+	[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond,
 };
 
 static int
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 6ed4f5efc8bf..a87fadc17996 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -695,6 +695,7 @@ struct i915_request *i915_request_create(struct intel_context *ce)
 	rq->batch = NULL;
 	rq->capture_list = NULL;
 	rq->waitboost = false;
+	rq->execution_mask = ALL_ENGINES;
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index adaabfcf97da..f4d07c762def 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -32,6 +32,8 @@
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
 
+#include "intel_engine_types.h"
+
 #include <uapi/drm/i915_drm.h>
 
 struct drm_file;
@@ -145,6 +147,7 @@ struct i915_request {
 	 */
 	struct i915_sched_node sched;
 	struct i915_dependency dep;
+	intel_engine_mask_t execution_mask;
 
 	/*
 	 * A convenience pointer to the current breadcrumb value stored in
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 4d526767c371..4c0f8ea1c546 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -390,6 +390,13 @@ struct intel_engine_cs {
 	 */
 	void		(*submit_request)(struct i915_request *rq);
 
+	/*
+	 * Called on signaling of a SUBMIT_FENCE, passing along the signaling
+	 * request down to the bonded pairs.
+	 */
+	void            (*bond_execute)(struct i915_request *rq,
+					struct dma_fence *signal);
+
 	/*
 	 * Call when the priority on a request has changed and it and its
 	 * dependencies may need rescheduling. Note the request itself may
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9ea94524567f..2967bd8e0c25 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -191,6 +191,18 @@ struct virtual_engine {
 		int prio;
 	} nodes[I915_NUM_ENGINES];
 
+	/*
+	 * Keep track of bonded pairs -- restrictions upon our selection of
+	 * physical engines any particular request may be submitted to.
+	 * If we receive a submit-fence from a master engine, we will only
+	 * use one of the sibling_mask physical engines.
+	 */
+	struct ve_bond {
+		struct intel_engine_cs *master;
+		intel_engine_mask_t sibling_mask;
+	} *bonds;
+	unsigned int num_bonds;
+
 	/* And finally, which physical engines this virtual engine maps onto. */
 	unsigned int count;
 	struct intel_engine_cs *siblings[0];
@@ -779,6 +791,9 @@ static bool virtual_matches(const struct virtual_engine *ve,
 {
 	const struct intel_engine_cs *active;
 
+	if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */
+		return false;
+
 	/*
 	 * We track when the HW has completed saving the context image
 	 * (i.e. when we have seen the final CS event switching out of
@@ -954,6 +969,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			rb_erase_cached(rb, &execlists->virtual);
 			RB_CLEAR_NODE(rb);
 
+			GEM_BUG_ON(!(rq->execution_mask & engine->mask));
 			rq->engine = engine;
 
 			if (engine != ve->siblings[0]) {
@@ -3146,6 +3162,8 @@ static void virtual_context_destroy(struct kref *kref)
 	if (ve->context.state)
 		__execlists_context_fini(&ve->context);
 
+	kfree(ve->bonds);
+
 	i915_timeline_fini(&ve->base.timeline);
 	kfree(ve);
 }
@@ -3197,17 +3215,44 @@ static const struct intel_context_ops virtual_context_ops = {
 	.destroy = virtual_context_destroy,
 };
 
+static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
+{
+	struct i915_request *rq;
+	intel_engine_mask_t mask;
+
+	rq = READ_ONCE(ve->request);
+	if (!rq)
+		return 0;
+
+	/* The rq is ready for submission; rq->execution_mask is now stable. */
+	mask = rq->execution_mask;
+	if (unlikely(!mask)) {
+		/* Invalid selection, submit to a random engine in error */
+		i915_request_skip(rq, -ENODEV);
+		mask = ve->siblings[0]->mask;
+	}
+
+	return mask;
+}
+
 static void virtual_submission_tasklet(unsigned long data)
 {
 	struct virtual_engine * const ve = (struct virtual_engine *)data;
 	const int prio = ve->base.execlists.queue_priority_hint;
+	intel_engine_mask_t mask;
 	unsigned int n;
 
-	GEM_TRACE("%s: rq=%llx:%lld, prio=%d\n",
+	rcu_read_lock();
+	mask = virtual_submission_mask(ve);
+	rcu_read_unlock();
+	if (unlikely(!mask))
+		return;
+
+	GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
 		  ve->base.name,
 		  ve->request->fence.context,
 		  ve->request->fence.seqno,
-		  prio);
+		  mask, prio);
 
 	local_irq_disable();
 	for (n = 0; READ_ONCE(ve->request) && n < ve->count; n++) {
@@ -3216,6 +3261,17 @@ static void virtual_submission_tasklet(unsigned long data)
 		struct rb_node **parent, *rb;
 		bool first;
 
+		if (unlikely(!(mask & sibling->mask))) {
+			if (!RB_EMPTY_NODE(&node->rb)) {
+				spin_lock(&sibling->timeline.lock);
+				rb_erase_cached(&node->rb,
+						&sibling->execlists.virtual);
+				RB_CLEAR_NODE(&node->rb);
+				spin_unlock(&sibling->timeline.lock);
+			}
+			continue;
+		}
+
 		spin_lock(&sibling->timeline.lock);
 
 		if (!RB_EMPTY_NODE(&node->rb)) {
@@ -3283,6 +3339,37 @@ static void virtual_submit_request(struct i915_request *rq)
 	tasklet_schedule(&ve->base.execlists.tasklet);
 }
 
+static struct ve_bond *
+virtual_find_bond(struct virtual_engine *ve, struct intel_engine_cs *master)
+{
+	int i;
+
+	for (i = 0; i < ve->num_bonds; i++) {
+		if (ve->bonds[i].master == master)
+			return &ve->bonds[i];
+	}
+
+	return NULL;
+}
+
+static void
+virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+{
+	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+	struct ve_bond *bond;
+
+	bond = virtual_find_bond(ve, to_request(signal)->engine);
+	if (bond) {
+		intel_engine_mask_t old, new, cmp;
+
+		cmp = READ_ONCE(rq->execution_mask);
+		do {
+			old = cmp;
+			new = cmp & bond->sibling_mask;
+		} while ((cmp = cmpxchg(&rq->execution_mask, old, new)) != old);
+	}
+}
+
 struct intel_context *
 intel_execlists_create_virtual(struct i915_gem_context *ctx,
 			       struct intel_engine_cs **siblings,
@@ -3321,6 +3408,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 
 	ve->base.schedule = i915_schedule;
 	ve->base.submit_request = virtual_submit_request;
+	ve->base.bond_execute = virtual_bond_execute;
 
 	ve->base.execlists.queue_priority_hint = INT_MIN;
 	tasklet_init(&ve->base.execlists.tasklet,
@@ -3409,9 +3497,70 @@ intel_execlists_clone_virtual(struct i915_gem_context *ctx,
 	if (IS_ERR(dst))
 		return dst;
 
+	if (se->num_bonds) {
+		struct virtual_engine *de = to_virtual_engine(dst->engine);
+
+		de->bonds = kmemdup(se->bonds,
+				    sizeof(*se->bonds) * se->num_bonds,
+				    GFP_KERNEL);
+		if (!de->bonds) {
+			intel_context_put(dst);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		de->num_bonds = se->num_bonds;
+	}
+
 	return dst;
 }
 
+static unsigned long
+virtual_sibling_mask(struct virtual_engine *ve, unsigned long mask)
+{
+	unsigned long emask = 0;
+	int bit;
+
+	for_each_set_bit(bit, &mask, ve->count)
+		emask |= ve->siblings[bit]->mask;
+
+	return emask;
+}
+
+int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
+				     struct intel_engine_cs *master,
+				     unsigned long mask)
+{
+	struct virtual_engine *ve = to_virtual_engine(engine);
+	struct ve_bond *bond;
+
+	if (mask >> ve->count)
+		return -EINVAL;
+
+	mask = virtual_sibling_mask(ve, mask);
+	if (!mask)
+		return -EINVAL;
+
+	bond = virtual_find_bond(ve, master);
+	if (bond) {
+		bond->sibling_mask |= mask;
+		return 0;
+	}
+
+	bond = krealloc(ve->bonds,
+			sizeof(*bond) * (ve->num_bonds + 1),
+			GFP_KERNEL);
+	if (!bond)
+		return -ENOMEM;
+
+	bond[ve->num_bonds].master = master;
+	bond[ve->num_bonds].sibling_mask = mask;
+
+	ve->bonds = bond;
+	ve->num_bonds++;
+
+	return 0;
+}
+
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 0a7d1ebbec2a..0487f329791a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -122,6 +122,10 @@ struct intel_context *
 intel_execlists_clone_virtual(struct i915_gem_context *ctx,
 			      struct intel_engine_cs *src);
 
+int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
+				     struct intel_engine_cs *master,
+				     unsigned long mask);
+
 u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 83bcb3ca6501..d4aa62920ba6 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -15,6 +15,7 @@
 #include "igt_live_test.h"
 #include "igt_spinner.h"
 #include "i915_random.h"
+#include "lib_sw_fence.h"
 
 #include "mock_context.h"
 
@@ -1298,6 +1299,197 @@ static int live_virtual_engine(void *arg)
 	return err;
 }
 
+static int bond_virtual_engine(struct drm_i915_private *i915,
+			       unsigned int class,
+			       struct intel_engine_cs **siblings,
+			       unsigned int nsibling,
+			       unsigned int flags)
+#define BOND_SCHEDULE BIT(0)
+{
+	struct intel_engine_cs *master;
+	struct i915_gem_context *ctx;
+	struct i915_request *rq[16];
+	enum intel_engine_id id;
+	unsigned long n;
+	int err;
+
+	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
+
+	ctx = kernel_context(i915);
+	if (!ctx)
+		return -ENOMEM;
+
+	err = 0;
+	rq[0] = ERR_PTR(-ENOMEM);
+	for_each_engine(master, i915, id) {
+		struct i915_sw_fence fence = {};
+
+		if (master->class == class)
+			continue;
+
+		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
+
+		rq[0] = igt_request_alloc(ctx, master);
+		if (IS_ERR(rq[0])) {
+			err = PTR_ERR(rq[0]);
+			goto out;
+		}
+		i915_request_get(rq[0]);
+
+		if (flags & BOND_SCHEDULE) {
+			onstack_fence_init(&fence);
+			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
+							       &fence,
+							       GFP_KERNEL);
+		}
+		i915_request_add(rq[0]);
+		if (err < 0)
+			goto out;
+
+		for (n = 0; n < nsibling; n++) {
+			struct intel_context *ve;
+
+			ve = intel_execlists_create_virtual(ctx,
+							    siblings,
+							    nsibling);
+			if (IS_ERR(ve)) {
+				err = PTR_ERR(ve);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			err = intel_virtual_engine_attach_bond(ve->engine,
+							       master,
+							       BIT(n));
+			if (err) {
+				intel_context_put(ve);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			err = intel_context_pin(ve);
+			intel_context_put(ve);
+			if (err) {
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			rq[n + 1] = i915_request_create(ve);
+			intel_context_unpin(ve);
+			if (IS_ERR(rq[n + 1])) {
+				err = PTR_ERR(rq[n + 1]);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+			i915_request_get(rq[n + 1]);
+
+			err = i915_request_await_execution(rq[n + 1],
+							   &rq[0]->fence,
+							   ve->engine->bond_execute);
+			i915_request_add(rq[n + 1]);
+			if (err < 0) {
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+		}
+		onstack_fence_fini(&fence);
+
+		if (i915_request_wait(rq[0],
+				      I915_WAIT_LOCKED,
+				      HZ / 10) < 0) {
+			pr_err("Master request did not execute (on %s)!\n",
+			       rq[0]->engine->name);
+			err = -EIO;
+			goto out;
+		}
+
+		for (n = 0; n < nsibling; n++) {
+			if (i915_request_wait(rq[n + 1],
+					      I915_WAIT_LOCKED,
+					      MAX_SCHEDULE_TIMEOUT) < 0) {
+				err = -EIO;
+				goto out;
+			}
+
+			if (rq[n + 1]->engine != siblings[n]) {
+				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
+				       siblings[n]->name,
+				       rq[n + 1]->engine->name,
+				       rq[0]->engine->name);
+				err = -EINVAL;
+				goto out;
+			}
+		}
+
+		for (n = 0; !IS_ERR(rq[n]); n++)
+			i915_request_put(rq[n]);
+		rq[0] = ERR_PTR(-ENOMEM);
+	}
+
+out:
+	for (n = 0; !IS_ERR(rq[n]); n++)
+		i915_request_put(rq[n]);
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	kernel_context_close(ctx);
+	return err;
+}
+
+static int live_virtual_bond(void *arg)
+{
+	static const struct phase {
+		const char *name;
+		unsigned int flags;
+	} phases[] = {
+		{ "", 0 },
+		{ "schedule", BOND_SCHEDULE },
+		{ },
+	};
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	unsigned int class, inst;
+	int err = 0;
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0;
+
+	i915_gem_unpark(i915);
+	mutex_lock(&i915->drm.struct_mutex);
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		const struct phase *p;
+		int nsibling;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!i915->engine_class[class][inst])
+				break;
+
+			GEM_BUG_ON(nsibling == ARRAY_SIZE(siblings));
+			siblings[nsibling++] = i915->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (p = phases; p->name; p++) {
+			err = bond_virtual_engine(i915,
+						  class, siblings, nsibling,
+						  p->flags);
+			if (err) {
+				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
+				       __func__, p->name, class, nsibling, err);
+				goto out_unlock;
+			}
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	i915_gem_park(i915);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -1310,6 +1502,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
 		SUBTEST(live_virtual_engine),
+		SUBTEST(live_virtual_bond),
 	};
 
 	if (!HAS_EXECLISTS(i915))
diff --git a/drivers/gpu/drm/i915/selftests/lib_sw_fence.c b/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
index 2bfa72c1654b..b976c12817c5 100644
--- a/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
+++ b/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
@@ -45,6 +45,9 @@ void __onstack_fence_init(struct i915_sw_fence *fence,
 
 void onstack_fence_fini(struct i915_sw_fence *fence)
 {
+	if (!fence->flags)
+		return;
+
 	i915_sw_fence_commit(fence);
 	i915_sw_fence_fini(fence);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 4f49a0c07ab7..619a0a256b4e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1532,6 +1532,10 @@ struct drm_i915_gem_context_param {
  * sized argument, will revert back to default settings.
  *
  * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
  */
 #define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
@@ -1626,9 +1630,38 @@ struct i915_context_engines_load_balance {
 	__u64 mbz64[4]; /* reserved for future use; must be zero */
 };
 
+/*
+ * i915_context_engines_bond:
+ *
+ * Constructs bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by specifying bonding pairs: for any given master engine, we will
+ * only execute on one of the corresponding siblings within the virtual engine.
+ *
+ * Executing a request in parallel on the master engine and a sibling requires
+ * coordination with an I915_EXEC_FENCE_SUBMIT.
+ */
+struct i915_context_engines_bond {
+	struct i915_user_extension base;
+
+	__u16 virtual_index; /* index of virtual engine in ctx->engines[] */
+	__u16 mbz;
+
+	__u16 master_class;
+	__u16 master_instance;
+
+	__u64 sibling_mask; /* bitmask of BIT(sibling_index) wrt the v.engine */
+	__u64 flags; /* all undefined flags must be zero */
+};
+
 struct i915_context_param_engines {
 	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
 #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0
+#define I915_CONTEXT_ENGINES_EXT_BOND 1
 
 	struct {
 		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread
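
For illustration, a minimal userspace sketch of wiring up the bond uapi
above: engines[0] is intended to be a virtual engine (made so by a
preceding I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE extension, omitted here)
and engines[1] is the rcs0 master. The helper, the engines[] declaration
and the reuse of struct i915_engine_class_instance are assumptions made
for the sketch, not part of the patch:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int bond_vcs_to_rcs0(int fd, uint32_t ctx_id)
{
	struct i915_context_engines_bond bond;
	struct {
		struct i915_context_param_engines base;
		struct i915_engine_class_instance engines[2];
	} map;
	struct drm_i915_gem_context_param arg;

	/* Bond: when the master (rcs0) submits, only sibling 0 of the
	 * virtual engine at engines[0] may be picked to run alongside. */
	memset(&bond, 0, sizeof(bond));
	bond.base.name = I915_CONTEXT_ENGINES_EXT_BOND;
	bond.virtual_index = 0;
	bond.master_class = I915_ENGINE_CLASS_RENDER;
	bond.master_instance = 0;
	bond.sibling_mask = 1ull << 0; /* BIT(sibling_index) wrt the v.engine */

	memset(&map, 0, sizeof(map));
	/* engines[0] would be made virtual over vcs0/vcs1 by chaining a
	 * load-balance extension ahead of the bond; shown here as plain
	 * vcs0 only as a placeholder. */
	map.engines[0].engine_class = I915_ENGINE_CLASS_VIDEO;
	map.engines[0].engine_instance = 0;
	map.engines[1].engine_class = I915_ENGINE_CLASS_RENDER;
	map.engines[1].engine_instance = 0;
	map.base.extensions = (uint64_t)(uintptr_t)&bond;

	memset(&arg, 0, sizeof(arg));
	arg.ctx_id = ctx_id;
	arg.param = I915_CONTEXT_PARAM_ENGINES;
	arg.size = sizeof(map);
	arg.value = (uint64_t)(uintptr_t)&map;

	return ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);
}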

* [PATCH 22/22] drm/i915: Allow specification of parallel execbuf
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (20 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 21/22] drm/i915/execlists: Virtual engine bonding Chris Wilson
@ 2019-03-25  9:04 ` Chris Wilson
  2019-03-25  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open() Patchwork
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-25  9:04 UTC (permalink / raw)
  To: intel-gfx

There is a desire to split a task onto two engines and have them run at
the same time, e.g. scanline interleaving to spread the workload evenly.
Through the use of the out-fence from the first execbuf, we can
coordinate the secondary execbuf to become ready only simultaneously with
the first, so that, with all engines idle, the second execbuf is executed
in parallel with the first. The key difference here between the new
EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence
waits for the completion of the first request (so that all of its
rendering results are visible to the second execbuf, the more common
userspace fence requirement).

Since we only have a single input fence slot, userspace cannot mix an
in-fence and a submit-fence. It has to use one or the other! This is not
such a harsh requirement, since by virtue of the submit-fence, the
secondary execbuf inherits all of the dependencies from the first
request, and for the application the dependencies should be common
between the primary and secondary execbuf.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            |  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 25 +++++++++++++++++++++-
 include/uapi/drm/i915_drm.h                | 17 ++++++++++++++-
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5465b99b4392..73fb4ce6d4a9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -426,6 +426,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_CAPTURE:
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
 	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
+	case I915_PARAM_HAS_EXEC_SUBMIT_FENCE:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c2b95cc0f41d..fd6a00479cc6 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2313,6 +2313,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 {
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
+	struct dma_fence *exec_fence = NULL;
 	struct sync_file *out_fence = NULL;
 	int out_fence_fd = -1;
 	int err;
@@ -2355,11 +2356,24 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			return -EINVAL;
 	}
 
+	if (args->flags & I915_EXEC_FENCE_SUBMIT) {
+		if (in_fence) {
+			err = -EINVAL;
+			goto err_in_fence;
+		}
+
+		exec_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
+		if (!exec_fence) {
+			err = -EINVAL;
+			goto err_in_fence;
+		}
+	}
+
 	if (args->flags & I915_EXEC_FENCE_OUT) {
 		out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
 		if (out_fence_fd < 0) {
 			err = out_fence_fd;
-			goto err_in_fence;
+			goto err_exec_fence;
 		}
 	}
 
@@ -2489,6 +2503,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			goto err_request;
 	}
 
+	if (exec_fence) {
+		err = i915_request_await_execution(eb.request, exec_fence,
+						   eb.engine->bond_execute);
+		if (err < 0)
+			goto err_request;
+	}
+
 	if (fences) {
 		err = await_fence_array(&eb, fences);
 		if (err)
@@ -2550,6 +2571,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_out_fence:
 	if (out_fence_fd != -1)
 		put_unused_fd(out_fence_fd);
+err_exec_fence:
+	dma_fence_put(exec_fence);
 err_in_fence:
 	dma_fence_put(in_fence);
 	return err;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 619a0a256b4e..6b446315c677 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -593,6 +593,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through the use of explicit fences.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1115,7 +1121,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) holds
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT		(1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread
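
For illustration, a userspace sketch of the submission protocol described
above: the first execbuf exports an out-fence, which the second consumes
as a submit-fence so that both start executing together. The helper is an
assumption made for the sketch; it relies on the existing convention that
I915_EXEC_FENCE_OUT returns the fd in the upper 32 bits of rsvd2, while
the new I915_EXEC_FENCE_SUBMIT reads the lower 32 bits, per the patch:

#include <stdint.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int execbuf_parallel(int fd,
			    struct drm_i915_gem_execbuffer2 *master,
			    struct drm_i915_gem_execbuffer2 *secondary)
{
	int out_fence, err;

	/* First execbuf: request an out-fence; the fd is returned in
	 * upper_32_bits(rsvd2). ioctl() returns -1 with errno on failure;
	 * error handling is elided for brevity. */
	master->flags |= I915_EXEC_FENCE_OUT;
	err = ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, master);
	if (err)
		return err;
	out_fence = master->rsvd2 >> 32;

	/* Second execbuf: gate submission (not completion) on the first,
	 * so both become ready and run simultaneously. */
	secondary->flags |= I915_EXEC_FENCE_SUBMIT;
	secondary->rsvd2 = (uint32_t)out_fence;
	err = ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, secondary);

	close(out_fence);
	return err;
}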

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (21 preceding siblings ...)
  2019-03-25  9:04 ` [PATCH 22/22] drm/i915: Allow specification of parallel execbuf Chris Wilson
@ 2019-03-25  9:19 ` Patchwork
  2019-03-25  9:28 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2019-03-25  9:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open()
URL   : https://patchwork.freedesktop.org/series/58517/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
a06b0f01cad0 drm/i915: Report the correct errno from i915_gem_context_open()
a1c4349dcef8 drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
8733fc778e94 drm/i915: Pull the GEM powermanagement coupling into its own file
-:417: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#417: 
new file mode 100644

-:422: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#422: FILE: drivers/gpu/drm/i915/i915_gem_pm.c:1:
+/*

-:423: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#423: FILE: drivers/gpu/drm/i915/i915_gem_pm.c:2:
+ * SPDX-License-Identifier: MIT

-:769: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#769: FILE: drivers/gpu/drm/i915/i915_gem_pm.h:1:
+/*

-:770: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#770: FILE: drivers/gpu/drm/i915/i915_gem_pm.h:2:
+ * SPDX-License-Identifier: MIT

-:803: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#803: FILE: drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c:1:
+/*

-:804: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#804: FILE: drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 7 warnings, 0 checks, 764 lines checked
81e689f91366 drm/i915: Guard unpark/park with the gt.active_mutex
-:62: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#62: FILE: drivers/gpu/drm/i915/i915_drv.h:2012:
+		struct mutex active_mutex;

-:206: WARNING:MEMORY_BARRIER: memory barrier without comment
#206: FILE: drivers/gpu/drm/i915/i915_gem_pm.c:195:
+		smp_mb__before_atomic();

total: 0 errors, 1 warnings, 1 checks, 320 lines checked
813a6beabd0d drm/i915/selftests: Take GEM runtime wakeref
00464cbffc7a drm/i915: Pass intel_context to i915_request_create()
eeda44334ce5 drm/i915/gvt: Pin the per-engine GVT shadow contexts
1b6f6cddd7c4 drm/i915: Explicitly pin the logical context for execbuf
fcd27603aec2 drm/i915: Export intel_context_instance()
3b7120afe774 drm/i915/selftests: Use the real kernel context for sseu isolation tests
b99793a93604 drm/i915/selftests: Pass around intel_context for sseu
995399ebe6ae drm/i915: Pass intel_context to intel_context_pin_lock()
dec01d2d4aa7 drm/i915: Split engine setup/init into two phases
a40d0de34728 drm/i915: Switch back to an array of logical per-engine HW contexts
-:340: WARNING:LINE_SPACING: Missing a blank line after declarations
#340: FILE: drivers/gpu/drm/i915/i915_gem_context.h:217:
+		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
+		if (likely(idx < e->num_engines && e->engines[idx]))

total: 0 errors, 1 warnings, 0 checks, 889 lines checked
8fcd1ab51436 drm/i915: Move i915_request_alloc into selftests/
-:262: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#262: 
new file mode 100644

-:267: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#267: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.c:1:
+/*

-:268: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#268: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.c:2:
+ * SPDX-License-Identifier: MIT

-:312: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#312: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.h:1:
+/*

-:313: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#313: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 5 warnings, 0 checks, 392 lines checked
206ad765f3d2 drm/i915: Allow a context to define its set of engines
-:402: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#402: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

-:402: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible side-effects?
#402: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

-:402: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'member' may be better as '(member)' to avoid precedence issues
#402: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

total: 0 errors, 0 warnings, 3 checks, 403 lines checked
266e99111426 drm/i915: Allow userspace to clone contexts on creation
-:183: ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#183: FILE: drivers/gpu/drm/i915/i915_gem_context.c:1935:
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y

total: 1 errors, 0 warnings, 0 checks, 235 lines checked
ed63d4481857 drm/i915: Load balancing across a virtual engine
3c8fb3d139bb drm/i915: Extend execution fence to support a callback
cd0e579a0cda drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
-:652: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#652: 
new file mode 100644

-:657: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#657: FILE: drivers/gpu/drm/i915/i915_scheduler_types.h:1:
+/*

-:658: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#658: FILE: drivers/gpu/drm/i915/i915_scheduler_types.h:2:
+ * SPDX-License-Identifier: MIT

-:912: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#912: FILE: drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c:1:
+/*

-:913: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#913: FILE: drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 5 warnings, 0 checks, 729 lines checked
dd039722b189 drm/i915/execlists: Virtual engine bonding
57e7c5c70ad5 drm/i915: Allow specification of parallel execbuf

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open()
  2019-03-25  9:03 ` [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open() Chris Wilson
@ 2019-03-25  9:24   ` Tvrtko Ursulin
  0 siblings, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-03-25  9:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> Fixup the errno as we adjusted the error path to receive the errno and
> not computed itself from ERR_PTR(ctx) anymore.
> 
> drivers/gpu/drm/i915/i915_gem_context.c:793 i915_gem_context_open() warn: passing a valid pointer to 'PTR_ERR'
> 
> Fixes: 3aa9945a528e ("drm/i915: Separate GEM context construction and registration to userspace")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index e6f594668245..25f267a03d3d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -709,7 +709,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>   	idr_destroy(&file_priv->context_idr);
>   	mutex_destroy(&file_priv->vm_idr_lock);
>   	mutex_destroy(&file_priv->context_idr_lock);
> -	return PTR_ERR(ctx);
> +	return err;
>   }
>   
>   void i915_gem_context_close(struct drm_file *file)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (22 preceding siblings ...)
  2019-03-25  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open() Patchwork
@ 2019-03-25  9:28 ` Patchwork
  2019-03-25  9:39 ` ✗ Fi.CI.BAT: failure " Patchwork
  2019-03-25 10:52 ` ctx->engines[rcs0, rcs0] Tvrtko Ursulin
  25 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2019-03-25  9:28 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open()
URL   : https://patchwork.freedesktop.org/series/58517/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Report the correct errno from i915_gem_context_open()
Okay!

Commit: drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
Okay!

Commit: drm/i915: Pull the GEM powermanagement coupling into its own file
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Guard unpark/park with the gt.active_mutex
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3585:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3586:16: warning: expression using sizeof(void)

Commit: drm/i915/selftests: Take GEM runtime wakeref
Okay!

Commit: drm/i915: Pass intel_context to i915_request_create()
Okay!

Commit: drm/i915/gvt: Pin the per-engine GVT shadow contexts
Okay!

Commit: drm/i915: Explicitly pin the logical context for execbuf
Okay!

Commit: drm/i915: Export intel_context_instance()
-O:drivers/gpu/drm/i915/intel_context.c:129:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
+drivers/gpu/drm/i915/intel_context.c:129:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
+drivers/gpu/drm/i915/intel_context.c:148:6: warning: context imbalance in 'intel_context_pin_unlock' - wrong count at exit

Commit: drm/i915/selftests: Use the real kernel context for sseu isolation tests
Okay!

Commit: drm/i915/selftests: Pass around intel_context for sseu
Okay!

Commit: drm/i915: Pass intel_context to intel_context_pin_lock()
-O:drivers/gpu/drm/i915/intel_context.c:129:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
-O:drivers/gpu/drm/i915/intel_context.c:148:6: warning: context imbalance in 'intel_context_pin_unlock' - wrong count at exit

Commit: drm/i915: Split engine setup/init into two phases
Okay!

Commit: drm/i915: Switch back to an array of logical per-engine HW contexts
+drivers/gpu/drm/i915/i915_gem_context.c:295:22: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.c:426:9: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:185:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/selftests/mock_context.c:48:9: warning: dereference of noderef expression
+drivers/gpu/drm/i915/selftests/mock_context.c:77:22: warning: dereference of noderef expression
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier '__builtin_add_overflow'
+./include/linux/overflow.h:287:13:    got void
+./include/linux/overflow.h:287:13: warning: call with no type!

Commit: drm/i915: Move i915_request_alloc into selftests/
-drivers/gpu/drm/i915/i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/selftests/../i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Allow a context to define its set of engines
+drivers/gpu/drm/i915/i915_gem_context.c:1636:14: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.c:1636:14: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.c:1641:13: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.c:1660:35: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:203:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:203:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_utils.h:84:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:84:13: error: undefined identifier '__builtin_mul_overflow'
+drivers/gpu/drm/i915/i915_utils.h:84:13:    got void
+drivers/gpu/drm/i915/i915_utils.h:84:13: warning: call with no type!
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: undefined identifier '__builtin_add_overflow'
+drivers/gpu/drm/i915/i915_utils.h:87:13:    got void
+drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/../i915_gem_context.h:216:46: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/selftests/../i915_gem_context.h:234:46: error: incompatible types in comparison expression (different address spaces)
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function <noident>
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function <noident>
+./include/linux/overflow.h:287:13:    got void

Commit: drm/i915: Allow userspace to clone contexts on creation
-drivers/gpu/drm/i915/i915_gem_context.c:1636:14: warning: dereference of noderef expression
-drivers/gpu/drm/i915/i915_gem_context.c:1636:14: warning: dereference of noderef expression
-drivers/gpu/drm/i915/i915_gem_context.c:1641:13: warning: dereference of noderef expression
-drivers/gpu/drm/i915/i915_gem_context.c:1660:35: warning: dereference of noderef expression
-drivers/gpu/drm/i915/i915_gem_context.c:306:22: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.c:1810:9: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.c:1936:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1937:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1938:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1939:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1940:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1941:17: error: bad integer constant expression
-drivers/gpu/drm/i915/i915_gem_context.h:203:16: warning: dereference of noderef expression
-drivers/gpu/drm/i915/i915_gem_context.h:203:16: warning: dereference of noderef expression
+drivers/gpu/drm/i915/i915_gem_context.h:203:16: error: incompatible types in comparison expression (different address spaces)
+drivers/gpu/drm/i915/i915_gem_context.h:203:16: error: incompatible types in comparison expression (different address spaces)
-drivers/gpu/drm/i915/i915_utils.h:84:13: warning: call with no type!
-drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1254:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1254:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:453:16: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:569:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:569:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:690:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:690:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/mock_context.c:48:9: warning: dereference of noderef expression
-drivers/gpu/drm/i915/selftests/mock_context.c:77:22: warning: dereference of noderef expression
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function <noident>
-./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function <noident>
-./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/overflow.h:287:13:    got void
-./include/linux/slab.h:664:13: warning: call with no type!

Commit: drm/i915: Load balancing across a virtual engine
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier '__builtin_add_overflow'
+./include/linux/overflow.h:287:13:    got void
+./include/linux/overflow.h:287:13: warning: call with no type!

Commit: drm/i915: Extend execution fence to support a callback
Okay!

Commit: drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3586:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3585:16: warning: expression using sizeof(void)

Commit: drm/i915/execlists: Virtual engine bonding
Okay!

Commit: drm/i915: Allow specification of parallel execbuf
Okay!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open()
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (23 preceding siblings ...)
  2019-03-25  9:28 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-03-25  9:39 ` Patchwork
  2019-03-25 10:52 ` ctx->engines[rcs0, rcs0] Tvrtko Ursulin
  25 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2019-03-25  9:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open()
URL   : https://patchwork.freedesktop.org/series/58517/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_5808 -> Patchwork_12588
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_12588 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_12588, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/58517/revisions/1/mbox/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_12588:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-ilk-650:         PASS -> INCOMPLETE
    - fi-ivb-3770:        PASS -> INCOMPLETE

  
Known issues
------------

  Here are the changes found in Patchwork_12588 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-kbl-guc:         PASS -> INCOMPLETE [fdo#108743]
    - fi-kbl-8809g:       PASS -> INCOMPLETE [fdo#103665] / [fdo#108126]
    - fi-icl-u3:          PASS -> INCOMPLETE [fdo#107713]
    - fi-elk-e7500:       PASS -> INCOMPLETE [fdo#103989]
    - fi-cfl-guc:         PASS -> INCOMPLETE [fdo#108126] / [fdo#108743]
    - fi-blb-e6850:       PASS -> INCOMPLETE [fdo#107718]
    - fi-cfl-8700k:       PASS -> INCOMPLETE [fdo#108126]

  
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#103989]: https://bugs.freedesktop.org/show_bug.cgi?id=103989
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#108126]: https://bugs.freedesktop.org/show_bug.cgi?id=108126
  [fdo#108743]: https://bugs.freedesktop.org/show_bug.cgi?id=108743


Participating hosts (36 -> 9)
------------------------------

  ERROR: It appears as if the changes made in Patchwork_12588 prevented too many machines from booting.

  Missing    (27): fi-skl-6770hq fi-bdw-gvtdvm fi-apl-guc fi-snb-2520m fi-bxt-j4205 fi-pnv-d510 fi-skl-lmem fi-byt-n2820 fi-skl-6600u fi-hsw-4770r fi-bxt-dsi fi-bsw-n3050 fi-bwr-2160 fi-kbl-7500u fi-hsw-4770 fi-skl-6700k2 fi-kbl-7567u fi-ilk-m540 fi-skl-gvtdvm fi-skl-guc fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-whl-u fi-kbl-x1275 fi-cfl-8109u fi-skl-iommu 


Build changes
-------------

    * Linux: CI_DRM_5808 -> Patchwork_12588

  CI_DRM_5808: 0053882c75f979b475f4d543c4b14cbbed218f38 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4902: 3d325bb211d8cd84c6862c9945185a937395cb44 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12588: 57e7c5c70ad561073658c2f565589e4aed191506 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

57e7c5c70ad5 drm/i915: Allow specification of parallel execbuf
dd039722b189 drm/i915/execlists: Virtual engine bonding
cd0e579a0cda drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
3c8fb3d139bb drm/i915: Extend execution fence to support a callback
ed63d4481857 drm/i915: Load balancing across a virtual engine
266e99111426 drm/i915: Allow userspace to clone contexts on creation
206ad765f3d2 drm/i915: Allow a context to define its set of engines
8fcd1ab51436 drm/i915: Move i915_request_alloc into selftests/
a40d0de34728 drm/i915: Switch back to an array of logical per-engine HW contexts
dec01d2d4aa7 drm/i915: Split engine setup/init into two phases
995399ebe6ae drm/i915: Pass intel_context to intel_context_pin_lock()
b99793a93604 drm/i915/selftests: Pass around intel_context for sseu
3b7120afe774 drm/i915/selftests: Use the real kernel context for sseu isolation tests
fcd27603aec2 drm/i915: Export intel_context_instance()
1b6f6cddd7c4 drm/i915: Explicitly pin the logical context for execbuf
eeda44334ce5 drm/i915/gvt: Pin the per-engine GVT shadow contexts
00464cbffc7a drm/i915: Pass intel_context to i915_request_create()
813a6beabd0d drm/i915/selftests: Take GEM runtime wakeref
81e689f91366 drm/i915: Guard unpark/park with the gt.active_mutex
8733fc778e94 drm/i915: Pull the GEM powermanagement coupling into its own file
a1c4349dcef8 drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
a06b0f01cad0 drm/i915: Report the correct errno from i915_gem_context_open()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12588/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: ctx->engines[rcs0, rcs0]
  2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
                   ` (24 preceding siblings ...)
  2019-03-25  9:39 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2019-03-25 10:52 ` Tvrtko Ursulin
  2019-03-26  9:34   ` Chris Wilson
  25 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-03-25 10:52 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> The headline change is the wholehearted decision to allow the user to
> establish an ctx->engines mapping of [rcs0, rcs0] to mean two logically
> distinct pipelines to the same engine. An example of this use case is in
> iris which constructs two GEM contexts in order to have two distinct
> logical ccontexts to create both a render pipeline and a compute pipeline
> within the same GL context. As it it the same GL context, it should be
> a singular timeline and benefits from a shared ppgtt; a natural way of
> constructing a GEM counterpart to this composite GL context is with an
> engine map of [rcs0, rcs0] and single timeline.
> 
> One corollary to this is that given an ctx->engines[], we cannot assume
> that (engine_class, engine_instance) is enough to adequately identify a
> unique logical context. I propose that given user ctx->engines[], we
> force all subsequent user engine lookups to use indices. Thoughts?

I guess we need to hear from Mesa - is one GEM context a big win there, 
as opposed to two contexts with a shared PPGTT? If a single timeline is 
desired between the two it might be, but I don't know.

Having seen the number of new patches, I am naturally averse to 
having to review them just to get back to Virtual Engine.

Is there a way to defer the new feature request until after Media 
Scalability? It sounds like there should be - we could start by 
disallowing [rcs0, rcs0] and then allow it later.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: ctx->engines[rcs0, rcs0]
  2019-03-25 10:52 ` ctx->engines[rcs0, rcs0] Tvrtko Ursulin
@ 2019-03-26  9:34   ` Chris Wilson
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-03-26  9:34 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-25 10:52:07)
> 
> On 25/03/2019 09:03, Chris Wilson wrote:
> > The headline change is the wholehearted decision to allow the user to
> > establish an ctx->engines mapping of [rcs0, rcs0] to mean two logically
> > distinct pipelines to the same engine. An example of this use case is in
> > iris which constructs two GEM contexts in order to have two distinct
> > logical ccontexts to create both a render pipeline and a compute pipeline
> > within the same GL context. As it it the same GL context, it should be
> > a singular timeline and benefits from a shared ppgtt; a natural way of
> > constructing a GEM counterpart to this composite GL context is with an
> > engine map of [rcs0, rcs0] and single timeline.
> > 
> > One corollary to this is that given an ctx->engines[], we cannot assume
> > that (engine_class, engine_instance) is enough to adequately identify a
> > unique logical context. I propose that given user ctx->engines[], we
> > force all subsequent user engine lookups to use indices. Thoughts?
> 
> I guess we need to hear from Mesa - is one GEM context a big win there, 
> as opposed to two contexts with shared PPGTT? If single timeline is 
> desired between the two it might be, but I don't know.
> 
> Having seen the amount of new patches I am naturally averse towards 
> having to review them just to get back to Virtual Engine.
> 
> Is there a way to decouple the new feature request to after Media 
> Scalability? It sounds that there should be - we could start with 
> disallowing [rcs0, rcs0] and then allow it later.

It's not strictly a feature either; but an artifact of the virtual
engine implementation is that it doesn't work with a single engine at
present. So the single engine instance is the virtual engine... That idea
of replacing the degenerate veng with just a pointer to the physical
engine had relevance after all ;)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread
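
To make the proposal concrete, a sketch (assuming the engines uapi from
this series, with the engines[] declaration an assumption as before) of
the [rcs0, rcs0] map under discussion: two logically distinct contexts
backed by the same physical engine, with subsequent engine lookups done
by index since (class, instance) is no longer unique:

struct {
	struct i915_context_param_engines base;
	struct i915_engine_class_instance engines[2];
} map = {
	.engines = {
		{ I915_ENGINE_CLASS_RENDER, 0 }, /* index 0: render pipeline  */
		{ I915_ENGINE_CLASS_RENDER, 0 }, /* index 1: compute pipeline */
	},
};
/* Set via I915_CONTEXT_PARAM_ENGINES on a context created with a single
 * timeline, so both indices share one ppgtt and one timeline. */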

* Re: [PATCH 03/22] drm/i915: Pull the GEM powermanagement coupling into its own file
  2019-03-25  9:03 ` [PATCH 03/22] drm/i915: Pull the GEM powermanagement coupling into its own file Chris Wilson
@ 2019-04-01 14:56   ` Tvrtko Ursulin
  2019-04-01 15:39   ` Lucas De Marchi
  1 sibling, 0 replies; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-04-01 14:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> Split out the powermanagement portion (GT wakeref, suspend/resume) of
> GEM from i915_gem.c into its own file.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   2 +
>   drivers/gpu/drm/i915/i915_gem.c               | 335 +----------------
>   drivers/gpu/drm/i915/i915_gem_pm.c            | 341 ++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_pm.h            |  28 ++
>   .../drm/i915/test_i915_gem_pm_standalone.c    |   7 +
>   5 files changed, 381 insertions(+), 332 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.c
>   create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.h
>   create mode 100644 drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 60de05f3fa60..bd1657c3d395 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -61,6 +61,7 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>   i915-$(CONFIG_DRM_I915_WERROR) += \
>   	test_i915_active_types_standalone.o \
>   	test_i915_gem_context_types_standalone.o \
> +	test_i915_gem_pm_standalone.o \
>   	test_i915_timeline_types_standalone.o \
>   	test_intel_context_types_standalone.o \
>   	test_intel_engine_types_standalone.o \
> @@ -81,6 +82,7 @@ i915-y += \
>   	  i915_gem_internal.o \
>   	  i915_gem.o \
>   	  i915_gem_object.o \
> +	  i915_gem_pm.o \
>   	  i915_gem_render_state.o \
>   	  i915_gem_shrinker.o \
>   	  i915_gem_stolen.o \
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f6cdd5fb9deb..47c672432594 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -42,7 +42,7 @@
>   #include "i915_drv.h"
>   #include "i915_gem_clflush.h"
>   #include "i915_gemfs.h"
> -#include "i915_globals.h"
> +#include "i915_gem_pm.h"
>   #include "i915_reset.h"
>   #include "i915_trace.h"
>   #include "i915_vgpu.h"
> @@ -101,105 +101,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
>   	spin_unlock(&dev_priv->mm.object_stat_lock);
>   }
>   
> -static void __i915_gem_park(struct drm_i915_private *i915)
> -{
> -	intel_wakeref_t wakeref;
> -
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> -
> -	if (!i915->gt.awake)
> -		return;
> -
> -	/*
> -	 * Be paranoid and flush a concurrent interrupt to make sure
> -	 * we don't reactivate any irq tasklets after parking.
> -	 *
> -	 * FIXME: Note that even though we have waited for execlists to be idle,
> -	 * there may still be an in-flight interrupt even though the CSB
> -	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> -	 * is completed before we continue, but it doesn't prevent the HW from
> -	 * raising a spurious interrupt later. To complete the shield we should
> -	 * coordinate disabling the CS irq with flushing the interrupts.
> -	 */
> -	synchronize_irq(i915->drm.irq);
> -
> -	intel_engines_park(i915);
> -	i915_timelines_park(i915);
> -
> -	i915_pmu_gt_parked(i915);
> -	i915_vma_parked(i915);
> -
> -	wakeref = fetch_and_zero(&i915->gt.awake);
> -	GEM_BUG_ON(!wakeref);
> -
> -	if (INTEL_GEN(i915) >= 6)
> -		gen6_rps_idle(i915);
> -
> -	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> -
> -	i915_globals_park();
> -}
> -
> -void i915_gem_park(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -
> -	if (!i915->gt.awake)
> -		return;
> -
> -	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> -	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> -}
> -
> -void i915_gem_unpark(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(!i915->gt.active_requests);
> -	assert_rpm_wakelock_held(i915);
> -
> -	if (i915->gt.awake)
> -		return;
> -
> -	/*
> -	 * It seems that the DMC likes to transition between the DC states a lot
> -	 * when there are no connected displays (no active power domains) during
> -	 * command submission.
> -	 *
> -	 * This activity has negative impact on the performance of the chip with
> -	 * huge latencies observed in the interrupt handler and elsewhere.
> -	 *
> -	 * Work around it by grabbing a GT IRQ power domain whilst there is any
> -	 * GT activity, preventing any DC state transitions.
> -	 */
> -	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> -	GEM_BUG_ON(!i915->gt.awake);
> -
> -	i915_globals_unpark();
> -
> -	intel_enable_gt_powersave(i915);
> -	i915_update_gfx_val(i915);
> -	if (INTEL_GEN(i915) >= 6)
> -		gen6_rps_busy(i915);
> -	i915_pmu_gt_unparked(i915);
> -
> -	intel_engines_unpark(i915);
> -
> -	i915_queue_hangcheck(i915);
> -
> -	queue_delayed_work(i915->wq,
> -			   &i915->gt.retire_work,
> -			   round_jiffies_up_relative(HZ));
> -}
> -
>   int
>   i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>   			    struct drm_file *file)
> @@ -2874,108 +2775,6 @@ i915_gem_retire_work_handler(struct work_struct *work)
>   				   round_jiffies_up_relative(HZ));
>   }
>   
> -static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> -					  unsigned long mask)
> -{
> -	bool result = true;
> -
> -	/*
> -	 * Even if we fail to switch, give whatever is running a small chance
> -	 * to save itself before we report the failure. Yes, this may be a
> -	 * false positive due to e.g. ENOMEM, caveat emptor!
> -	 */
> -	if (i915_gem_switch_to_kernel_context(i915, mask))
> -		result = false;
> -
> -	if (i915_gem_wait_for_idle(i915,
> -				   I915_WAIT_LOCKED |
> -				   I915_WAIT_FOR_IDLE_BOOST,
> -				   I915_GEM_IDLE_TIMEOUT))
> -		result = false;
> -
> -	if (!result) {
> -		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> -			dev_err(i915->drm.dev,
> -				"Failed to idle engines, declaring wedged!\n");
> -			GEM_TRACE_DUMP();
> -		}
> -
> -		/* Forcibly cancel outstanding work and leave the gpu quiet. */
> -		i915_gem_set_wedged(i915);
> -	}
> -
> -	i915_retire_requests(i915); /* ensure we flush after wedging */
> -	return result;
> -}
> -
> -static bool load_power_context(struct drm_i915_private *i915)
> -{
> -	/* Force loading the kernel context on all engines */
> -	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> -		return false;
> -
> -	/*
> -	 * Immediately park the GPU so that we enable powersaving and
> -	 * treat it as idle. The next time we issue a request, we will
> -	 * unpark and start using the engine->pinned_default_state, otherwise
> -	 * it is in limbo and an early reset may fail.
> -	 */
> -	__i915_gem_park(i915);
> -
> -	return true;
> -}
> -
> -static void
> -i915_gem_idle_work_handler(struct work_struct *work)
> -{
> -	struct drm_i915_private *i915 =
> -		container_of(work, typeof(*i915), gt.idle_work.work);
> -	bool rearm_hangcheck;
> -
> -	if (!READ_ONCE(i915->gt.awake))
> -		return;
> -
> -	if (READ_ONCE(i915->gt.active_requests))
> -		return;
> -
> -	rearm_hangcheck =
> -		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> -
> -	if (!mutex_trylock(&i915->drm.struct_mutex)) {
> -		/* Currently busy, come back later */
> -		mod_delayed_work(i915->wq,
> -				 &i915->gt.idle_work,
> -				 msecs_to_jiffies(50));
> -		goto out_rearm;
> -	}
> -
> -	/*
> -	 * Flush out the last user context, leaving only the pinned
> -	 * kernel context resident. Should anything unfortunate happen
> -	 * while we are idle (such as the GPU being power cycled), no users
> -	 * will be harmed.
> -	 */
> -	if (!work_pending(&i915->gt.idle_work.work) &&
> -	    !i915->gt.active_requests) {
> -		++i915->gt.active_requests; /* don't requeue idle */
> -
> -		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -		if (!--i915->gt.active_requests) {
> -			__i915_gem_park(i915);
> -			rearm_hangcheck = false;
> -		}
> -	}
> -
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
> -out_rearm:
> -	if (rearm_hangcheck) {
> -		GEM_BUG_ON(!i915->gt.awake);
> -		i915_queue_hangcheck(i915);
> -	}
> -}
> -
>   void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
>   {
>   	struct drm_i915_private *i915 = to_i915(gem->dev);
> @@ -4390,133 +4189,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
>   	mutex_unlock(&i915->drm.struct_mutex);
>   }
>   
> -void i915_gem_suspend(struct drm_i915_private *i915)
> -{
> -	intel_wakeref_t wakeref;
> -
> -	GEM_TRACE("\n");
> -
> -	wakeref = intel_runtime_pm_get(i915);
> -
> -	flush_workqueue(i915->wq);
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -
> -	/*
> -	 * We have to flush all the executing contexts to main memory so
> -	 * that they can be saved in the hibernation image. To ensure the last
> -	 * context image is coherent, we have to switch away from it. That
> -	 * leaves the i915->kernel_context still active when
> -	 * we actually suspend, and its image in memory may not match the GPU
> -	 * state. Fortunately, the kernel_context is disposable and we do
> -	 * not rely on its state.
> -	 */
> -	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -	mutex_unlock(&i915->drm.struct_mutex);
> -	i915_reset_flush(i915);
> -
> -	drain_delayed_work(&i915->gt.retire_work);
> -
> -	/*
> -	 * As the idle_work rearms itself if it detects a race, play safe and
> -	 * repeat the flush until it is definitely idle.
> -	 */
> -	drain_delayed_work(&i915->gt.idle_work);
> -
> -	/*
> -	 * Assert that we successfully flushed all the work and
> -	 * reset the GPU back to its idle, low power state.
> -	 */
> -	GEM_BUG_ON(i915->gt.awake);
> -
> -	intel_uc_suspend(i915);
> -
> -	intel_runtime_pm_put(i915, wakeref);
> -}
> -
> -void i915_gem_suspend_late(struct drm_i915_private *i915)
> -{
> -	struct drm_i915_gem_object *obj;
> -	struct list_head *phases[] = {
> -		&i915->mm.unbound_list,
> -		&i915->mm.bound_list,
> -		NULL
> -	}, **phase;
> -
> -	/*
> -	 * Neither the BIOS, ourselves nor any other kernel
> -	 * expects the system to be in execlists mode on startup,
> -	 * so we need to reset the GPU back to legacy mode. And the only
> -	 * known way to disable logical contexts is through a GPU reset.
> -	 *
> -	 * So in order to leave the system in a known default configuration,
> -	 * always reset the GPU upon unload and suspend. Afterwards we then
> -	 * clean up the GEM state tracking, flushing off the requests and
> -	 * leaving the system in a known idle state.
> -	 *
> -	 * Note that it is of the utmost importance that the GPU is idle and
> -	 * all stray writes are flushed *before* we dismantle the backing
> -	 * storage for the pinned objects.
> -	 *
> -	 * However, since we are uncertain that resetting the GPU on older
> -	 * machines is a good idea, we don't - just in case it leaves the
> -	 * machine in an unusable condition.
> -	 */
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -	for (phase = phases; *phase; phase++) {
> -		list_for_each_entry(obj, *phase, mm.link)
> -			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> -	}
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
> -	intel_uc_sanitize(i915);
> -	i915_gem_sanitize(i915);
> -}
> -
> -void i915_gem_resume(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	WARN_ON(i915->gt.awake);
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> -
> -	i915_gem_restore_gtt_mappings(i915);
> -	i915_gem_restore_fences(i915);
> -
> -	/*
> -	 * As we didn't flush the kernel context before suspend, we cannot
> -	 * guarantee that the context image is complete. So let's just reset
> -	 * it and start again.
> -	 */
> -	i915->gt.resume(i915);
> -
> -	if (i915_gem_init_hw(i915))
> -		goto err_wedged;
> -
> -	intel_uc_resume(i915);
> -
> -	/* Always reload a context for powersaving. */
> -	if (!load_power_context(i915))
> -		goto err_wedged;
> -
> -out_unlock:
> -	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> -	mutex_unlock(&i915->drm.struct_mutex);
> -	return;
> -
> -err_wedged:
> -	if (!i915_reset_failed(i915)) {
> -		dev_err(i915->drm.dev,
> -			"Failed to re-initialize GPU, declaring it wedged!\n");
> -		i915_gem_set_wedged(i915);
> -	}
> -	goto out_unlock;
> -}
> -
>   void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
>   {
>   	if (INTEL_GEN(dev_priv) < 5 ||
> @@ -4699,7 +4371,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	}
>   
>   	/* Flush the default context image to memory, and enable powersaving. */
> -	if (!load_power_context(i915)) {
> +	if (!i915_gem_load_power_context(i915)) {
>   		err = -EIO;
>   		goto err_active;
>   	}
> @@ -5096,11 +4768,10 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
>   	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
>   
>   	i915_gem_init__mm(dev_priv);
> +	i915_gem_init__pm(dev_priv);
>   
>   	INIT_DELAYED_WORK(&dev_priv->gt.retire_work,
>   			  i915_gem_retire_work_handler);
> -	INIT_DELAYED_WORK(&dev_priv->gt.idle_work,
> -			  i915_gem_idle_work_handler);
>   	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
>   	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
>   	mutex_init(&dev_priv->gpu_error.wedge_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> new file mode 100644
> index 000000000000..faa4eb42ec0a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -0,0 +1,341 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "i915_gem_pm.h"
> +#include "i915_globals.h"
> +
> +static void __i915_gem_park(struct drm_i915_private *i915)
> +{
> +	intel_wakeref_t wakeref;
> +
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(i915->gt.active_requests);
> +	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> +
> +	if (!i915->gt.awake)
> +		return;
> +
> +	/*
> +	 * Be paranoid and flush a concurrent interrupt to make sure
> +	 * we don't reactivate any irq tasklets after parking.
> +	 *
> +	 * FIXME: Note that even though we have waited for execlists to be idle,
> +	 * there may still be an in-flight interrupt even though the CSB
> +	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> +	 * is completed before we continue, but it doesn't prevent the HW from
> +	 * raising a spurious interrupt later. To complete the shield we should
> +	 * coordinate disabling the CS irq with flushing the interrupts.
> +	 */
> +	synchronize_irq(i915->drm.irq);
> +
> +	intel_engines_park(i915);
> +	i915_timelines_park(i915);
> +
> +	i915_pmu_gt_parked(i915);
> +	i915_vma_parked(i915);
> +
> +	wakeref = fetch_and_zero(&i915->gt.awake);
> +	GEM_BUG_ON(!wakeref);
> +
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_idle(i915);
> +
> +	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> +
> +	i915_globals_park();
> +}
> +
> +static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> +					  unsigned long mask)
> +{
> +	bool result = true;
> +
> +	/*
> +	 * Even if we fail to switch, give whatever is running a small chance
> +	 * to save itself before we report the failure. Yes, this may be a
> +	 * false positive due to e.g. ENOMEM, caveat emptor!
> +	 */
> +	if (i915_gem_switch_to_kernel_context(i915, mask))
> +		result = false;
> +
> +	if (i915_gem_wait_for_idle(i915,
> +				   I915_WAIT_LOCKED |
> +				   I915_WAIT_FOR_IDLE_BOOST,
> +				   I915_GEM_IDLE_TIMEOUT))
> +		result = false;
> +
> +	if (!result) {
> +		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> +			dev_err(i915->drm.dev,
> +				"Failed to idle engines, declaring wedged!\n");
> +			GEM_TRACE_DUMP();
> +		}
> +
> +		/* Forcibly cancel outstanding work and leave the gpu quiet. */
> +		i915_gem_set_wedged(i915);
> +	}
> +
> +	i915_retire_requests(i915); /* ensure we flush after wedging */
> +	return result;
> +}
> +
> +static void idle_work_handler(struct work_struct *work)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(work, typeof(*i915), gt.idle_work.work);
> +	bool rearm_hangcheck;
> +
> +	if (!READ_ONCE(i915->gt.awake))
> +		return;
> +
> +	if (READ_ONCE(i915->gt.active_requests))
> +		return;
> +
> +	rearm_hangcheck =
> +		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> +
> +	if (!mutex_trylock(&i915->drm.struct_mutex)) {
> +		/* Currently busy, come back later */
> +		mod_delayed_work(i915->wq,
> +				 &i915->gt.idle_work,
> +				 msecs_to_jiffies(50));
> +		goto out_rearm;
> +	}
> +
> +	/*
> +	 * Flush out the last user context, leaving only the pinned
> +	 * kernel context resident. Should anything unfortunate happen
> +	 * while we are idle (such as the GPU being power cycled), no users
> +	 * will be harmed.
> +	 */
> +	if (!work_pending(&i915->gt.idle_work.work) &&
> +	    !i915->gt.active_requests) {
> +		++i915->gt.active_requests; /* don't requeue idle */
> +
> +		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +		if (!--i915->gt.active_requests) {
> +			__i915_gem_park(i915);
> +			rearm_hangcheck = false;
> +		}
> +	}
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +out_rearm:
> +	if (rearm_hangcheck) {
> +		GEM_BUG_ON(!i915->gt.awake);
> +		i915_queue_hangcheck(i915);
> +	}
> +}
> +
> +void i915_gem_park(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(i915->gt.active_requests);
> +
> +	if (!i915->gt.awake)
> +		return;
> +
> +	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> +	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> +}
> +
> +void i915_gem_unpark(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(!i915->gt.active_requests);
> +	assert_rpm_wakelock_held(i915);
> +
> +	if (i915->gt.awake)
> +		return;
> +
> +	/*
> +	 * It seems that the DMC likes to transition between the DC states a lot
> +	 * when there are no connected displays (no active power domains) during
> +	 * command submission.
> +	 *
> +	 * This activity has negative impact on the performance of the chip with
> +	 * huge latencies observed in the interrupt handler and elsewhere.
> +	 *
> +	 * Work around it by grabbing a GT IRQ power domain whilst there is any
> +	 * GT activity, preventing any DC state transitions.
> +	 */
> +	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +	GEM_BUG_ON(!i915->gt.awake);
> +
> +	i915_globals_unpark();
> +
> +	intel_enable_gt_powersave(i915);
> +	i915_update_gfx_val(i915);
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_busy(i915);
> +	i915_pmu_gt_unparked(i915);
> +
> +	intel_engines_unpark(i915);
> +
> +	i915_queue_hangcheck(i915);
> +
> +	queue_delayed_work(i915->wq,
> +			   &i915->gt.retire_work,
> +			   round_jiffies_up_relative(HZ));
> +}
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915)
> +{
> +	/* Force loading the kernel context on all engines */
> +	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> +		return false;
> +
> +	/*
> +	 * Immediately park the GPU so that we enable powersaving and
> +	 * treat it as idle. The next time we issue a request, we will
> +	 * unpark and start using the engine->pinned_default_state, otherwise
> +	 * it is in limbo and an early reset may fail.
> +	 */
> +	__i915_gem_park(i915);
> +
> +	return true;
> +}
> +
> +void i915_gem_suspend(struct drm_i915_private *i915)
> +{
> +	intel_wakeref_t wakeref;
> +
> +	GEM_TRACE("\n");
> +
> +	wakeref = intel_runtime_pm_get(i915);
> +
> +	flush_workqueue(i915->wq);
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +
> +	/*
> +	 * We have to flush all the executing contexts to main memory so
> +	 * that they can be saved in the hibernation image. To ensure the last
> +	 * context image is coherent, we have to switch away from it. That
> +	 * leaves the i915->kernel_context still active when
> +	 * we actually suspend, and its image in memory may not match the GPU
> +	 * state. Fortunately, the kernel_context is disposable and we do
> +	 * not rely on its state.
> +	 */
> +	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_reset_flush(i915);
> +
> +	drain_delayed_work(&i915->gt.retire_work);
> +
> +	/*
> +	 * As the idle_work rearms itself if it detects a race, play safe and
> +	 * repeat the flush until it is definitely idle.
> +	 */
> +	drain_delayed_work(&i915->gt.idle_work);
> +
> +	/*
> +	 * Assert that we successfully flushed all the work and
> +	 * reset the GPU back to its idle, low power state.
> +	 */
> +	GEM_BUG_ON(i915->gt.awake);
> +
> +	intel_uc_suspend(i915);
> +
> +	intel_runtime_pm_put(i915, wakeref);
> +}
> +
> +void i915_gem_suspend_late(struct drm_i915_private *i915)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct list_head *phases[] = {
> +		&i915->mm.unbound_list,
> +		&i915->mm.bound_list,
> +		NULL
> +	}, **phase;
> +
> +	/*
> +	 * Neither the BIOS, ourselves nor any other kernel
> +	 * expects the system to be in execlists mode on startup,
> +	 * so we need to reset the GPU back to legacy mode. And the only
> +	 * known way to disable logical contexts is through a GPU reset.
> +	 *
> +	 * So in order to leave the system in a known default configuration,
> +	 * always reset the GPU upon unload and suspend. Afterwards we then
> +	 * clean up the GEM state tracking, flushing off the requests and
> +	 * leaving the system in a known idle state.
> +	 *
> +	 * Note that it is of the utmost importance that the GPU is idle and
> +	 * all stray writes are flushed *before* we dismantle the backing
> +	 * storage for the pinned objects.
> +	 *
> +	 * However, since we are uncertain that resetting the GPU on older
> +	 * machines is a good idea, we don't - just in case it leaves the
> +	 * machine in an unusable condition.
> +	 */
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	for (phase = phases; *phase; phase++) {
> +		list_for_each_entry(obj, *phase, mm.link)
> +			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> +	}
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +	intel_uc_sanitize(i915);
> +	i915_gem_sanitize(i915);
> +}
> +
> +void i915_gem_resume(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	WARN_ON(i915->gt.awake);
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> +
> +	i915_gem_restore_gtt_mappings(i915);
> +	i915_gem_restore_fences(i915);
> +
> +	/*
> +	 * As we didn't flush the kernel context before suspend, we cannot
> +	 * guarantee that the context image is complete. So let's just reset
> +	 * it and start again.
> +	 */
> +	i915->gt.resume(i915);
> +
> +	if (i915_gem_init_hw(i915))
> +		goto err_wedged;
> +
> +	intel_uc_resume(i915);
> +
> +	/* Always reload a context for powersaving. */
> +	if (!i915_gem_load_power_context(i915))
> +		goto err_wedged;
> +
> +out_unlock:
> +	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	return;
> +
> +err_wedged:
> +	if (!i915_reset_failed(i915)) {
> +		dev_err(i915->drm.dev,
> +			"Failed to re-initialize GPU, declaring it wedged!\n");
> +		i915_gem_set_wedged(i915);
> +	}
> +	goto out_unlock;
> +}
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915)
> +{
> +	INIT_DELAYED_WORK(&i915->gt.idle_work, idle_work_handler);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
> new file mode 100644
> index 000000000000..52f65e3f06b5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.h
> @@ -0,0 +1,28 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_PM_H__
> +#define __I915_GEM_PM_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_private;
> +struct work_struct;
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915);
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915);
> +void i915_gem_resume(struct drm_i915_private *i915);
> +
> +void i915_gem_unpark(struct drm_i915_private *i915);
> +void i915_gem_park(struct drm_i915_private *i915);
> +
> +void i915_gem_idle_work_handler(struct work_struct *work);
> +
> +void i915_gem_suspend(struct drm_i915_private *i915);
> +void i915_gem_suspend_late(struct drm_i915_private *i915);
> +
> +#endif /* __I915_GEM_PM_H__ */
> diff --git a/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c b/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
> new file mode 100644
> index 000000000000..3524e471b46b
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
> @@ -0,0 +1,7 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_gem_pm.h"
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex
  2019-03-25  9:03 ` [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex Chris Wilson
@ 2019-04-01 15:22   ` Tvrtko Ursulin
  2019-04-01 15:37     ` Chris Wilson
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-04-01 15:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> If we introduce a new mutex for the exclusive use of GEM's runtime power
> management, we can remove its requirement of holding the struct_mutex.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c           |  9 +--
>   drivers/gpu/drm/i915/i915_drv.h               |  3 +-
>   drivers/gpu/drm/i915/i915_gem.c               |  2 +-
>   drivers/gpu/drm/i915/i915_gem.h               |  3 -
>   drivers/gpu/drm/i915/i915_gem_evict.c         |  2 +-
>   drivers/gpu/drm/i915/i915_gem_pm.c            | 70 ++++++++++++-------
>   drivers/gpu/drm/i915/i915_request.c           | 24 ++-----
>   .../gpu/drm/i915/selftests/i915_gem_context.c |  4 +-
>   .../gpu/drm/i915/selftests/i915_gem_object.c  | 13 ++--
>   .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 +-
>   10 files changed, 68 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 47bf07a59b5e..98ff1a14ccf3 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2034,7 +2034,8 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
>   
>   	seq_printf(m, "RPS enabled? %d\n", rps->enabled);
>   	seq_printf(m, "GPU busy? %s [%d requests]\n",
> -		   yesno(dev_priv->gt.awake), dev_priv->gt.active_requests);
> +		   yesno(dev_priv->gt.awake),
> +		   atomic_read(&dev_priv->gt.active_requests));
>   	seq_printf(m, "Boosts outstanding? %d\n",
>   		   atomic_read(&rps->num_waiters));
>   	seq_printf(m, "Interactive? %d\n", READ_ONCE(rps->power.interactive));
> @@ -2055,7 +2056,7 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
>   
>   	if (INTEL_GEN(dev_priv) >= 6 &&
>   	    rps->enabled &&
> -	    dev_priv->gt.active_requests) {
> +	    atomic_read(&dev_priv->gt.active_requests)) {
>   		u32 rpup, rpupei;
>   		u32 rpdown, rpdownei;
>   
> @@ -3087,7 +3088,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
>   
>   	seq_printf(m, "GT awake? %s\n", yesno(dev_priv->gt.awake));
>   	seq_printf(m, "Global active requests: %d\n",
> -		   dev_priv->gt.active_requests);
> +		   atomic_read(&dev_priv->gt.active_requests));
>   	seq_printf(m, "CS timestamp frequency: %u kHz\n",
>   		   RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz);
>   
> @@ -3933,7 +3934,7 @@ i915_drop_caches_set(void *data, u64 val)
>   
>   	if (val & DROP_IDLE) {
>   		do {
> -			if (READ_ONCE(i915->gt.active_requests))
> +			if (atomic_read(&i915->gt.active_requests))
>   				flush_delayed_work(&i915->gt.retire_work);
>   			drain_delayed_work(&i915->gt.idle_work);
>   		} while (READ_ONCE(i915->gt.awake));
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 11803d485275..7c7afe99986c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2008,7 +2008,8 @@ struct drm_i915_private {
>   		intel_engine_mask_t active_engines;
>   		struct list_head active_rings;
>   		struct list_head closed_vma;
> -		u32 active_requests;
> +		atomic_t active_requests;
> +		struct mutex active_mutex;

My initial reaction: why not gem_pm_mutex, to match where the code was 
moved and what the commit message says?

>   
>   		/**
>   		 * Is the GPU currently considered idle, or busy executing
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 47c672432594..79919e0cf03d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2914,7 +2914,7 @@ wait_for_timelines(struct drm_i915_private *i915,
>   	struct i915_gt_timelines *gt = &i915->gt.timelines;
>   	struct i915_timeline *tl;
>   
> -	if (!READ_ONCE(i915->gt.active_requests))
> +	if (!atomic_read(&i915->gt.active_requests))
>   		return timeout;
>   
>   	mutex_lock(&gt->mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
> index 5c073fe73664..bd13198a9058 100644
> --- a/drivers/gpu/drm/i915/i915_gem.h
> +++ b/drivers/gpu/drm/i915/i915_gem.h
> @@ -77,9 +77,6 @@ struct drm_i915_private;
>   
>   #define I915_GEM_IDLE_TIMEOUT (HZ / 5)
>   
> -void i915_gem_park(struct drm_i915_private *i915);
> -void i915_gem_unpark(struct drm_i915_private *i915);
> -
>   static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
>   {
>   	if (!atomic_fetch_inc(&t->count))
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 060f5903544a..20e835a7f116 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -38,7 +38,7 @@ I915_SELFTEST_DECLARE(static struct igt_evict_ctl {
>   
>   static bool ggtt_is_idle(struct drm_i915_private *i915)
>   {
> -	return !i915->gt.active_requests;
> +	return !atomic_read(&i915->gt.active_requests);
>   }
>   
>   static int ggtt_flush(struct drm_i915_private *i915)
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> index faa4eb42ec0a..6ecd9f8ac87d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -14,8 +14,8 @@ static void __i915_gem_park(struct drm_i915_private *i915)
>   
>   	GEM_TRACE("\n");
>   
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> +	lockdep_assert_held(&i915->gt.active_mutex);
> +	GEM_BUG_ON(atomic_read(&i915->gt.active_requests));
>   	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
>   
>   	if (!i915->gt.awake)
> @@ -94,12 +94,13 @@ static void idle_work_handler(struct work_struct *work)
>   	if (!READ_ONCE(i915->gt.awake))
>   		return;
>   
> -	if (READ_ONCE(i915->gt.active_requests))
> +	if (atomic_read(&i915->gt.active_requests))
>   		return;
>   
>   	rearm_hangcheck =
>   		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
>   
> +	/* XXX need to support lockless kernel_context before removing! */
>   	if (!mutex_trylock(&i915->drm.struct_mutex)) {
>   		/* Currently busy, come back later */
>   		mod_delayed_work(i915->wq,
> @@ -114,18 +115,19 @@ static void idle_work_handler(struct work_struct *work)
>   	 * while we are idle (such as the GPU being power cycled), no users
>   	 * will be harmed.
>   	 */
> +	mutex_lock(&i915->gt.active_mutex);
>   	if (!work_pending(&i915->gt.idle_work.work) &&
> -	    !i915->gt.active_requests) {
> -		++i915->gt.active_requests; /* don't requeue idle */
> +	    !atomic_read(&i915->gt.active_requests)) {
> +		atomic_inc(&i915->gt.active_requests); /* don't requeue idle */

atomic_inc_not_zero?
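
(Though note atomic_inc_not_zero() increments only when the old value 
was non-zero, while here the reference is taken precisely when the 
count is zero - so the matching lockless primitive would be a cmpxchg 
against zero. A hypothetical sketch, not from the patch:

	if (!work_pending(&i915->gt.idle_work.work) &&
	    atomic_cmpxchg(&i915->gt.active_requests, 0, 1) == 0) {
		/* We own the 0 -> 1 transition; safe to idle the engines. */
		switch_to_kernel_context_sync(i915, i915->gt.active_engines);

		if (atomic_dec_and_test(&i915->gt.active_requests)) {
			__i915_gem_park(i915);
			rearm_hangcheck = false;
		}
	}
)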

>   
>   		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
>   
> -		if (!--i915->gt.active_requests) {
> +		if (atomic_dec_and_test(&i915->gt.active_requests)) {
>   			__i915_gem_park(i915);
>   			rearm_hangcheck = false;
>   		}
>   	}
> -
> +	mutex_unlock(&i915->gt.active_mutex);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
>   out_rearm:
> @@ -137,26 +139,16 @@ static void idle_work_handler(struct work_struct *work)
>   
>   void i915_gem_park(struct drm_i915_private *i915)
>   {
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -
> -	if (!i915->gt.awake)
> -		return;
> -
>   	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> -	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> +	GEM_BUG_ON(!atomic_read(&i915->gt.active_requests));
> +	if (atomic_dec_and_test(&i915->gt.active_requests))
> +		mod_delayed_work(i915->wq,
> +				 &i915->gt.idle_work,
> +				 msecs_to_jiffies(100));
>   }
>   
> -void i915_gem_unpark(struct drm_i915_private *i915)
> +static void __i915_gem_unpark(struct drm_i915_private *i915)
>   {
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(!i915->gt.active_requests);
> -	assert_rpm_wakelock_held(i915);
> -
>   	if (i915->gt.awake)
>   		return;
>   
> @@ -191,11 +183,29 @@ void i915_gem_unpark(struct drm_i915_private *i915)
>   			   round_jiffies_up_relative(HZ));
>   }
>   
> +void i915_gem_unpark(struct drm_i915_private *i915)
> +{
> +	if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
> +		return;

This looks wrong - how can it be okay to sometimes not increment 
active_requests on unpark? What am I missing?

I would expect you would need an 
"atomic_inc_and_return_true_if_OLD_value_was_zero" here, but I don't 
think such an API exists.
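
(Actually, atomic_fetch_inc() does return the old value, so something 
like the below would express it directly - a hypothetical sketch, valid 
only if the 0 -> 1 transition did not have to stay under active_mutex:

	/* hypothetical: wake the device iff we took the first reference */
	if (atomic_fetch_inc(&i915->gt.active_requests) == 0)
		__i915_gem_unpark(i915);

Presumably the add_unless() fast path plus the locked slow path below 
exist exactly because the unpark itself cannot run locklessly.)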

> +
> +	mutex_lock(&i915->gt.active_mutex);
> +	if (!atomic_read(&i915->gt.active_requests)) {
> +		GEM_TRACE("\n");
> +		__i915_gem_unpark(i915);
> +		smp_mb__before_atomic();

Why is this needed? I have no idea, but I think we want comments on all 
memory barriers.
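
Something along these lines would do (assumed intent, inferred from the 
lockless fast path above):

	/*
	 * Publish the unparked state (gt.awake etc.) before the first
	 * reference becomes visible: smp_mb__before_atomic() upgrades the
	 * following non-value-returning atomic_inc() to a full barrier,
	 * pairing with the atomic_add_unless() fast path that can observe
	 * active_requests != 0 without taking active_mutex.
	 */
	smp_mb__before_atomic();
	atomic_inc(&i915->gt.active_requests);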

> +	}
> +	atomic_inc(&i915->gt.active_requests);
> +	mutex_unlock(&i915->gt.active_mutex);
> +}
> +
>   bool i915_gem_load_power_context(struct drm_i915_private *i915)
>   {
> +	mutex_lock(&i915->gt.active_mutex);
> +	atomic_inc(&i915->gt.active_requests);

Why does this function have to manually manage active_requests? Can it 
be written in a simpler way?
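
For instance, could it not be built on the entry points this patch 
already exports? A sketch, with the caveat that i915_gem_park() defers 
the actual parking to idle_work, whereas the comment below insists on 
parking immediately:

	bool i915_gem_load_power_context(struct drm_i915_private *i915)
	{
		bool result;

		i915_gem_unpark(i915);
		result = switch_to_kernel_context_sync(i915, ALL_ENGINES);
		i915_gem_park(i915);

		return result;
	}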

> +
>   	/* Force loading the kernel context on all engines */
>   	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> -		return false;
> +		goto err_active;
>   
>   	/*
>   	 * Immediately park the GPU so that we enable powersaving and
> @@ -203,9 +213,20 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915)
>   	 * unpark and start using the engine->pinned_default_state, otherwise
>   	 * it is in limbo and an early reset may fail.
>   	 */
> +
> +	if (!atomic_dec_and_test(&i915->gt.active_requests))
> +		goto err_unlock;
> +
>   	__i915_gem_park(i915);
> +	mutex_unlock(&i915->gt.active_mutex);
>   
>   	return true;
> +
> +err_active:
> +	atomic_dec(&i915->gt.active_requests);
> +err_unlock:
> +	mutex_unlock(&i915->gt.active_mutex);
> +	return false;
>   }
>   
>   void i915_gem_suspend(struct drm_i915_private *i915)
> @@ -337,5 +358,6 @@ void i915_gem_resume(struct drm_i915_private *i915)
>   
>   void i915_gem_init__pm(struct drm_i915_private *i915)
>   {
> +	mutex_init(&i915->gt.active_mutex);
>   	INIT_DELAYED_WORK(&i915->gt.idle_work, idle_work_handler);
>   }
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index e9c2094ab8ea..8d396f3c747d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -31,6 +31,7 @@
>   
>   #include "i915_drv.h"
>   #include "i915_active.h"
> +#include "i915_gem_pm.h" /* XXX layering violation! */
>   #include "i915_globals.h"
>   #include "i915_reset.h"
>   
> @@ -130,19 +131,6 @@ i915_request_remove_from_client(struct i915_request *request)
>   	spin_unlock(&file_priv->mm.lock);
>   }
>   
> -static void reserve_gt(struct drm_i915_private *i915)
> -{
> -	if (!i915->gt.active_requests++)
> -		i915_gem_unpark(i915);
> -}
> -
> -static void unreserve_gt(struct drm_i915_private *i915)
> -{
> -	GEM_BUG_ON(!i915->gt.active_requests);
> -	if (!--i915->gt.active_requests)
> -		i915_gem_park(i915);
> -}
> -
>   static void advance_ring(struct i915_request *request)
>   {
>   	struct intel_ring *ring = request->ring;
> @@ -304,7 +292,7 @@ static void i915_request_retire(struct i915_request *request)
>   
>   	__retire_engine_upto(request->engine, request);
>   
> -	unreserve_gt(request->i915);
> +	i915_gem_park(request->i915);
>   
>   	i915_sched_node_fini(&request->sched);
>   	i915_request_put(request);
> @@ -607,8 +595,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	u32 seqno;
>   	int ret;
>   
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -
>   	/*
>   	 * Preempt contexts are reserved for exclusive use to inject a
>   	 * preemption context switch. They are never to be used for any trivial
> @@ -633,7 +619,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	if (IS_ERR(ce))
>   		return ERR_CAST(ce);
>   
> -	reserve_gt(i915);
> +	i915_gem_unpark(i915);
>   	mutex_lock(&ce->ring->timeline->mutex);
>   
>   	/* Move our oldest request to the slab-cache (if not in use!) */
> @@ -766,7 +752,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	kmem_cache_free(global.slab_requests, rq);
>   err_unreserve:
>   	mutex_unlock(&ce->ring->timeline->mutex);
> -	unreserve_gt(i915);
> +	i915_gem_park(i915);
>   	intel_context_unpin(ce);
>   	return ERR_PTR(ret);
>   }
> @@ -1356,7 +1342,7 @@ void i915_retire_requests(struct drm_i915_private *i915)
>   
>   	lockdep_assert_held(&i915->drm.struct_mutex);
>   
> -	if (!i915->gt.active_requests)
> +	if (!atomic_read(&i915->gt.active_requests))
>   		return;
>   
>   	list_for_each_entry_safe(ring, tmp,
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 45f73b8b4e6d..6ce366091e0b 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -1646,9 +1646,9 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
>   				return err;
>   		}
>   
> -		if (i915->gt.active_requests) {
> +		if (atomic_read(&i915->gt.active_requests)) {
>   			pr_err("%d active requests remain after switching to kernel context, pass %d (%s) on %s engine%s\n",
> -			       i915->gt.active_requests,
> +			       atomic_read(&i915->gt.active_requests),
>   			       pass, from_idle ? "idle" : "busy",
>   			       __engine_name(i915, engines),
>   			       is_power_of_2(engines) ? "" : "s");
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index 971148fbe6f5..c2b08fdf23cf 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -506,12 +506,7 @@ static void disable_retire_worker(struct drm_i915_private *i915)
>   	i915_gem_shrinker_unregister(i915);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> -	if (!i915->gt.active_requests++) {
> -		intel_wakeref_t wakeref;
> -
> -		with_intel_runtime_pm(i915, wakeref)
> -			i915_gem_unpark(i915);
> -	}
> +	i915_gem_unpark(i915);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
>   	cancel_delayed_work_sync(&i915->gt.retire_work);
> @@ -616,10 +611,10 @@ static int igt_mmap_offset_exhaustion(void *arg)
>   	drm_mm_remove_node(&resv);
>   out_park:
>   	mutex_lock(&i915->drm.struct_mutex);
> -	if (--i915->gt.active_requests)
> -		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
> -	else
> +	if (atomic_dec_and_test(&i915->gt.active_requests))
>   		queue_delayed_work(i915->wq, &i915->gt.idle_work, 0);
> +	else
> +		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   	i915_gem_shrinker_register(i915);
>   	return err;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index 60bbf8b4df40..7afc5ee8dda5 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -44,7 +44,7 @@ void mock_device_flush(struct drm_i915_private *i915)
>   		mock_engine_flush(engine);
>   
>   	i915_retire_requests(i915);
> -	GEM_BUG_ON(i915->gt.active_requests);
> +	GEM_BUG_ON(atomic_read(&i915->gt.active_requests));
>   }
>   
>   static void mock_device_release(struct drm_device *dev)
> @@ -203,6 +203,7 @@ struct drm_i915_private *mock_gem_device(void)
>   
>   	i915_timelines_init(i915);
>   
> +	mutex_init(&i915->gt.active_mutex);
>   	INIT_LIST_HEAD(&i915->gt.active_rings);
>   	INIT_LIST_HEAD(&i915->gt.closed_vma);
>   
> 

Regards,

Tvrtko

* Re: [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref
  2019-03-25  9:03 ` [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref Chris Wilson
@ 2019-04-01 15:34   ` Tvrtko Ursulin
  2019-04-01 15:44     ` Chris Wilson
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-04-01 15:34 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> Transition from calling the lower level intel_runtime_pm functions to
> using the GEM runtime_pm functions (i915_gem_unpark, i915_gem_park) now
> that they are decoupled from struct_mutex. This has the small advantage
> of reducing our overhead for request emission and ensuring that GEM
> state is locked awake during the tests (to reduce interference).

Too tedious to read in detail. Actually, not purely tedious: the 
inversion between get/put and unpark/park (positive versus negative 
sense) is just constantly hard to read.

Otherwise there are some aspects of this I like, such as more 
explicitly controlling GEM/GT readiness, and some which I don't: a) the 
churn, b) reversing the recommendation so far of grabbing rpm over the 
smallest possible section, and c) the name - i915_gem_unpark was okay 
in the old world, but if this is now a central API to wake up the 
device I am not so crazy about "unpark".
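
To spell out the inversion, the tests end up moving between these two 
patterns (a sketch, not lifted verbatim from any one test):

	/* before: raw runtime pm, scoped as tightly as possible */
	wakeref = intel_runtime_pm_get(i915);
	/* ... emit and wait upon requests ... */
	intel_runtime_pm_put(i915, wakeref);

	/* after: GEM-level wake/idle, held across the whole test body */
	i915_gem_unpark(i915);
	/* ... emit and wait upon requests ... */
	i915_gem_park(i915);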

Regards,

Tvrtko


> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/selftests/huge_pages.c   |  5 +-
>   drivers/gpu/drm/i915/selftests/i915_active.c  | 11 ++---
>   drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
>   .../drm/i915/selftests/i915_gem_coherency.c   |  6 +--
>   .../gpu/drm/i915/selftests/i915_gem_context.c | 49 ++++++++-----------
>   .../gpu/drm/i915/selftests/i915_gem_evict.c   |  6 +--
>   .../gpu/drm/i915/selftests/i915_gem_object.c  |  8 ++-
>   drivers/gpu/drm/i915/selftests/i915_request.c | 33 ++++++-------
>   .../gpu/drm/i915/selftests/i915_timeline.c    | 21 ++++----
>   drivers/gpu/drm/i915/selftests/intel_guc.c    |  9 ++--
>   .../gpu/drm/i915/selftests/intel_hangcheck.c  | 19 ++++---
>   drivers/gpu/drm/i915/selftests/intel_lrc.c    | 41 +++++++---------
>   .../drm/i915/selftests/intel_workarounds.c    | 23 ++++-----
>   13 files changed, 105 insertions(+), 131 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
> index 90721b54e7ae..1597a6e1f364 100644
> --- a/drivers/gpu/drm/i915/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
> @@ -1753,7 +1753,6 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
>   	};
>   	struct drm_file *file;
>   	struct i915_gem_context *ctx;
> -	intel_wakeref_t wakeref;
>   	int err;
>   
>   	if (!HAS_PPGTT(dev_priv)) {
> @@ -1769,7 +1768,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
>   		return PTR_ERR(file);
>   
>   	mutex_lock(&dev_priv->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(dev_priv);
> +	i915_gem_unpark(dev_priv);
>   
>   	ctx = live_context(dev_priv, file);
>   	if (IS_ERR(ctx)) {
> @@ -1783,7 +1782,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
>   	err = i915_subtests(tests, ctx);
>   
>   out_unlock:
> -	intel_runtime_pm_put(dev_priv, wakeref);
> +	i915_gem_park(dev_priv);
>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>   
>   	mock_file_free(dev_priv, file);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
> index 27d8f853111b..42bcceba175c 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_active.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_active.c
> @@ -5,6 +5,7 @@
>    */
>   
>   #include "../i915_selftest.h"
> +#include "../i915_gem_pm.h"
>   
>   #include "igt_flush_test.h"
>   #include "lib_sw_fence.h"
> @@ -89,13 +90,12 @@ static int live_active_wait(void *arg)
>   {
>   	struct drm_i915_private *i915 = arg;
>   	struct live_active active;
> -	intel_wakeref_t wakeref;
>   	int err;
>   
>   	/* Check that we get a callback when requests retire upon waiting */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	err = __live_active_setup(i915, &active);
>   
> @@ -109,8 +109,8 @@ static int live_active_wait(void *arg)
>   	if (igt_flush_test(i915, I915_WAIT_LOCKED))
>   		err = -EIO;
>   
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -118,13 +118,12 @@ static int live_active_retire(void *arg)
>   {
>   	struct drm_i915_private *i915 = arg;
>   	struct live_active active;
> -	intel_wakeref_t wakeref;
>   	int err;
>   
>   	/* Check that we get a callback when requests are indirectly retired */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	err = __live_active_setup(i915, &active);
>   
> @@ -138,8 +137,8 @@ static int live_active_retire(void *arg)
>   	}
>   
>   	i915_active_fini(&active.base);
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
> index 50bb7bbd26d3..7d79f1fe6bbd 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
> @@ -16,10 +16,9 @@ static int switch_to_context(struct drm_i915_private *i915,
>   {
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = 0;
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *rq;
> @@ -33,7 +32,7 @@ static int switch_to_context(struct drm_i915_private *i915,
>   		i915_request_add(rq);
>   	}
>   
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   
>   	return err;
>   }
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> index e43630b40fce..497929238f02 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> @@ -279,7 +279,6 @@ static int igt_gem_coherency(void *arg)
>   	struct drm_i915_private *i915 = arg;
>   	const struct igt_coherency_mode *read, *write, *over;
>   	struct drm_i915_gem_object *obj;
> -	intel_wakeref_t wakeref;
>   	unsigned long count, n;
>   	u32 *offsets, *values;
>   	int err = 0;
> @@ -298,8 +297,9 @@ static int igt_gem_coherency(void *arg)
>   
>   	values = offsets + ncachelines;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
> +
>   	for (over = igt_coherency_mode; over->name; over++) {
>   		if (!over->set)
>   			continue;
> @@ -377,8 +377,8 @@ static int igt_gem_coherency(void *arg)
>   		}
>   	}
>   unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	kfree(offsets);
>   	return err;
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 6ce366091e0b..b4039df633ec 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -24,6 +24,7 @@
>   
>   #include <linux/prime_numbers.h>
>   
> +#include "../i915_gem_pm.h"
>   #include "../i915_reset.h"
>   #include "../i915_selftest.h"
>   #include "i915_random.h"
> @@ -45,7 +46,6 @@ static int live_nop_switch(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct i915_gem_context **ctx;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	struct drm_file *file;
>   	unsigned long n;
> @@ -66,8 +66,8 @@ static int live_nop_switch(void *arg)
>   	if (IS_ERR(file))
>   		return PTR_ERR(file);
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	ctx = kcalloc(nctx, sizeof(*ctx), GFP_KERNEL);
>   	if (!ctx) {
> @@ -170,8 +170,8 @@ static int live_nop_switch(void *arg)
>   	}
>   
>   out_unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	mock_file_free(i915, file);
>   	return err;
>   }
> @@ -514,6 +514,7 @@ static int igt_ctx_exec(void *arg)
>   		if (IS_ERR(file))
>   			return PTR_ERR(file);
>   
> +		i915_gem_unpark(i915);
>   		mutex_lock(&i915->drm.struct_mutex);
>   
>   		err = igt_live_test_begin(&t, i915, __func__, engine->name);
> @@ -525,7 +526,6 @@ static int igt_ctx_exec(void *arg)
>   		dw = 0;
>   		while (!time_after(jiffies, end_time)) {
>   			struct i915_gem_context *ctx;
> -			intel_wakeref_t wakeref;
>   
>   			ctx = live_context(i915, file);
>   			if (IS_ERR(ctx)) {
> @@ -541,8 +541,7 @@ static int igt_ctx_exec(void *arg)
>   				}
>   			}
>   
> -			with_intel_runtime_pm(i915, wakeref)
> -				err = gpu_fill(obj, ctx, engine, dw);
> +			err = gpu_fill(obj, ctx, engine, dw);
>   			if (err) {
>   				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
>   				       ndwords, dw, max_dwords(obj),
> @@ -579,6 +578,7 @@ static int igt_ctx_exec(void *arg)
>   		if (igt_live_test_end(&t))
>   			err = -EIO;
>   		mutex_unlock(&i915->drm.struct_mutex);
> +		i915_gem_park(i915);
>   
>   		mock_file_free(i915, file);
>   		if (err)
> @@ -610,6 +610,7 @@ static int igt_shared_ctx_exec(void *arg)
>   	if (IS_ERR(file))
>   		return PTR_ERR(file);
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
>   
>   	parent = live_context(i915, file);
> @@ -641,7 +642,6 @@ static int igt_shared_ctx_exec(void *arg)
>   		ncontexts = 0;
>   		while (!time_after(jiffies, end_time)) {
>   			struct i915_gem_context *ctx;
> -			intel_wakeref_t wakeref;
>   
>   			ctx = kernel_context(i915);
>   			if (IS_ERR(ctx)) {
> @@ -660,9 +660,7 @@ static int igt_shared_ctx_exec(void *arg)
>   				}
>   			}
>   
> -			err = 0;
> -			with_intel_runtime_pm(i915, wakeref)
> -				err = gpu_fill(obj, ctx, engine, dw);
> +			err = gpu_fill(obj, ctx, engine, dw);
>   			if (err) {
>   				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
>   				       ndwords, dw, max_dwords(obj),
> @@ -702,6 +700,7 @@ static int igt_shared_ctx_exec(void *arg)
>   		err = -EIO;
>   out_unlock:
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	mock_file_free(i915, file);
>   	return err;
> @@ -1052,7 +1051,6 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   	struct drm_i915_gem_object *obj;
>   	struct i915_gem_context *ctx;
>   	struct intel_sseu pg_sseu;
> -	intel_wakeref_t wakeref;
>   	struct drm_file *file;
>   	int ret;
>   
> @@ -1100,7 +1098,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   		goto out_unlock;
>   	}
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   
>   	/* First set the default mask. */
>   	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
> @@ -1128,7 +1126,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   
>   	i915_gem_object_put(obj);
>   
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   
>   out_unlock:
>   	mutex_unlock(&i915->drm.struct_mutex);
> @@ -1191,6 +1189,7 @@ static int igt_ctx_readonly(void *arg)
>   	if (IS_ERR(file))
>   		return PTR_ERR(file);
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
>   
>   	err = igt_live_test_begin(&t, i915, __func__, "");
> @@ -1216,8 +1215,6 @@ static int igt_ctx_readonly(void *arg)
>   		unsigned int id;
>   
>   		for_each_engine(engine, i915, id) {
> -			intel_wakeref_t wakeref;
> -
>   			if (!intel_engine_can_store_dword(engine))
>   				continue;
>   
> @@ -1232,9 +1229,7 @@ static int igt_ctx_readonly(void *arg)
>   					i915_gem_object_set_readonly(obj);
>   			}
>   
> -			err = 0;
> -			with_intel_runtime_pm(i915, wakeref)
> -				err = gpu_fill(obj, ctx, engine, dw);
> +			err = gpu_fill(obj, ctx, engine, dw);
>   			if (err) {
>   				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
>   				       ndwords, dw, max_dwords(obj),
> @@ -1275,6 +1270,7 @@ static int igt_ctx_readonly(void *arg)
>   	if (igt_live_test_end(&t))
>   		err = -EIO;
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	mock_file_free(i915, file);
>   	return err;
> @@ -1491,7 +1487,6 @@ static int igt_vm_isolation(void *arg)
>   	struct drm_i915_private *i915 = arg;
>   	struct i915_gem_context *ctx_a, *ctx_b;
>   	struct intel_engine_cs *engine;
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	struct drm_file *file;
>   	I915_RND_STATE(prng);
> @@ -1538,7 +1533,7 @@ static int igt_vm_isolation(void *arg)
>   	GEM_BUG_ON(ctx_b->ppgtt->vm.total != vm_total);
>   	vm_total -= I915_GTT_PAGE_SIZE;
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   
>   	count = 0;
>   	for_each_engine(engine, i915, id) {
> @@ -1583,7 +1578,7 @@ static int igt_vm_isolation(void *arg)
>   		count, RUNTIME_INFO(i915)->num_engines);
>   
>   out_rpm:
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   out_unlock:
>   	if (igt_live_test_end(&t))
>   		err = -EIO;
> @@ -1622,6 +1617,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
>   		int err;
>   
>   		if (!from_idle) {
> +			i915_gem_unpark(i915);
>   			for_each_engine_masked(engine, i915, engines, tmp) {
>   				struct i915_request *rq;
>   
> @@ -1631,6 +1627,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
>   
>   				i915_request_add(rq);
>   			}
> +			i915_gem_park(i915);
>   		}
>   
>   		err = i915_gem_switch_to_kernel_context(i915,
> @@ -1674,7 +1671,6 @@ static int igt_switch_to_kernel_context(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct i915_gem_context *ctx;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err;
>   
>   	/*
> @@ -1685,7 +1681,6 @@ static int igt_switch_to_kernel_context(void *arg)
>   	 */
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	ctx = kernel_context(i915);
>   	if (IS_ERR(ctx)) {
> @@ -1708,7 +1703,6 @@ static int igt_switch_to_kernel_context(void *arg)
>   out_unlock:
>   	GEM_TRACE_DUMP_ON(err);
>   
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
>   	kernel_context_close(ctx);
> @@ -1729,7 +1723,6 @@ static int mock_context_barrier(void *arg)
>   	struct drm_i915_private *i915 = arg;
>   	struct i915_gem_context *ctx;
>   	struct i915_request *rq;
> -	intel_wakeref_t wakeref;
>   	unsigned int counter;
>   	int err;
>   
> @@ -1738,6 +1731,7 @@ static int mock_context_barrier(void *arg)
>   	 * a request; useful for retiring old state after loading new.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
>   
>   	ctx = mock_context(i915, "mock");
> @@ -1772,9 +1766,7 @@ static int mock_context_barrier(void *arg)
>   		goto out;
>   	}
>   
> -	rq = ERR_PTR(-ENODEV);
> -	with_intel_runtime_pm(i915, wakeref)
> -		rq = i915_request_alloc(i915->engine[RCS0], ctx);
> +	rq = i915_request_alloc(i915->engine[RCS0], ctx);
>   	if (IS_ERR(rq)) {
>   		pr_err("Request allocation failed!\n");
>   		goto out;
> @@ -1816,6 +1808,7 @@ static int mock_context_barrier(void *arg)
>   	mock_context_close(ctx);
>   unlock:
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   #undef pr_fmt
>   #define pr_fmt(x) x
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> index 9a9451846b33..c0cf26507915 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> @@ -23,6 +23,7 @@
>    */
>   
>   #include "../i915_selftest.h"
> +#include "../i915_gem_pm.h"
>   
>   #include "lib_sw_fence.h"
>   #include "mock_context.h"
> @@ -377,7 +378,6 @@ static int igt_evict_contexts(void *arg)
>   		struct drm_mm_node node;
>   		struct reserved *next;
>   	} *reserved = NULL;
> -	intel_wakeref_t wakeref;
>   	struct drm_mm_node hole;
>   	unsigned long count;
>   	int err;
> @@ -396,8 +396,8 @@ static int igt_evict_contexts(void *arg)
>   	if (!HAS_FULL_PPGTT(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	/* Reserve a block so that we know we have enough to fit a few rq */
>   	memset(&hole, 0, sizeof(hole));
> @@ -508,8 +508,8 @@ static int igt_evict_contexts(void *arg)
>   	}
>   	if (drm_mm_node_allocated(&hole))
>   		drm_mm_remove_node(&hole);
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	return err;
>   }
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index c2b08fdf23cf..cd6590e01dec 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -576,8 +576,6 @@ static int igt_mmap_offset_exhaustion(void *arg)
>   
>   	/* Now fill with busy dead objects that we expect to reap */
>   	for (loop = 0; loop < 3; loop++) {
> -		intel_wakeref_t wakeref;
> -
>   		if (i915_terminally_wedged(i915))
>   			break;
>   
> @@ -587,11 +585,11 @@ static int igt_mmap_offset_exhaustion(void *arg)
>   			goto out;
>   		}
>   
> -		err = 0;
> +		i915_gem_unpark(i915);
>   		mutex_lock(&i915->drm.struct_mutex);
> -		with_intel_runtime_pm(i915, wakeref)
> -			err = make_obj_busy(obj);
> +		err = make_obj_busy(obj);
>   		mutex_unlock(&i915->drm.struct_mutex);
> +		i915_gem_park(i915);
>   		if (err) {
>   			pr_err("[loop %d] Failed to busy the object\n", loop);
>   			goto err_obj;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
> index e6ffe2240126..665cafa82390 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -505,15 +505,15 @@ int i915_request_mock_selftests(void)
>   		SUBTEST(mock_breadcrumbs_smoketest),
>   	};
>   	struct drm_i915_private *i915;
> -	intel_wakeref_t wakeref;
> -	int err = 0;
> +	int err;
>   
>   	i915 = mock_gem_device();
>   	if (!i915)
>   		return -ENOMEM;
>   
> -	with_intel_runtime_pm(i915, wakeref)
> -		err = i915_subtests(tests, i915);
> +	i915_gem_unpark(i915);
> +	err = i915_subtests(tests, i915);
> +	i915_gem_park(i915);
>   
>   	drm_dev_put(&i915->drm);
>   
> @@ -524,7 +524,6 @@ static int live_nop_request(void *arg)
>   {
>   	struct drm_i915_private *i915 = arg;
>   	struct intel_engine_cs *engine;
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	unsigned int id;
>   	int err = -ENODEV;
> @@ -534,8 +533,8 @@ static int live_nop_request(void *arg)
>   	 * the overhead of submitting requests to the hardware.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *request = NULL;
> @@ -596,8 +595,8 @@ static int live_nop_request(void *arg)
>   	}
>   
>   out_unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -669,7 +668,6 @@ static int live_empty_request(void *arg)
>   {
>   	struct drm_i915_private *i915 = arg;
>   	struct intel_engine_cs *engine;
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	struct i915_vma *batch;
>   	unsigned int id;
> @@ -680,8 +678,8 @@ static int live_empty_request(void *arg)
>   	 * the overhead of submitting requests to the hardware.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	batch = empty_batch(i915);
>   	if (IS_ERR(batch)) {
> @@ -745,8 +743,8 @@ static int live_empty_request(void *arg)
>   	i915_vma_unpin(batch);
>   	i915_vma_put(batch);
>   out_unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -827,7 +825,6 @@ static int live_all_engines(void *arg)
>   	struct drm_i915_private *i915 = arg;
>   	struct intel_engine_cs *engine;
>   	struct i915_request *request[I915_NUM_ENGINES];
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	struct i915_vma *batch;
>   	unsigned int id;
> @@ -838,8 +835,8 @@ static int live_all_engines(void *arg)
>   	 * block doing so, and that they don't complete too soon.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	err = igt_live_test_begin(&t, i915, __func__, "");
>   	if (err)
> @@ -922,8 +919,8 @@ static int live_all_engines(void *arg)
>   	i915_vma_unpin(batch);
>   	i915_vma_put(batch);
>   out_unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -933,7 +930,6 @@ static int live_sequential_engines(void *arg)
>   	struct i915_request *request[I915_NUM_ENGINES] = {};
>   	struct i915_request *prev = NULL;
>   	struct intel_engine_cs *engine;
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	unsigned int id;
>   	int err;
> @@ -944,8 +940,8 @@ static int live_sequential_engines(void *arg)
>   	 * they are running on independent engines.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	err = igt_live_test_begin(&t, i915, __func__, "");
>   	if (err)
> @@ -1052,8 +1048,8 @@ static int live_sequential_engines(void *arg)
>   		i915_request_put(request[id]);
>   	}
>   out_unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -1104,7 +1100,6 @@ static int live_breadcrumbs_smoketest(void *arg)
>   	struct task_struct **threads;
>   	struct igt_live_test live;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	struct drm_file *file;
>   	unsigned int n;
>   	int ret = 0;
> @@ -1117,7 +1112,7 @@ static int live_breadcrumbs_smoketest(void *arg)
>   	 * On real hardware this time.
>   	 */
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   
>   	file = mock_file(i915);
>   	if (IS_ERR(file)) {
> @@ -1224,7 +1219,7 @@ static int live_breadcrumbs_smoketest(void *arg)
>   out_file:
>   	mock_file_free(i915, file);
>   out_rpm:
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   
>   	return ret;
>   }
> diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
> index 8e7bcaa1eb66..b04969ea74d3 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
> @@ -7,6 +7,7 @@
>   #include <linux/prime_numbers.h>
>   
>   #include "../i915_selftest.h"
> +#include "../i915_gem_pm.h"
>   #include "i915_random.h"
>   
>   #include "igt_flush_test.h"
> @@ -497,7 +498,6 @@ static int live_hwsp_engine(void *arg)
>   	struct i915_timeline **timelines;
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	unsigned long count, n;
>   	int err = 0;
>   
> @@ -512,8 +512,8 @@ static int live_hwsp_engine(void *arg)
>   	if (!timelines)
>   		return -ENOMEM;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	count = 0;
>   	for_each_engine(engine, i915, id) {
> @@ -556,8 +556,8 @@ static int live_hwsp_engine(void *arg)
>   		i915_timeline_put(tl);
>   	}
>   
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	kvfree(timelines);
>   
> @@ -572,7 +572,6 @@ static int live_hwsp_alternate(void *arg)
>   	struct i915_timeline **timelines;
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	unsigned long count, n;
>   	int err = 0;
>   
> @@ -588,8 +587,8 @@ static int live_hwsp_alternate(void *arg)
>   	if (!timelines)
>   		return -ENOMEM;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	count = 0;
>   	for (n = 0; n < NUM_TIMELINES; n++) {
> @@ -632,8 +631,8 @@ static int live_hwsp_alternate(void *arg)
>   		i915_timeline_put(tl);
>   	}
>   
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	kvfree(timelines);
>   
> @@ -647,7 +646,6 @@ static int live_hwsp_wrap(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct i915_timeline *tl;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = 0;
>   
>   	/*
> @@ -655,8 +653,8 @@ static int live_hwsp_wrap(void *arg)
>   	 * foreign GPU references.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	tl = i915_timeline_create(i915, NULL);
>   	if (IS_ERR(tl)) {
> @@ -747,8 +745,8 @@ static int live_hwsp_wrap(void *arg)
>   out_free:
>   	i915_timeline_put(tl);
>   out_rpm:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	return err;
>   }
> @@ -758,7 +756,6 @@ static int live_hwsp_recycle(void *arg)
>   	struct drm_i915_private *i915 = arg;
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	unsigned long count;
>   	int err = 0;
>   
> @@ -768,8 +765,8 @@ static int live_hwsp_recycle(void *arg)
>   	 * want to confuse ourselves or the GPU.
>   	 */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	count = 0;
>   	for_each_engine(engine, i915, id) {
> @@ -823,8 +820,8 @@ static int live_hwsp_recycle(void *arg)
>   out:
>   	if (igt_flush_test(i915, I915_WAIT_LOCKED))
>   		err = -EIO;
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   
>   	return err;
>   }
> diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
> index b05a21eaa8f4..e62073af4728 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_guc.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
> @@ -22,6 +22,7 @@
>    *
>    */
>   
> +#include "../i915_gem_pm.h"
>   #include "../i915_selftest.h"
>   
>   /* max doorbell number + negative test for each client type */
> @@ -137,13 +138,11 @@ static bool client_doorbell_in_sync(struct intel_guc_client *client)
>   static int igt_guc_clients(void *args)
>   {
>   	struct drm_i915_private *dev_priv = args;
> -	intel_wakeref_t wakeref;
>   	struct intel_guc *guc;
>   	int err = 0;
>   
>   	GEM_BUG_ON(!HAS_GUC(dev_priv));
>   	mutex_lock(&dev_priv->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(dev_priv);
>   
>   	guc = &dev_priv->guc;
>   	if (!guc) {
> @@ -226,7 +225,6 @@ static int igt_guc_clients(void *args)
>   	guc_clients_create(guc);
>   	guc_clients_enable(guc);
>   unlock:
> -	intel_runtime_pm_put(dev_priv, wakeref);
>   	mutex_unlock(&dev_priv->drm.struct_mutex);
>   	return err;
>   }
> @@ -239,14 +237,13 @@ static int igt_guc_clients(void *args)
>   static int igt_guc_doorbells(void *arg)
>   {
>   	struct drm_i915_private *dev_priv = arg;
> -	intel_wakeref_t wakeref;
>   	struct intel_guc *guc;
>   	int i, err = 0;
>   	u16 db_id;
>   
>   	GEM_BUG_ON(!HAS_GUC(dev_priv));
> +	i915_gem_unpark(dev_priv);
>   	mutex_lock(&dev_priv->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(dev_priv);
>   
>   	guc = &dev_priv->guc;
>   	if (!guc) {
> @@ -339,8 +336,8 @@ static int igt_guc_doorbells(void *arg)
>   			guc_client_free(clients[i]);
>   		}
>   unlock:
> -	intel_runtime_pm_put(dev_priv, wakeref);
>   	mutex_unlock(&dev_priv->drm.struct_mutex);
> +	i915_gem_park(dev_priv);
>   	return err;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 76b4fa150f2e..f6f417386b9f 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -24,6 +24,7 @@
>   
>   #include <linux/kthread.h>
>   
> +#include "../i915_gem_pm.h"
>   #include "../i915_selftest.h"
>   #include "i915_random.h"
>   #include "igt_flush_test.h"
> @@ -86,6 +87,7 @@ static int hang_init(struct hang *h, struct drm_i915_private *i915)
>   	}
>   	h->batch = vaddr;
>   
> +	i915_gem_unpark(i915);
>   	return 0;
>   
>   err_unpin_hws:
> @@ -287,6 +289,7 @@ static void hang_fini(struct hang *h)
>   	kernel_context_close(h->ctx);
>   
>   	igt_flush_test(h->i915, I915_WAIT_LOCKED);
> +	i915_gem_park(h->i915);
>   }
>   
>   static bool wait_until_running(struct hang *h, struct i915_request *rq)
> @@ -422,7 +425,6 @@ static int igt_reset_nop(void *arg)
>   	struct i915_gem_context *ctx;
>   	unsigned int reset_count, count;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	struct drm_file *file;
>   	IGT_TIMEOUT(end_time);
>   	int err = 0;
> @@ -442,7 +444,7 @@ static int igt_reset_nop(void *arg)
>   	}
>   
>   	i915_gem_context_clear_bannable(ctx);
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   	reset_count = i915_reset_count(&i915->gpu_error);
>   	count = 0;
>   	do {
> @@ -502,7 +504,7 @@ static int igt_reset_nop(void *arg)
>   	err = igt_flush_test(i915, I915_WAIT_LOCKED);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   
>   out:
>   	mock_file_free(i915, file);
> @@ -517,7 +519,6 @@ static int igt_reset_nop_engine(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct i915_gem_context *ctx;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	struct drm_file *file;
>   	int err = 0;
>   
> @@ -539,7 +540,7 @@ static int igt_reset_nop_engine(void *arg)
>   	}
>   
>   	i915_gem_context_clear_bannable(ctx);
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   	for_each_engine(engine, i915, id) {
>   		unsigned int reset_count, reset_engine_count;
>   		unsigned int count;
> @@ -623,7 +624,7 @@ static int igt_reset_nop_engine(void *arg)
>   	err = igt_flush_test(i915, I915_WAIT_LOCKED);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   out:
>   	mock_file_free(i915, file);
>   	if (i915_reset_failed(i915))
> @@ -651,6 +652,7 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
>   			return err;
>   	}
>   
> +	i915_gem_unpark(i915);
>   	for_each_engine(engine, i915, id) {
>   		unsigned int reset_count, reset_engine_count;
>   		IGT_TIMEOUT(end_time);
> @@ -744,6 +746,7 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
>   		if (err)
>   			break;
>   	}
> +	i915_gem_park(i915);
>   
>   	if (i915_reset_failed(i915))
>   		err = -EIO;
> @@ -829,6 +832,7 @@ static int active_engine(void *data)
>   		}
>   	}
>   
> +	i915_gem_unpark(engine->i915);
>   	while (!kthread_should_stop()) {
>   		unsigned int idx = count++ & (ARRAY_SIZE(rq) - 1);
>   		struct i915_request *old = rq[idx];
> @@ -856,6 +860,7 @@ static int active_engine(void *data)
>   
>   		cond_resched();
>   	}
> +	i915_gem_park(engine->i915);
>   
>   	for (count = 0; count < ARRAY_SIZE(rq); count++) {
>   		int err__ = active_request_put(rq[count]);
> @@ -897,6 +902,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
>   			h.ctx->sched.priority = 1024;
>   	}
>   
> +	i915_gem_unpark(i915);
>   	for_each_engine(engine, i915, id) {
>   		struct active_engine threads[I915_NUM_ENGINES] = {};
>   		unsigned long global = i915_reset_count(&i915->gpu_error);
> @@ -1073,6 +1079,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
>   		if (err)
>   			break;
>   	}
> +	i915_gem_park(i915);
>   
>   	if (i915_reset_failed(i915))
>   		err = -EIO;
> diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> index 0d3cae564db8..45370922d965 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> @@ -6,6 +6,7 @@
>   
>   #include <linux/prime_numbers.h>
>   
> +#include "../i915_gem_pm.h"
>   #include "../i915_reset.h"
>   
>   #include "../i915_selftest.h"
> @@ -23,14 +24,13 @@ static int live_sanitycheck(void *arg)
>   	struct i915_gem_context *ctx;
>   	enum intel_engine_id id;
>   	struct igt_spinner spin;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   
>   	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (igt_spinner_init(&spin, i915))
>   		goto err_unlock;
> @@ -71,8 +71,8 @@ static int live_sanitycheck(void *arg)
>   	igt_spinner_fini(&spin);
>   err_unlock:
>   	igt_flush_test(i915, I915_WAIT_LOCKED);
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -83,7 +83,6 @@ static int live_preempt(void *arg)
>   	struct igt_spinner spin_hi, spin_lo;
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   
>   	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
> @@ -92,8 +91,8 @@ static int live_preempt(void *arg)
>   	if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
>   		pr_err("Logical preemption supported, but not exposed\n");
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (igt_spinner_init(&spin_hi, i915))
>   		goto err_unlock;
> @@ -178,8 +177,8 @@ static int live_preempt(void *arg)
>   	igt_spinner_fini(&spin_hi);
>   err_unlock:
>   	igt_flush_test(i915, I915_WAIT_LOCKED);
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -191,14 +190,13 @@ static int live_late_preempt(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct i915_sched_attr attr = {};
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   
>   	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (igt_spinner_init(&spin_hi, i915))
>   		goto err_unlock;
> @@ -282,8 +280,8 @@ static int live_late_preempt(void *arg)
>   	igt_spinner_fini(&spin_hi);
>   err_unlock:
>   	igt_flush_test(i915, I915_WAIT_LOCKED);
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   
>   err_wedged:
> @@ -331,7 +329,6 @@ static int live_suppress_self_preempt(void *arg)
>   	};
>   	struct preempt_client a, b;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   
>   	/*
> @@ -347,8 +344,8 @@ static int live_suppress_self_preempt(void *arg)
>   	if (USES_GUC_SUBMISSION(i915))
>   		return 0; /* presume black blox */
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (preempt_client_init(i915, &a))
>   		goto err_unlock;
> @@ -422,8 +419,8 @@ static int live_suppress_self_preempt(void *arg)
>   err_unlock:
>   	if (igt_flush_test(i915, I915_WAIT_LOCKED))
>   		err = -EIO;
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   
>   err_wedged:
> @@ -480,7 +477,6 @@ static int live_suppress_wait_preempt(void *arg)
>   	struct preempt_client client[4];
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   	int i;
>   
> @@ -493,8 +489,8 @@ static int live_suppress_wait_preempt(void *arg)
>   	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (preempt_client_init(i915, &client[0])) /* ELSP[0] */
>   		goto err_unlock;
> @@ -587,8 +583,8 @@ static int live_suppress_wait_preempt(void *arg)
>   err_unlock:
>   	if (igt_flush_test(i915, I915_WAIT_LOCKED))
>   		err = -EIO;
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   
>   err_wedged:
> @@ -605,7 +601,6 @@ static int live_chain_preempt(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct preempt_client hi, lo;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   
>   	/*
> @@ -617,8 +612,8 @@ static int live_chain_preempt(void *arg)
>   	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (preempt_client_init(i915, &hi))
>   		goto err_unlock;
> @@ -735,8 +730,8 @@ static int live_chain_preempt(void *arg)
>   err_unlock:
>   	if (igt_flush_test(i915, I915_WAIT_LOCKED))
>   		err = -EIO;
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   
>   err_wedged:
> @@ -754,7 +749,6 @@ static int live_preempt_hang(void *arg)
>   	struct igt_spinner spin_hi, spin_lo;
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	int err = -ENOMEM;
>   
>   	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
> @@ -763,8 +757,8 @@ static int live_preempt_hang(void *arg)
>   	if (!intel_has_reset_engine(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (igt_spinner_init(&spin_hi, i915))
>   		goto err_unlock;
> @@ -859,8 +853,8 @@ static int live_preempt_hang(void *arg)
>   	igt_spinner_fini(&spin_hi);
>   err_unlock:
>   	igt_flush_test(i915, I915_WAIT_LOCKED);
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -1047,7 +1041,6 @@ static int live_preempt_smoke(void *arg)
>   		.ncontext = 1024,
>   	};
>   	const unsigned int phase[] = { 0, BATCH };
> -	intel_wakeref_t wakeref;
>   	struct igt_live_test t;
>   	int err = -ENOMEM;
>   	u32 *cs;
> @@ -1062,8 +1055,8 @@ static int live_preempt_smoke(void *arg)
>   	if (!smoke.contexts)
>   		return -ENOMEM;
>   
> +	i915_gem_unpark(smoke.i915);
>   	mutex_lock(&smoke.i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(smoke.i915);
>   
>   	smoke.batch = i915_gem_object_create_internal(smoke.i915, PAGE_SIZE);
>   	if (IS_ERR(smoke.batch)) {
> @@ -1116,8 +1109,8 @@ static int live_preempt_smoke(void *arg)
>   err_batch:
>   	i915_gem_object_put(smoke.batch);
>   err_unlock:
> -	intel_runtime_pm_put(smoke.i915, wakeref);
>   	mutex_unlock(&smoke.i915->drm.struct_mutex);
> +	i915_gem_park(smoke.i915);
>   	kfree(smoke.contexts);
>   
>   	return err;
> diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
> index 3baed59008d7..0e42e1a0b46c 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
> @@ -5,6 +5,7 @@
>    */
>   
>   #include "../i915_selftest.h"
> +#include "../i915_gem_pm.h"
>   #include "../i915_reset.h"
>   
>   #include "igt_flush_test.h"
> @@ -238,7 +239,6 @@ switch_to_scratch_context(struct intel_engine_cs *engine,
>   {
>   	struct i915_gem_context *ctx;
>   	struct i915_request *rq;
> -	intel_wakeref_t wakeref;
>   	int err = 0;
>   
>   	ctx = kernel_context(engine->i915);
> @@ -247,9 +247,9 @@ switch_to_scratch_context(struct intel_engine_cs *engine,
>   
>   	GEM_BUG_ON(i915_gem_context_is_bannable(ctx));
>   
> -	rq = ERR_PTR(-ENODEV);
> -	with_intel_runtime_pm(engine->i915, wakeref)
> -		rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP);
> +	i915_gem_unpark(engine->i915);
> +	rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP);
> +	i915_gem_park(engine->i915);
>   
>   	kernel_context_close(ctx);
>   
> @@ -666,7 +666,6 @@ static int live_dirty_whitelist(void *arg)
>   	struct intel_engine_cs *engine;
>   	struct i915_gem_context *ctx;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   	struct drm_file *file;
>   	int err = 0;
>   
> @@ -675,7 +674,7 @@ static int live_dirty_whitelist(void *arg)
>   	if (INTEL_GEN(i915) < 7) /* minimum requirement for LRI, SRM, LRM */
>   		return 0;
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   
>   	mutex_unlock(&i915->drm.struct_mutex);
>   	file = mock_file(i915);
> @@ -705,7 +704,7 @@ static int live_dirty_whitelist(void *arg)
>   	mock_file_free(i915, file);
>   	mutex_lock(&i915->drm.struct_mutex);
>   out_rpm:
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   	return err;
>   }
>   
> @@ -762,7 +761,6 @@ static int
>   live_gpu_reset_gt_engine_workarounds(void *arg)
>   {
>   	struct drm_i915_private *i915 = arg;
> -	intel_wakeref_t wakeref;
>   	struct wa_lists lists;
>   	bool ok;
>   
> @@ -771,8 +769,8 @@ live_gpu_reset_gt_engine_workarounds(void *arg)
>   
>   	pr_info("Verifying after GPU reset...\n");
>   
> +	i915_gem_unpark(i915);
>   	igt_global_reset_lock(i915);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	reference_lists_init(i915, &lists);
>   
> @@ -786,8 +784,8 @@ live_gpu_reset_gt_engine_workarounds(void *arg)
>   
>   out:
>   	reference_lists_fini(i915, &lists);
> -	intel_runtime_pm_put(i915, wakeref);
>   	igt_global_reset_unlock(i915);
> +	i915_gem_park(i915);
>   
>   	return ok ? 0 : -ESRCH;
>   }
> @@ -801,7 +799,6 @@ live_engine_reset_gt_engine_workarounds(void *arg)
>   	struct igt_spinner spin;
>   	enum intel_engine_id id;
>   	struct i915_request *rq;
> -	intel_wakeref_t wakeref;
>   	struct wa_lists lists;
>   	int ret = 0;
>   
> @@ -812,8 +809,8 @@ live_engine_reset_gt_engine_workarounds(void *arg)
>   	if (IS_ERR(ctx))
>   		return PTR_ERR(ctx);
>   
> +	i915_gem_unpark(i915);
>   	igt_global_reset_lock(i915);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	reference_lists_init(i915, &lists);
>   
> @@ -870,8 +867,8 @@ live_engine_reset_gt_engine_workarounds(void *arg)
>   
>   err:
>   	reference_lists_fini(i915, &lists);
> -	intel_runtime_pm_put(i915, wakeref);
>   	igt_global_reset_unlock(i915);
> +	i915_gem_park(i915);
>   	kernel_context_close(ctx);
>   
>   	igt_flush_test(i915, I915_WAIT_LOCKED);
> 

* Re: [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex
  2019-04-01 15:22   ` Tvrtko Ursulin
@ 2019-04-01 15:37     ` Chris Wilson
  2019-04-01 15:54       ` Tvrtko Ursulin
  0 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-04-01 15:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-01 16:22:55)
> 
> On 25/03/2019 09:03, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 11803d485275..7c7afe99986c 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2008,7 +2008,8 @@ struct drm_i915_private {
> >               intel_engine_mask_t active_engines;
> >               struct list_head active_rings;
> >               struct list_head closed_vma;
> > -             u32 active_requests;
> > +             atomic_t active_requests;
> > +             struct mutex active_mutex;
> 
> My initial reaction is: why not gem_pm_mutex, to match where the code
> was moved and what the commit message says?

Because we are inheriting the name from active_requests. What if we
rename that to active_count, and maybe pm_wake_count?

pm_active_count?

> > +     mutex_lock(&i915->gt.active_mutex);
> >       if (!work_pending(&i915->gt.idle_work.work) &&
> > -         !i915->gt.active_requests) {
> > -             ++i915->gt.active_requests; /* don't requeue idle */
> > +         !atomic_read(&i915->gt.active_requests)) {
> > +             atomic_inc(&i915->gt.active_requests); /* don't requeue idle */
> 
> atomic_inc_not_zero?

Atomicity of the read & inc is not strictly required; once
active_requests is zero, it cannot be raised without holding
active_mutex.

> > @@ -191,11 +183,29 @@ void i915_gem_unpark(struct drm_i915_private *i915)
> >                          round_jiffies_up_relative(HZ));
> >   }
> >   
> > +void i915_gem_unpark(struct drm_i915_private *i915)
> > +{
> > +     if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
> > +             return;
> 
> This looks wrong - how can it be okay to maybe not increment 
> active_requests on unpark? What am I missing?

If the add succeeds, active_requests was non-zero, and we can skip waking
up the device. If the add fails, active_requests might be zero, so we
take the mutex and check.
 
> I would expect that here you would need
> "atomic_inc_and_return_true_if_OLD_value_was_zero", but I don't think
> there is such an API.
> 
> > +
> > +     mutex_lock(&i915->gt.active_mutex);
> > +     if (!atomic_read(&i915->gt.active_requests)) {
> > +             GEM_TRACE("\n");
> > +             __i915_gem_unpark(i915);
> > +             smp_mb__before_atomic();
> 
> Why is this needed? I have no idea... but I think we want comments on
> all memory barriers.

Because atomic_inc() is not regarded as a strong barrier, we turn it
into one. (In the documentation at least; on x86 it's just a compiler
barrier().)
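
To put the pieces back together, the whole unpark path under discussion
reads roughly as below. A sketch only: the field and helper names follow
the quoted hunks, and GEM_TRACE/error handling are elided.

/*
 * Sketch, not verbatim from the patch. The invariant relied upon:
 * gt.active_requests only transitions from zero to non-zero with
 * gt.active_mutex held.
 */
void i915_gem_unpark(struct drm_i915_private *i915)
{
	/* Fast path: only succeeds while the count is already non-zero. */
	if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
		return;

	/* Slow path: the count may be zero, so recheck under the mutex. */
	mutex_lock(&i915->gt.active_mutex);
	if (!atomic_read(&i915->gt.active_requests)) {
		__i915_gem_unpark(i915);
		/*
		 * atomic_inc() is not a strong barrier; promote it so the
		 * wake-up writes above are ordered before the count becomes
		 * visible as non-zero to the lockless fast path.
		 */
		smp_mb__before_atomic();
	}
	atomic_inc(&i915->gt.active_requests);
	mutex_unlock(&i915->gt.active_mutex);
}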

> > +     }
> > +     atomic_inc(&i915->gt.active_requests);
> > +     mutex_unlock(&i915->gt.active_mutex);
> > +}
> > +
> >   bool i915_gem_load_power_context(struct drm_i915_private *i915)
> >   {
> > +     mutex_lock(&i915->gt.active_mutex);
> > +     atomic_inc(&i915->gt.active_requests);
> 
> Why does this function have to manually manage active_requests? Can it
> be written in a simpler way?

It's so that this function has control over parking.

This is the first request; this is the place where we will make the
explicit calls to set up the powersaving contexts and state. Currently,
it's implicit on the first request.

> >       /* Force loading the kernel context on all engines */
> >       if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> > -             return false;
> > +             goto err_active;
> >   
> >       /*
> >        * Immediately park the GPU so that we enable powersaving and
> > @@ -203,9 +213,20 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915)
> >        * unpark and start using the engine->pinned_default_state, otherwise
> >        * it is in limbo and an early reset may fail.
> >        */
> > +
> > +     if (!atomic_dec_and_test(&i915->gt.active_requests))
> > +             goto err_unlock;
> > +
> >       __i915_gem_park(i915);
> > +     mutex_unlock(&i915->gt.active_mutex);
> >   
> >       return true;
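
Putting the quoted fragments together, the intended flow is roughly the
sketch below. The error-path bodies are not visible in the quote, so the
cleanup shown for them is an assumption.

/* Sketch reassembled from the quoted hunks; not verbatim. */
bool i915_gem_load_power_context(struct drm_i915_private *i915)
{
	/* Take explicit control of parking around the first request. */
	mutex_lock(&i915->gt.active_mutex);
	atomic_inc(&i915->gt.active_requests);

	/* Force loading the kernel context on all engines. */
	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
		goto err_active;

	/*
	 * Immediately park the GPU so that we enable powersaving and the
	 * next request starts from engine->pinned_default_state.
	 */
	if (!atomic_dec_and_test(&i915->gt.active_requests))
		goto err_unlock;

	__i915_gem_park(i915);
	mutex_unlock(&i915->gt.active_mutex);
	return true;

err_active:
	atomic_dec(&i915->gt.active_requests); /* assumed; not in the quote */
err_unlock:
	mutex_unlock(&i915->gt.active_mutex);
	return false;
}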

* Re: [PATCH 03/22] drm/i915: Pull the GEM power management coupling into its own file
  2019-03-25  9:03 ` [PATCH 03/22] drm/i915: Pull the GEM power management coupling into its own file Chris Wilson
  2019-04-01 14:56   ` Tvrtko Ursulin
@ 2019-04-01 15:39   ` Lucas De Marchi
  2019-04-01 15:52     ` Chris Wilson
  1 sibling, 1 reply; 43+ messages in thread
From: Lucas De Marchi @ 2019-04-01 15:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics

On Mon, Mar 25, 2019 at 2:06 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Split out the power management portion (GT wakeref, suspend/resume) of
> GEM from i915_gem.c into its own file.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   2 +
>  drivers/gpu/drm/i915/i915_gem.c               | 335 +----------------
>  drivers/gpu/drm/i915/i915_gem_pm.c            | 341 ++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_pm.h            |  28 ++
>  .../drm/i915/test_i915_gem_pm_standalone.c    |   7 +
>  5 files changed, 381 insertions(+), 332 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.c
>  create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.h
>  create mode 100644 drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 60de05f3fa60..bd1657c3d395 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -61,6 +61,7 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>  i915-$(CONFIG_DRM_I915_WERROR) += \
>         test_i915_active_types_standalone.o \
>         test_i915_gem_context_types_standalone.o \
> +       test_i915_gem_pm_standalone.o \
>         test_i915_timeline_types_standalone.o \
>         test_intel_context_types_standalone.o \
>         test_intel_engine_types_standalone.o \
> @@ -81,6 +82,7 @@ i915-y += \
>           i915_gem_internal.o \
>           i915_gem.o \
>           i915_gem_object.o \
> +         i915_gem_pm.o \
>           i915_gem_render_state.o \
>           i915_gem_shrinker.o \
>           i915_gem_stolen.o \
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f6cdd5fb9deb..47c672432594 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -42,7 +42,7 @@
>  #include "i915_drv.h"
>  #include "i915_gem_clflush.h"
>  #include "i915_gemfs.h"
> -#include "i915_globals.h"
> +#include "i915_gem_pm.h"
>  #include "i915_reset.h"
>  #include "i915_trace.h"
>  #include "i915_vgpu.h"
> @@ -101,105 +101,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
>         spin_unlock(&dev_priv->mm.object_stat_lock);
>  }
>
> -static void __i915_gem_park(struct drm_i915_private *i915)
> -{
> -       intel_wakeref_t wakeref;
> -
> -       GEM_TRACE("\n");
> -
> -       lockdep_assert_held(&i915->drm.struct_mutex);
> -       GEM_BUG_ON(i915->gt.active_requests);
> -       GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> -
> -       if (!i915->gt.awake)
> -               return;
> -
> -       /*
> -        * Be paranoid and flush a concurrent interrupt to make sure
> -        * we don't reactivate any irq tasklets after parking.
> -        *
> -        * FIXME: Note that even though we have waited for execlists to be idle,
> -        * there may still be an in-flight interrupt even though the CSB
> -        * is now empty. synchronize_irq() makes sure that a residual interrupt
> -        * is completed before we continue, but it doesn't prevent the HW from
> -        * raising a spurious interrupt later. To complete the shield we should
> -        * coordinate disabling the CS irq with flushing the interrupts.
> -        */
> -       synchronize_irq(i915->drm.irq);
> -
> -       intel_engines_park(i915);
> -       i915_timelines_park(i915);
> -
> -       i915_pmu_gt_parked(i915);
> -       i915_vma_parked(i915);
> -
> -       wakeref = fetch_and_zero(&i915->gt.awake);
> -       GEM_BUG_ON(!wakeref);
> -
> -       if (INTEL_GEN(i915) >= 6)
> -               gen6_rps_idle(i915);
> -
> -       intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> -
> -       i915_globals_park();
> -}
> -
> -void i915_gem_park(struct drm_i915_private *i915)
> -{
> -       GEM_TRACE("\n");
> -
> -       lockdep_assert_held(&i915->drm.struct_mutex);
> -       GEM_BUG_ON(i915->gt.active_requests);
> -
> -       if (!i915->gt.awake)
> -               return;
> -
> -       /* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> -       mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> -}
> -
> -void i915_gem_unpark(struct drm_i915_private *i915)
> -{
> -       GEM_TRACE("\n");
> -
> -       lockdep_assert_held(&i915->drm.struct_mutex);
> -       GEM_BUG_ON(!i915->gt.active_requests);
> -       assert_rpm_wakelock_held(i915);
> -
> -       if (i915->gt.awake)
> -               return;
> -
> -       /*
> -        * It seems that the DMC likes to transition between the DC states a lot
> -        * when there are no connected displays (no active power domains) during
> -        * command submission.
> -        *
> -        * This activity has negative impact on the performance of the chip with
> -        * huge latencies observed in the interrupt handler and elsewhere.
> -        *
> -        * Work around it by grabbing a GT IRQ power domain whilst there is any
> -        * GT activity, preventing any DC state transitions.
> -        */
> -       i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> -       GEM_BUG_ON(!i915->gt.awake);
> -
> -       i915_globals_unpark();
> -
> -       intel_enable_gt_powersave(i915);
> -       i915_update_gfx_val(i915);
> -       if (INTEL_GEN(i915) >= 6)
> -               gen6_rps_busy(i915);
> -       i915_pmu_gt_unparked(i915);
> -
> -       intel_engines_unpark(i915);
> -
> -       i915_queue_hangcheck(i915);
> -
> -       queue_delayed_work(i915->wq,
> -                          &i915->gt.retire_work,
> -                          round_jiffies_up_relative(HZ));
> -}
> -
>  int
>  i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>                             struct drm_file *file)
> @@ -2874,108 +2775,6 @@ i915_gem_retire_work_handler(struct work_struct *work)
>                                    round_jiffies_up_relative(HZ));
>  }
>
> -static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> -                                         unsigned long mask)
> -{
> -       bool result = true;
> -
> -       /*
> -        * Even if we fail to switch, give whatever is running a small chance
> -        * to save itself before we report the failure. Yes, this may be a
> -        * false positive due to e.g. ENOMEM, caveat emptor!
> -        */
> -       if (i915_gem_switch_to_kernel_context(i915, mask))
> -               result = false;
> -
> -       if (i915_gem_wait_for_idle(i915,
> -                                  I915_WAIT_LOCKED |
> -                                  I915_WAIT_FOR_IDLE_BOOST,
> -                                  I915_GEM_IDLE_TIMEOUT))
> -               result = false;
> -
> -       if (!result) {
> -               if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> -                       dev_err(i915->drm.dev,
> -                               "Failed to idle engines, declaring wedged!\n");
> -                       GEM_TRACE_DUMP();
> -               }
> -
> -               /* Forcibly cancel outstanding work and leave the gpu quiet. */
> -               i915_gem_set_wedged(i915);
> -       }
> -
> -       i915_retire_requests(i915); /* ensure we flush after wedging */
> -       return result;
> -}
> -
> -static bool load_power_context(struct drm_i915_private *i915)
> -{
> -       /* Force loading the kernel context on all engines */
> -       if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> -               return false;
> -
> -       /*
> -        * Immediately park the GPU so that we enable powersaving and
> -        * treat it as idle. The next time we issue a request, we will
> -        * unpark and start using the engine->pinned_default_state, otherwise
> -        * it is in limbo and an early reset may fail.
> -        */
> -       __i915_gem_park(i915);
> -
> -       return true;
> -}
> -
> -static void
> -i915_gem_idle_work_handler(struct work_struct *work)
> -{
> -       struct drm_i915_private *i915 =
> -               container_of(work, typeof(*i915), gt.idle_work.work);
> -       bool rearm_hangcheck;
> -
> -       if (!READ_ONCE(i915->gt.awake))
> -               return;
> -
> -       if (READ_ONCE(i915->gt.active_requests))
> -               return;
> -
> -       rearm_hangcheck =
> -               cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> -
> -       if (!mutex_trylock(&i915->drm.struct_mutex)) {
> -               /* Currently busy, come back later */
> -               mod_delayed_work(i915->wq,
> -                                &i915->gt.idle_work,
> -                                msecs_to_jiffies(50));
> -               goto out_rearm;
> -       }
> -
> -       /*
> -        * Flush out the last user context, leaving only the pinned
> -        * kernel context resident. Should anything unfortunate happen
> -        * while we are idle (such as the GPU being power cycled), no users
> -        * will be harmed.
> -        */
> -       if (!work_pending(&i915->gt.idle_work.work) &&
> -           !i915->gt.active_requests) {
> -               ++i915->gt.active_requests; /* don't requeue idle */
> -
> -               switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -               if (!--i915->gt.active_requests) {
> -                       __i915_gem_park(i915);
> -                       rearm_hangcheck = false;
> -               }
> -       }
> -
> -       mutex_unlock(&i915->drm.struct_mutex);
> -
> -out_rearm:
> -       if (rearm_hangcheck) {
> -               GEM_BUG_ON(!i915->gt.awake);
> -               i915_queue_hangcheck(i915);
> -       }
> -}
> -
>  void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
>  {
>         struct drm_i915_private *i915 = to_i915(gem->dev);
> @@ -4390,133 +4189,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
>         mutex_unlock(&i915->drm.struct_mutex);
>  }
>
> -void i915_gem_suspend(struct drm_i915_private *i915)
> -{
> -       intel_wakeref_t wakeref;
> -
> -       GEM_TRACE("\n");
> -
> -       wakeref = intel_runtime_pm_get(i915);
> -
> -       flush_workqueue(i915->wq);
> -
> -       mutex_lock(&i915->drm.struct_mutex);
> -
> -       /*
> -        * We have to flush all the executing contexts to main memory so
> -        * that they can saved in the hibernation image. To ensure the last
> -        * context image is coherent, we have to switch away from it. That
> -        * leaves the i915->kernel_context still active when
> -        * we actually suspend, and its image in memory may not match the GPU
> -        * state. Fortunately, the kernel_context is disposable and we do
> -        * not rely on its state.
> -        */
> -       switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -       mutex_unlock(&i915->drm.struct_mutex);
> -       i915_reset_flush(i915);
> -
> -       drain_delayed_work(&i915->gt.retire_work);
> -
> -       /*
> -        * As the idle_work is rearming if it detects a race, play safe and
> -        * repeat the flush until it is definitely idle.
> -        */
> -       drain_delayed_work(&i915->gt.idle_work);
> -
> -       /*
> -        * Assert that we successfully flushed all the work and
> -        * reset the GPU back to its idle, low power state.
> -        */
> -       GEM_BUG_ON(i915->gt.awake);
> -
> -       intel_uc_suspend(i915);
> -
> -       intel_runtime_pm_put(i915, wakeref);
> -}
> -
> -void i915_gem_suspend_late(struct drm_i915_private *i915)
> -{
> -       struct drm_i915_gem_object *obj;
> -       struct list_head *phases[] = {
> -               &i915->mm.unbound_list,
> -               &i915->mm.bound_list,
> -               NULL
> -       }, **phase;
> -
> -       /*
> -        * Neither the BIOS, ourselves or any other kernel
> -        * expects the system to be in execlists mode on startup,
> -        * so we need to reset the GPU back to legacy mode. And the only
> -        * known way to disable logical contexts is through a GPU reset.
> -        *
> -        * So in order to leave the system in a known default configuration,
> -        * always reset the GPU upon unload and suspend. Afterwards we then
> -        * clean up the GEM state tracking, flushing off the requests and
> -        * leaving the system in a known idle state.
> -        *
> -        * Note that is of the upmost importance that the GPU is idle and
> -        * all stray writes are flushed *before* we dismantle the backing
> -        * storage for the pinned objects.
> -        *
> -        * However, since we are uncertain that resetting the GPU on older
> -        * machines is a good idea, we don't - just in case it leaves the
> -        * machine in an unusable condition.
> -        */
> -
> -       mutex_lock(&i915->drm.struct_mutex);
> -       for (phase = phases; *phase; phase++) {
> -               list_for_each_entry(obj, *phase, mm.link)
> -                       WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> -       }
> -       mutex_unlock(&i915->drm.struct_mutex);
> -
> -       intel_uc_sanitize(i915);
> -       i915_gem_sanitize(i915);
> -}
> -
> -void i915_gem_resume(struct drm_i915_private *i915)
> -{
> -       GEM_TRACE("\n");
> -
> -       WARN_ON(i915->gt.awake);
> -
> -       mutex_lock(&i915->drm.struct_mutex);
> -       intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> -
> -       i915_gem_restore_gtt_mappings(i915);
> -       i915_gem_restore_fences(i915);
> -
> -       /*
> -        * As we didn't flush the kernel context before suspend, we cannot
> -        * guarantee that the context image is complete. So let's just reset
> -        * it and start again.
> -        */
> -       i915->gt.resume(i915);
> -
> -       if (i915_gem_init_hw(i915))
> -               goto err_wedged;
> -
> -       intel_uc_resume(i915);
> -
> -       /* Always reload a context for powersaving. */
> -       if (!load_power_context(i915))
> -               goto err_wedged;
> -
> -out_unlock:
> -       intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> -       mutex_unlock(&i915->drm.struct_mutex);
> -       return;
> -
> -err_wedged:
> -       if (!i915_reset_failed(i915)) {
> -               dev_err(i915->drm.dev,
> -                       "Failed to re-initialize GPU, declaring it wedged!\n");
> -               i915_gem_set_wedged(i915);
> -       }
> -       goto out_unlock;
> -}
> -
>  void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
>  {
>         if (INTEL_GEN(dev_priv) < 5 ||
> @@ -4699,7 +4371,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>         }
>
>         /* Flush the default context image to memory, and enable powersaving. */
> -       if (!load_power_context(i915)) {
> +       if (!i915_gem_load_power_context(i915)) {
>                 err = -EIO;
>                 goto err_active;
>         }
> @@ -5096,11 +4768,10 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
>         INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
>
>         i915_gem_init__mm(dev_priv);
> +       i915_gem_init__pm(dev_priv);
>
>         INIT_DELAYED_WORK(&dev_priv->gt.retire_work,
>                           i915_gem_retire_work_handler);
> -       INIT_DELAYED_WORK(&dev_priv->gt.idle_work,
> -                         i915_gem_idle_work_handler);
>         init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
>         init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
>         mutex_init(&dev_priv->gpu_error.wedge_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> new file mode 100644
> index 000000000000..faa4eb42ec0a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -0,0 +1,341 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "i915_gem_pm.h"
> +#include "i915_globals.h"
> +
> +static void __i915_gem_park(struct drm_i915_private *i915)
> +{
> +       intel_wakeref_t wakeref;
> +
> +       GEM_TRACE("\n");
> +
> +       lockdep_assert_held(&i915->drm.struct_mutex);
> +       GEM_BUG_ON(i915->gt.active_requests);
> +       GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> +
> +       if (!i915->gt.awake)
> +               return;
> +
> +       /*
> +        * Be paranoid and flush a concurrent interrupt to make sure
> +        * we don't reactivate any irq tasklets after parking.
> +        *
> +        * FIXME: Note that even though we have waited for execlists to be idle,
> +        * there may still be an in-flight interrupt even though the CSB
> +        * is now empty. synchronize_irq() makes sure that a residual interrupt
> +        * is completed before we continue, but it doesn't prevent the HW from
> +        * raising a spurious interrupt later. To complete the shield we should
> +        * coordinate disabling the CS irq with flushing the interrupts.
> +        */
> +       synchronize_irq(i915->drm.irq);
> +
> +       intel_engines_park(i915);
> +       i915_timelines_park(i915);
> +
> +       i915_pmu_gt_parked(i915);
> +       i915_vma_parked(i915);
> +
> +       wakeref = fetch_and_zero(&i915->gt.awake);
> +       GEM_BUG_ON(!wakeref);
> +
> +       if (INTEL_GEN(i915) >= 6)
> +               gen6_rps_idle(i915);
> +
> +       intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> +
> +       i915_globals_park();
> +}
> +
> +static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> +                                         unsigned long mask)
> +{
> +       bool result = true;
> +
> +       /*
> +        * Even if we fail to switch, give whatever is running a small chance
> +        * to save itself before we report the failure. Yes, this may be a
> +        * false positive due to e.g. ENOMEM, caveat emptor!
> +        */
> +       if (i915_gem_switch_to_kernel_context(i915, mask))
> +               result = false;
> +
> +       if (i915_gem_wait_for_idle(i915,
> +                                  I915_WAIT_LOCKED |
> +                                  I915_WAIT_FOR_IDLE_BOOST,
> +                                  I915_GEM_IDLE_TIMEOUT))
> +               result = false;
> +
> +       if (!result) {
> +               if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> +                       dev_err(i915->drm.dev,
> +                               "Failed to idle engines, declaring wedged!\n");
> +                       GEM_TRACE_DUMP();
> +               }
> +
> +               /* Forcibly cancel outstanding work and leave the gpu quiet. */
> +               i915_gem_set_wedged(i915);
> +       }
> +
> +       i915_retire_requests(i915); /* ensure we flush after wedging */
> +       return result;
> +}
> +
> +static void idle_work_handler(struct work_struct *work)
> +{
> +       struct drm_i915_private *i915 =
> +               container_of(work, typeof(*i915), gt.idle_work.work);
> +       bool rearm_hangcheck;
> +
> +       if (!READ_ONCE(i915->gt.awake))
> +               return;
> +
> +       if (READ_ONCE(i915->gt.active_requests))
> +               return;
> +
> +       rearm_hangcheck =
> +               cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> +
> +       if (!mutex_trylock(&i915->drm.struct_mutex)) {
> +               /* Currently busy, come back later */
> +               mod_delayed_work(i915->wq,
> +                                &i915->gt.idle_work,
> +                                msecs_to_jiffies(50));
> +               goto out_rearm;
> +       }
> +
> +       /*
> +        * Flush out the last user context, leaving only the pinned
> +        * kernel context resident. Should anything unfortunate happen
> +        * while we are idle (such as the GPU being power cycled), no users
> +        * will be harmed.
> +        */
> +       if (!work_pending(&i915->gt.idle_work.work) &&
> +           !i915->gt.active_requests) {
> +               ++i915->gt.active_requests; /* don't requeue idle */
> +
> +               switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +               if (!--i915->gt.active_requests) {
> +                       __i915_gem_park(i915);
> +                       rearm_hangcheck = false;
> +               }
> +       }
> +
> +       mutex_unlock(&i915->drm.struct_mutex);
> +
> +out_rearm:
> +       if (rearm_hangcheck) {
> +               GEM_BUG_ON(!i915->gt.awake);
> +               i915_queue_hangcheck(i915);
> +       }
> +}
> +
> +void i915_gem_park(struct drm_i915_private *i915)
> +{
> +       GEM_TRACE("\n");
> +
> +       lockdep_assert_held(&i915->drm.struct_mutex);
> +       GEM_BUG_ON(i915->gt.active_requests);
> +
> +       if (!i915->gt.awake)
> +               return;
> +
> +       /* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> +       mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> +}
> +
> +void i915_gem_unpark(struct drm_i915_private *i915)
> +{
> +       GEM_TRACE("\n");
> +
> +       lockdep_assert_held(&i915->drm.struct_mutex);
> +       GEM_BUG_ON(!i915->gt.active_requests);
> +       assert_rpm_wakelock_held(i915);
> +
> +       if (i915->gt.awake)
> +               return;
> +
> +       /*
> +        * It seems that the DMC likes to transition between the DC states a lot
> +        * when there are no connected displays (no active power domains) during
> +        * command submission.
> +        *
> +        * This activity has negative impact on the performance of the chip with
> +        * huge latencies observed in the interrupt handler and elsewhere.
> +        *
> +        * Work around it by grabbing a GT IRQ power domain whilst there is any
> +        * GT activity, preventing any DC state transitions.
> +        */
> +       i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +       GEM_BUG_ON(!i915->gt.awake);
> +
> +       i915_globals_unpark();
> +
> +       intel_enable_gt_powersave(i915);
> +       i915_update_gfx_val(i915);
> +       if (INTEL_GEN(i915) >= 6)
> +               gen6_rps_busy(i915);
> +       i915_pmu_gt_unparked(i915);
> +
> +       intel_engines_unpark(i915);
> +
> +       i915_queue_hangcheck(i915);
> +
> +       queue_delayed_work(i915->wq,
> +                          &i915->gt.retire_work,
> +                          round_jiffies_up_relative(HZ));
> +}
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915)
> +{
> +       /* Force loading the kernel context on all engines */
> +       if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> +               return false;
> +
> +       /*
> +        * Immediately park the GPU so that we enable powersaving and
> +        * treat it as idle. The next time we issue a request, we will
> +        * unpark and start using the engine->pinned_default_state, otherwise
> +        * it is in limbo and an early reset may fail.
> +        */
> +       __i915_gem_park(i915);
> +
> +       return true;
> +}
> +
> +void i915_gem_suspend(struct drm_i915_private *i915)
> +{
> +       intel_wakeref_t wakeref;
> +
> +       GEM_TRACE("\n");
> +
> +       wakeref = intel_runtime_pm_get(i915);
> +
> +       flush_workqueue(i915->wq);
> +
> +       mutex_lock(&i915->drm.struct_mutex);
> +
> +       /*
> +        * We have to flush all the executing contexts to main memory so
> +        * that they can saved in the hibernation image. To ensure the last
> +        * context image is coherent, we have to switch away from it. That
> +        * leaves the i915->kernel_context still active when
> +        * we actually suspend, and its image in memory may not match the GPU
> +        * state. Fortunately, the kernel_context is disposable and we do
> +        * not rely on its state.
> +        */
> +       switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +       mutex_unlock(&i915->drm.struct_mutex);
> +       i915_reset_flush(i915);
> +
> +       drain_delayed_work(&i915->gt.retire_work);
> +
> +       /*
> +        * As the idle_work is rearming if it detects a race, play safe and
> +        * repeat the flush until it is definitely idle.
> +        */
> +       drain_delayed_work(&i915->gt.idle_work);
> +
> +       /*
> +        * Assert that we successfully flushed all the work and
> +        * reset the GPU back to its idle, low power state.
> +        */
> +       GEM_BUG_ON(i915->gt.awake);
> +
> +       intel_uc_suspend(i915);
> +
> +       intel_runtime_pm_put(i915, wakeref);
> +}
> +
> +void i915_gem_suspend_late(struct drm_i915_private *i915)
> +{
> +       struct drm_i915_gem_object *obj;
> +       struct list_head *phases[] = {
> +               &i915->mm.unbound_list,
> +               &i915->mm.bound_list,
> +               NULL
> +       }, **phase;
> +
> +       /*
> +        * Neither the BIOS, ourselves or any other kernel
> +        * expects the system to be in execlists mode on startup,
> +        * so we need to reset the GPU back to legacy mode. And the only
> +        * known way to disable logical contexts is through a GPU reset.
> +        *
> +        * So in order to leave the system in a known default configuration,
> +        * always reset the GPU upon unload and suspend. Afterwards we then
> +        * clean up the GEM state tracking, flushing off the requests and
> +        * leaving the system in a known idle state.
> +        *
> +        * Note that is of the upmost importance that the GPU is idle and
> +        * all stray writes are flushed *before* we dismantle the backing
> +        * storage for the pinned objects.
> +        *
> +        * However, since we are uncertain that resetting the GPU on older
> +        * machines is a good idea, we don't - just in case it leaves the
> +        * machine in an unusable condition.
> +        */
> +
> +       mutex_lock(&i915->drm.struct_mutex);
> +       for (phase = phases; *phase; phase++) {
> +               list_for_each_entry(obj, *phase, mm.link)
> +                       WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> +       }
> +       mutex_unlock(&i915->drm.struct_mutex);
> +
> +       intel_uc_sanitize(i915);
> +       i915_gem_sanitize(i915);
> +}
> +
> +void i915_gem_resume(struct drm_i915_private *i915)
> +{
> +       GEM_TRACE("\n");
> +
> +       WARN_ON(i915->gt.awake);
> +
> +       mutex_lock(&i915->drm.struct_mutex);
> +       intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> +
> +       i915_gem_restore_gtt_mappings(i915);
> +       i915_gem_restore_fences(i915);
> +
> +       /*
> +        * As we didn't flush the kernel context before suspend, we cannot
> +        * guarantee that the context image is complete. So let's just reset
> +        * it and start again.
> +        */
> +       i915->gt.resume(i915);
> +
> +       if (i915_gem_init_hw(i915))
> +               goto err_wedged;
> +
> +       intel_uc_resume(i915);
> +
> +       /* Always reload a context for powersaving. */
> +       if (!i915_gem_load_power_context(i915))
> +               goto err_wedged;
> +
> +out_unlock:
> +       intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> +       mutex_unlock(&i915->drm.struct_mutex);
> +       return;
> +
> +err_wedged:
> +       if (!i915_reset_failed(i915)) {
> +               dev_err(i915->drm.dev,
> +                       "Failed to re-initialize GPU, declaring it wedged!\n");
> +               i915_gem_set_wedged(i915);
> +       }
> +       goto out_unlock;
> +}
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915)
> +{
> +       INIT_DELAYED_WORK(&i915->gt.idle_work, idle_work_handler);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
> new file mode 100644
> index 000000000000..52f65e3f06b5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.h
> @@ -0,0 +1,28 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_PM_H__
> +#define __I915_GEM_PM_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_private;
> +struct work_struct;
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915);
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915);
> +void i915_gem_resume(struct drm_i915_private *i915);
> +
> +void i915_gem_unpark(struct drm_i915_private *i915);
> +void i915_gem_park(struct drm_i915_private *i915);
> +
> +void i915_gem_idle_work_handler(struct work_struct *work);
> +
> +void i915_gem_suspend(struct drm_i915_private *i915);
> +void i915_gem_suspend_late(struct drm_i915_private *i915);
> +
> +#endif /* __I915_GEM_PM_H__ */

I really like the use of small headers like this. We could do more in
other areas, too. One thing to consider is whether we should settle on
an intel_ or i915_ prefix. In the display area we tend to use intel_.

Lucas De Marchi

> diff --git a/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c b/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
> new file mode 100644
> index 000000000000..3524e471b46b
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/test_i915_gem_pm_standalone.c
> @@ -0,0 +1,7 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_gem_pm.h"
> --
> 2.20.1
>



-- 
Lucas De Marchi

* Re: [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref
  2019-04-01 15:34   ` Tvrtko Ursulin
@ 2019-04-01 15:44     ` Chris Wilson
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-04-01 15:44 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-01 16:34:11)
> 
> On 25/03/2019 09:03, Chris Wilson wrote:
> > Transition from calling the lower level intel_runtime_pm functions to
> > using the GEM runtime_pm functions (i915_gem_unpark, i915_gem_park) now
> > that they are decoupled from struct_mutex. This has the small advantage
> > of reducing our overhead for request emission and ensuring that GEM
> > state is locked awake during the tests (to reduce interference).
> 
> Too tedious to read in detail. Actually, not purely tedious, but the
> inversion of get and unpark (positive and negative) is just constantly
> hard to read.
> 
> Otherwise there are some aspects of this I like - such as more
> explicitly controlling the GEM/GT readiness - and some which I don't,
> like: a) churn, b) reversing the recommendation so far to grab rpm over
> the smallest section, and c) i915_gem_unpark was okay as a name in the
> old world, but if this is now a central API to wake up the device I am
> not so crazy about the unpark name.

(c) I'll take suggestions, as I really don't like having to unpark
first. I'm tempted by i915_gem_runtime_pm_get(), although that may be so
close to intel_runtime_pm_get() that any difference in semantics will
get confusing.

For (b), I'll just say it's now a choice of how long you want to hold
the named wakeref around request emission. You can push that down to the
request alloc, but the intention is to change a lot of these callsites
to require explicit control of the GEM wakeref so that we can use a
simpler (simpler only in code, because it requires more work by the
caller) i915_request_create.
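
Roughly, a converted callsite would then look like this (sketch only,
assuming the caller owns both the context pin and the GEM wakeref):

	i915_gem_unpark(i915);		/* caller takes the GEM wakeref */

	rq = i915_request_create(ce);	/* ce pinned by the caller */
	if (!IS_ERR(rq))
		i915_request_add(rq);

	i915_gem_park(i915);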
-Chris

* Re: [PATCH 03/22] drm/i915: Pull the GEM powermangement coupling into its own file
  2019-04-01 15:39   ` Lucas De Marchi
@ 2019-04-01 15:52     ` Chris Wilson
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-04-01 15:52 UTC (permalink / raw)
  To: Lucas De Marchi; +Cc: Intel Graphics

Quoting Lucas De Marchi (2019-04-01 16:39:29)
> On Mon, Mar 25, 2019 at 2:06 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> I really like the use of small headers like this. We could do more on
> other areas, too. One thing to consider is if we should settle on
> intel_ or i915_ prefix. On display area we tend to use intel_.

I classed this as part of the i915 GEM uapi-facing code, hence i915_gem.
The principal caller of the wakeref pm is i915_gem_execbuffer, but it
looked a bit lonely by itself so I added the suspend/runtime code.

However, this code is a bit mixed in terms of functionality between
high-level GEM ops and lower-level interaction with the HW, so I expect
some refinement in the future. The low-level stuff will take an intel_foo
prefix as it principally interacts with the HW (with i915_gem callers for
the most part, and with i915_shiny callers if we house a new uapi).
-Chris

* Re: [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex
  2019-04-01 15:37     ` Chris Wilson
@ 2019-04-01 15:54       ` Tvrtko Ursulin
  2019-04-01 16:38         ` Chris Wilson
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-04-01 15:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 01/04/2019 16:37, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-01 16:22:55)
>>
>> On 25/03/2019 09:03, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 11803d485275..7c7afe99986c 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2008,7 +2008,8 @@ struct drm_i915_private {
>>>                intel_engine_mask_t active_engines;
>>>                struct list_head active_rings;
>>>                struct list_head closed_vma;
>>> -             u32 active_requests;
>>> +             atomic_t active_requests;
>>> +             struct mutex active_mutex;
>>
>> My initial reaction is why not gem_pm_mutex to match where code was
>> moved and what the commit message says?
> 
> Because we are inheriting the name from active_requests. If we rename
> that to active_count, and maybe pm_wake_count?
> 
> pm_active_count?

pm_wake_count sounds good from the PM angle. But it's then wrong for all 
places which query active requests. Don't know.

> 
>>> +     mutex_lock(&i915->gt.active_mutex);
>>>        if (!work_pending(&i915->gt.idle_work.work) &&
>>> -         !i915->gt.active_requests) {
>>> -             ++i915->gt.active_requests; /* don't requeue idle */
>>> +         !atomic_read(&i915->gt.active_requests)) {
>>> +             atomic_inc(&i915->gt.active_requests); /* don't requeue idle */
>>
>> atomic_inc_not_zero?
> 
> Atomicity of the read & inc is not strictly required: once
> active_requests is zero, it cannot be raised without holding
> active_mutex.

Yeah my bad. atomic_inc_not_zero would be wrong, it is the opposite. You 
would need atomic_inc_if_zero here.

> 
>>> @@ -191,11 +183,29 @@ void i915_gem_unpark(struct drm_i915_private *i915)
>>>                           round_jiffies_up_relative(HZ));
>>>    }
>>>    
>>> +void i915_gem_unpark(struct drm_i915_private *i915)
>>> +{
>>> +     if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
>>> +             return;
>>
>> This looks wrong - how can it be okay to maybe not increment
>> active_requests on unpark? What am I missing?
> 
> If the add succeeds, active_requests was non-zero, and we can skip waking
> up the device. If the add fails, active_requests might be zero, so we
> take the mutex and check.

True true.. so this one is atomic_inc_not_zero! :)

>> I would expect here you would need
>> "atomic_inc_and_return_true_if_OLD_value_was_zero" but I don't think
>> there is such API.
>>
>>> +
>>> +     mutex_lock(&i915->gt.active_mutex);
>>> +     if (!atomic_read(&i915->gt.active_requests)) {
>>> +             GEM_TRACE("\n");
>>> +             __i915_gem_unpark(i915);
>>> +             smp_mb__before_atomic();
>>
>> Why is this needed? I have no idea.. but I think we want comments with
>> all memory barriers.
> 
> Because atomic_inc() is not regarded as a strong barrier, we turn it
> into one. (In the documentation at least; on x86 it's just a compiler
> barrier().)

Document why we need it please.

> 
>>> +     }
>>> +     atomic_inc(&i915->gt.active_requests);
>>> +     mutex_unlock(&i915->gt.active_mutex);
>>> +}
>>> +
>>>    bool i915_gem_load_power_context(struct drm_i915_private *i915)
>>>    {
>>> +     mutex_lock(&i915->gt.active_mutex);
>>> +     atomic_inc(&i915->gt.active_requests);
>>
>> Why this function has to manually manage active_requests? Can it be
>> written in a simpler way?
> 
> It's so that this function has control over parking.
> 
> This is the first request, this is the place where we will make the
> explicit calls to set up the powersaving contexts and state -- currently,
> it's implicit on the first request.

Why does it need control over parking? Does it need a comment perhaps? :)

Regards,

Tvrtko

>>>        /* Force loading the kernel context on all engines */
>>>        if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
>>> -             return false;
>>> +             goto err_active;
>>>    
>>>        /*
>>>         * Immediately park the GPU so that we enable powersaving and
>>> @@ -203,9 +213,20 @@ bool i915_gem_load_power_context(struct drm_i915_private *i915)
>>>         * unpark and start using the engine->pinned_default_state, otherwise
>>>         * it is in limbo and an early reset may fail.
>>>         */
>>> +
>>> +     if (!atomic_dec_and_test(&i915->gt.active_requests))
>>> +             goto err_unlock;
>>> +
>>>        __i915_gem_park(i915);
>>> +     mutex_unlock(&i915->gt.active_mutex);
>>>    
>>>        return true;
> 

* Re: [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex
  2019-04-01 15:54       ` Tvrtko Ursulin
@ 2019-04-01 16:38         ` Chris Wilson
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-04-01 16:38 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-01 16:54:53)
> 
> On 01/04/2019 16:37, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-04-01 16:22:55)
> >>
> >> On 25/03/2019 09:03, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 11803d485275..7c7afe99986c 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -2008,7 +2008,8 @@ struct drm_i915_private {
> >>>                intel_engine_mask_t active_engines;
> >>>                struct list_head active_rings;
> >>>                struct list_head closed_vma;
> >>> -             u32 active_requests;
> >>> +             atomic_t active_requests;
> >>> +             struct mutex active_mutex;
> >>
> >> My initial reaction is why not gem_pm_mutex to match where code was
> >> moved and what the commit message says?
> > 
> > Because we are inheriting the name from active_requests. If we rename
> > that to active_count, and maybe pm_wake_count?
> > 
> > pm_active_count?
> 
> pm_wake_count sounds good from the PM angle. But it's then wrong for all 
> places which query active requests. Don't know.
> 
> > 
> >>> +     mutex_lock(&i915->gt.active_mutex);
> >>>        if (!work_pending(&i915->gt.idle_work.work) &&
> >>> -         !i915->gt.active_requests) {
> >>> -             ++i915->gt.active_requests; /* don't requeue idle */
> >>> +         !atomic_read(&i915->gt.active_requests)) {
> >>> +             atomic_inc(&i915->gt.active_requests); /* don't requeue idle */
> >>
> >> atomic_inc_not_zero?
> > 
> > Atomicity of the read & inc is not strictly required: once
> > active_requests is zero, it cannot be raised without holding
> > active_mutex.
> 
> Yeah my bad. atomic_inc_not_zero would be wrong, it is the opposite. You 
> would need atomic_inc_if_zero here.
> 
> > 
> >>> @@ -191,11 +183,29 @@ void i915_gem_unpark(struct drm_i915_private *i915)
> >>>                           round_jiffies_up_relative(HZ));
> >>>    }
> >>>    
> >>> +void i915_gem_unpark(struct drm_i915_private *i915)
> >>> +{
> >>> +     if (atomic_add_unless(&i915->gt.active_requests, 1, 0))
> >>> +             return;
> >>
> >> This looks wrong - how can it be okay to maybe not increment
> >> active_requests on unpark? What am I missing?
> > 
> > If the add succeeds, active_requests was non-zero, and we can skip waking
> > up the device. If the add fails, active_requests might be zero, so we
> > take the mutex and check.
> 
> True true.. so this one is atomic_inc_not_zero! :)

I had forgotten about that one!
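
i.e. the fast path could be spelt (sketch):

	if (atomic_inc_not_zero(&i915->gt.active_requests))
		return;

which is just atomic_add_unless(&i915->gt.active_requests, 1, 0) under a
more descriptive name.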

> >> I would expect here you would need
> >> "atomic_inc_and_return_true_if_OLD_value_was_zero" but I don't think
> >> there is such API.
> >>
> >>> +
> >>> +     mutex_lock(&i915->gt.active_mutex);
> >>> +     if (!atomic_read(&i915->gt.active_requests)) {
> >>> +             GEM_TRACE("\n");
> >>> +             __i915_gem_unpark(i915);
> >>> +             smp_mb__before_atomic();
> >>
> >> Why is this needed? I have no idea.. but I think we want comments with
> >> all memory barriers.
> > 
> > Because atomic_inc() is not regarded as a strong barrier, we turn it
> > into one. (In the documentation at least; on x86 it's just a compiler
> > barrier().)
> 
> Document why we need it please.

Imagine if there was an atomic_inc_unlock() or atomic_inc_release(). I
suppose atomic_add_return_release(), with the caveat that we only need
the release if we actually did the deed.
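
Something along those lines, say (hypothetical spelling; note the release
would then apply even on the already-awake path, hence the caveat):

	mutex_lock(&i915->gt.active_mutex);
	if (!atomic_read(&i915->gt.active_requests))
		__i915_gem_unpark(i915);
	/* release: publish the unpark before the count becomes visible */
	atomic_add_return_release(1, &i915->gt.active_requests);
	mutex_unlock(&i915->gt.active_mutex);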

> >>> +     }
> >>> +     atomic_inc(&i915->gt.active_requests);
> >>> +     mutex_unlock(&i915->gt.active_mutex);
> >>> +}
> >>> +
> >>>    bool i915_gem_load_power_context(struct drm_i915_private *i915)
> >>>    {
> >>> +     mutex_lock(&i915->gt.active_mutex);
> >>> +     atomic_inc(&i915->gt.active_requests);
> >>
> >> Why this function has to manually manage active_requests? Can it be
> >> written in a simpler way?
> > 
> > It's so that this function has control over parking.
> > 
> > This is the first request, this is the place where we will make the
> > explicit calls to set up the powersaving contexts and state -- currently,
> > it's implicit on the first request.
> 
> Why does it need control over parking? Does it need a comment perhaps? :)

There is a comment! See "Immediately park the GPU..."

> >>>        /* Force loading the kernel context on all engines */
> >>>        if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> >>> -             return false;
> >>> +             goto err_active;
> >>>    
> >>>        /*
> >>>         * Immediately park the GPU so that we enable powersaving and
-Chris

* Re: [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create()
  2019-03-25  9:03 ` [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
@ 2019-04-02 13:17   ` Tvrtko Ursulin
  2019-04-03  7:54     ` Chris Wilson
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-04-02 13:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> Start acquiring the logical intel_context and using that as our primary
> means for request allocation. This is the initial step to allow us to
> avoid requiring struct_mutex for request allocation along the
> perma-pinned kernel context, but it also provides a foundation for
> breaking up the complex request allocation to handle different scenarios
> inside execbuf.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c       |  20 +--
>   drivers/gpu/drm/i915/i915_perf.c              |  22 ++--
>   drivers/gpu/drm/i915/i915_request.c           | 118 ++++++++++--------
>   drivers/gpu/drm/i915/i915_request.h           |   3 +
>   drivers/gpu/drm/i915/i915_reset.c             |   8 +-
>   drivers/gpu/drm/i915/intel_overlay.c          |  28 +++--
>   drivers/gpu/drm/i915/selftests/i915_active.c  |   2 +-
>   .../drm/i915/selftests/i915_gem_coherency.c   |   2 +-
>   .../gpu/drm/i915/selftests/i915_gem_object.c  |   9 +-
>   drivers/gpu/drm/i915/selftests/i915_request.c |   9 +-
>   .../gpu/drm/i915/selftests/i915_timeline.c    |   4 +-
>   11 files changed, 135 insertions(+), 90 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 25f267a03d3d..6a452345ffdb 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -88,6 +88,7 @@
>   #include <linux/log2.h>
>   #include <drm/i915_drm.h>
>   #include "i915_drv.h"
> +#include "i915_gem_pm.h"
>   #include "i915_globals.h"
>   #include "i915_trace.h"
>   #include "i915_user_extensions.h"
> @@ -863,7 +864,6 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   	struct drm_i915_private *i915 = ctx->i915;
>   	struct context_barrier_task *cb;
>   	struct intel_context *ce, *next;
> -	intel_wakeref_t wakeref;
>   	int err = 0;
>   
>   	lockdep_assert_held(&i915->drm.struct_mutex);
> @@ -876,7 +876,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   	i915_active_init(i915, &cb->base, cb_retire);
>   	i915_active_acquire(&cb->base);
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
>   		struct intel_engine_cs *engine = ce->engine;
>   		struct i915_request *rq;
> @@ -890,7 +890,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   			break;
>   		}
>   
> -		rq = i915_request_alloc(engine, ctx);
> +		rq = i915_request_create(ce);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			break;
> @@ -906,7 +906,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   		if (err)
>   			break;
>   	}
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   
>   	cb->task = err ? NULL : task; /* caller needs to unwind instead */
>   	cb->data = data;
> @@ -930,11 +930,12 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
>   	if (i915_terminally_wedged(i915))
>   		return 0;
>   
> +	i915_gem_unpark(i915);
>   	for_each_engine_masked(engine, i915, mask, mask) {
>   		struct intel_ring *ring;
>   		struct i915_request *rq;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(rq))
>   			return PTR_ERR(rq);
>   
> @@ -960,6 +961,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
>   
>   		i915_request_add(rq);
>   	}
> +	i915_gem_park(i915);
>   
>   	return 0;
>   }
> @@ -1158,7 +1160,6 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   {
>   	struct drm_i915_private *i915 = ce->engine->i915;
>   	struct i915_request *rq, *prev;
> -	intel_wakeref_t wakeref;
>   	int ret;
>   
>   	lockdep_assert_held(&ce->pin_mutex);
> @@ -1173,9 +1174,9 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   		return 0;
>   
>   	/* Submitting requests etc needs the hw awake. */
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   
> -	rq = i915_request_alloc(ce->engine, i915->kernel_context);
> +	rq = i915_request_create(ce->engine->kernel_context);
>   	if (IS_ERR(rq)) {
>   		ret = PTR_ERR(rq);
>   		goto out_put;
> @@ -1213,8 +1214,7 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   out_add:
>   	i915_request_add(rq);
>   out_put:
> -	intel_runtime_pm_put(i915, wakeref);
> -
> +	i915_gem_park(i915);
>   	return ret;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 85c5cb779297..fe7267da52e5 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -196,6 +196,7 @@
>   #include <linux/uuid.h>
>   
>   #include "i915_drv.h"
> +#include "i915_gem_pm.h"
>   #include "i915_oa_hsw.h"
>   #include "i915_oa_bdw.h"
>   #include "i915_oa_chv.h"
> @@ -1716,6 +1717,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   	int ret;
>   
>   	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> +	i915_gem_unpark(dev_priv);
>   
>   	/*
>   	 * The OA register config is setup through the context image. This image
> @@ -1734,7 +1736,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   				     I915_WAIT_LOCKED,
>   				     MAX_SCHEDULE_TIMEOUT);
>   	if (ret)
> -		return ret;
> +		goto out_pm;
>   
>   	/* Update all contexts now that we've stalled the submission. */
>   	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> @@ -1746,8 +1748,10 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   			continue;
>   
>   		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
> -		if (IS_ERR(regs))
> -			return PTR_ERR(regs);
> +		if (IS_ERR(regs)) {
> +			ret = PTR_ERR(regs);
> +			goto out_pm;
> +		}
>   
>   		ce->state->obj->mm.dirty = true;
>   		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
> @@ -1761,13 +1765,17 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   	 * Apply the configuration by doing one context restore of the edited
>   	 * context image.
>   	 */
> -	rq = i915_request_alloc(engine, dev_priv->kernel_context);
> -	if (IS_ERR(rq))
> -		return PTR_ERR(rq);
> +	rq = i915_request_create(engine->kernel_context);
> +	if (IS_ERR(rq)) {
> +		ret = PTR_ERR(rq);
> +		goto out_pm;
> +	}
>   
>   	i915_request_add(rq);
>   
> -	return 0;
> +out_pm:
> +	i915_gem_park(dev_priv);
> +	return ret;
>   }
>   
>   static int gen8_enable_metric_set(struct i915_perf_stream *stream)
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 8d396f3c747d..fd24f576ca61 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -576,51 +576,19 @@ static int add_timeline_barrier(struct i915_request *rq)
>   	return i915_request_await_active_request(rq, &rq->timeline->barrier);
>   }
>   
> -/**
> - * i915_request_alloc - allocate a request structure
> - *
> - * @engine: engine that we wish to issue the request on.
> - * @ctx: context that the request will be associated with.
> - *
> - * Returns a pointer to the allocated request if successful,
> - * or an error code if not.
> - */
> -struct i915_request *
> -i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> +struct i915_request *i915_request_create(struct intel_context *ce)
>   {
> -	struct drm_i915_private *i915 = engine->i915;
> -	struct intel_context *ce;
> -	struct i915_timeline *tl;
> +	struct drm_i915_private *i915 = ce->engine->i915;
> +	struct i915_timeline *tl = ce->ring->timeline;
>   	struct i915_request *rq;
>   	u32 seqno;
>   	int ret;
>   
> -	/*
> -	 * Preempt contexts are reserved for exclusive use to inject a
> -	 * preemption context switch. They are never to be used for any trivial
> -	 * request!
> -	 */
> -	GEM_BUG_ON(ctx == i915->preempt_context);
> +	GEM_BUG_ON(!i915->gt.awake);
>   
> -	/*
> -	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> -	 * EIO if the GPU is already wedged.
> -	 */
> -	ret = i915_terminally_wedged(i915);
> -	if (ret)
> -		return ERR_PTR(ret);
> -
> -	/*
> -	 * Pinning the contexts may generate requests in order to acquire
> -	 * GGTT space, so do this first before we reserve a seqno for
> -	 * ourselves.
> -	 */
> -	ce = intel_context_pin(ctx, engine);
> -	if (IS_ERR(ce))
> -		return ERR_CAST(ce);
> -
> -	i915_gem_unpark(i915);
> -	mutex_lock(&ce->ring->timeline->mutex);
> +	/* Check that the caller provided an already pinned context */
> +	__intel_context_pin(ce);
> +	mutex_lock(&tl->mutex);
>   
>   	/* Move our oldest request to the slab-cache (if not in use!) */
>   	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
> @@ -670,18 +638,17 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	INIT_LIST_HEAD(&rq->active_list);
>   	INIT_LIST_HEAD(&rq->execute_cb);
>   
> -	tl = ce->ring->timeline;
>   	ret = i915_timeline_get_seqno(tl, rq, &seqno);
>   	if (ret)
>   		goto err_free;
>   
>   	rq->i915 = i915;
> -	rq->engine = engine;
> -	rq->gem_context = ctx;
>   	rq->hw_context = ce;
> +	rq->gem_context = ce->gem_context;
> +	rq->engine = ce->engine;
>   	rq->ring = ce->ring;
>   	rq->timeline = tl;
> -	GEM_BUG_ON(rq->timeline == &engine->timeline);
> +	GEM_BUG_ON(rq->timeline == &ce->engine->timeline);
>   	rq->hwsp_seqno = tl->hwsp_seqno;
>   	rq->hwsp_cacheline = tl->hwsp_cacheline;
>   	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
> @@ -713,7 +680,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 * around inside i915_request_add() there is sufficient space at
>   	 * the beginning of the ring as well.
>   	 */
> -	rq->reserved_space = 2 * engine->emit_fini_breadcrumb_dw * sizeof(u32);
> +	rq->reserved_space =
> +		2 * rq->engine->emit_fini_breadcrumb_dw * sizeof(u32);
>   
>   	/*
>   	 * Record the position of the start of the request so that
> @@ -727,17 +695,19 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	if (ret)
>   		goto err_unwind;
>   
> -	ret = engine->request_alloc(rq);
> +	ret = rq->engine->request_alloc(rq);
>   	if (ret)
>   		goto err_unwind;
>   
> +	rq->infix = rq->ring->emit; /* end of header; start of user payload */
> +
>   	/* Keep a second pin for the dual retirement along engine and ring */
>   	__intel_context_pin(ce);
> -
> -	rq->infix = rq->ring->emit; /* end of header; start of user payload */
> +	atomic_inc(&i915->gt.active_requests);

Oh I hate this..

>   
>   	/* Check that we didn't interrupt ourselves with a new request */
>   	GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
> +	lockdep_assert_held(&tl->mutex);
>   	return rq;
>   
>   err_unwind:
> @@ -751,12 +721,62 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   err_free:
>   	kmem_cache_free(global.slab_requests, rq);
>   err_unreserve:
> -	mutex_unlock(&ce->ring->timeline->mutex);
> -	i915_gem_park(i915);
> +	mutex_unlock(&tl->mutex);
>   	intel_context_unpin(ce);
>   	return ERR_PTR(ret);
>   }
>   
> +/**
> + * i915_request_alloc - allocate a request structure
> + *
> + * @engine: engine that we wish to issue the request on.
> + * @ctx: context that the request will be associated with.
> + *
> + * Returns a pointer to the allocated request if successful,
> + * or an error code if not.
> + */
> +struct i915_request *
> +i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> +{
> +	struct drm_i915_private *i915 = engine->i915;
> +	struct intel_context *ce;
> +	struct i915_request *rq;
> +	int ret;
> +
> +	/*
> +	 * Preempt contexts are reserved for exclusive use to inject a
> +	 * preemption context switch. They are never to be used for any trivial
> +	 * request!
> +	 */
> +	GEM_BUG_ON(ctx == i915->preempt_context);
> +
> +	/*
> +	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> +	 * EIO if the GPU is already wedged.
> +	 */
> +	ret = i915_terminally_wedged(i915);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	/*
> +	 * Pinning the contexts may generate requests in order to acquire
> +	 * GGTT space, so do this first before we reserve a seqno for
> +	 * ourselves.
> +	 */
> +	ce = intel_context_pin(ctx, engine);
> +	if (IS_ERR(ce))
> +		return ERR_CAST(ce);
> +
> +	i915_gem_unpark(i915);
> +
> +	rq = i915_request_create(ce);
> +
> +	i915_gem_park(i915);

... because it is so anti-self-documenting.

Maybe we could have something like:

__i915_gem_unpark(...)
{
	GEM_BUG_ON(!atomic_read(active_requests));
	atomic_inc(active_requests);
}

Similar to __intel_context_pin, just so it is more obvious what the code 
is doing.
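
Fleshed out, a sketch:

static inline void __i915_gem_unpark(struct drm_i915_private *i915)
{
	/* Caller guarantees the GT is already unparked. */
	GEM_BUG_ON(!atomic_read(&i915->gt.active_requests));
	atomic_inc(&i915->gt.active_requests);
}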

> +	intel_context_unpin(ce);
> +
> +	return rq;
> +}
> +
>   static int
>   emit_semaphore_wait(struct i915_request *to,
>   		    struct i915_request *from,
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index cd6c130964cd..37f84ad067da 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -228,6 +228,9 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence)
>   	return fence->ops == &i915_fence_ops;
>   }
>   
> +struct i915_request * __must_check
> +i915_request_create(struct intel_context *ce);
> +
>   struct i915_request * __must_check
>   i915_request_alloc(struct intel_engine_cs *engine,
>   		   struct i915_gem_context *ctx);
> diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
> index 0aea19cefe4a..e1f2cf64639a 100644
> --- a/drivers/gpu/drm/i915/i915_reset.c
> +++ b/drivers/gpu/drm/i915/i915_reset.c
> @@ -8,6 +8,7 @@
>   #include <linux/stop_machine.h>
>   
>   #include "i915_drv.h"
> +#include "i915_gem_pm.h"
>   #include "i915_gpu_error.h"
>   #include "i915_reset.h"
>   
> @@ -727,9 +728,8 @@ static void restart_work(struct work_struct *work)
>   	struct drm_i915_private *i915 = arg->i915;
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> -	intel_wakeref_t wakeref;
>   
> -	wakeref = intel_runtime_pm_get(i915);
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
>   	WRITE_ONCE(i915->gpu_error.restart, NULL);
>   
> @@ -744,13 +744,13 @@ static void restart_work(struct work_struct *work)
>   		if (!intel_engine_is_idle(engine))
>   			continue;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (!IS_ERR(rq))
>   			i915_request_add(rq);
>   	}
>   
>   	mutex_unlock(&i915->drm.struct_mutex);
> -	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_park(i915);
>   
>   	kfree(arg);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index a882b8d42bd9..e23a91eb9149 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -29,6 +29,7 @@
>   #include <drm/drm_fourcc.h>
>   
>   #include "i915_drv.h"
> +#include "i915_gem_pm.h"
>   #include "i915_reg.h"
>   #include "intel_drv.h"
>   #include "intel_frontbuffer.h"
> @@ -235,10 +236,9 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
>   
>   static struct i915_request *alloc_request(struct intel_overlay *overlay)
>   {
> -	struct drm_i915_private *dev_priv = overlay->i915;
> -	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
> +	struct intel_engine_cs *engine = overlay->i915->engine[RCS0];
>   
> -	return i915_request_alloc(engine, dev_priv->kernel_context);
> +	return i915_request_create(engine->kernel_context);
>   }
>   
>   /* overlay needs to be disable in OCMD reg */
> @@ -247,17 +247,21 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>   	struct drm_i915_private *dev_priv = overlay->i915;
>   	struct i915_request *rq;
>   	u32 *cs;
> +	int err;
>   
>   	WARN_ON(overlay->active);
> +	i915_gem_unpark(dev_priv);
>   
>   	rq = alloc_request(overlay);
> -	if (IS_ERR(rq))
> -		return PTR_ERR(rq);
> +	if (IS_ERR(rq)) {
> +		err = PTR_ERR(rq);
> +		goto err_pm;
> +	}
>   
>   	cs = intel_ring_begin(rq, 4);
>   	if (IS_ERR(cs)) {
> -		i915_request_add(rq);
> -		return PTR_ERR(cs);
> +		err = PTR_ERR(cs);
> +		goto err_rq;
>   	}
>   
>   	overlay->active = true;
> @@ -272,6 +276,12 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>   	intel_ring_advance(rq, cs);
>   
>   	return intel_overlay_do_wait_request(overlay, rq, NULL);
> +
> +err_rq:
> +	i915_request_add(rq);
> +err_pm:
> +	i915_gem_park(dev_priv);
> +	return err;
>   }
>   
>   static void intel_overlay_flip_prepare(struct intel_overlay *overlay,
> @@ -376,6 +386,8 @@ static void intel_overlay_off_tail(struct i915_active_request *active,
>   
>   	if (IS_I830(dev_priv))
>   		i830_overlay_clock_gating(dev_priv, true);
> +
> +	i915_gem_park(dev_priv);
>   }
>   
>   /* overlay needs to be disabled in OCMD reg */
> @@ -485,6 +497,8 @@ void intel_overlay_reset(struct drm_i915_private *dev_priv)
>   	overlay->old_yscale = 0;
>   	overlay->crtc = NULL;
>   	overlay->active = false;
> +
> +	i915_gem_park(dev_priv);
>   }
>   
>   static int packed_depth_bytes(u32 format)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
> index 42bcceba175c..b10308c20e7d 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_active.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_active.c
> @@ -47,7 +47,7 @@ static int __live_active_setup(struct drm_i915_private *i915,
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *rq;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			break;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> index 497929238f02..b926f40bb01a 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> @@ -202,7 +202,7 @@ static int gpu_set(struct drm_i915_gem_object *obj,
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> -	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
> +	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
>   	if (IS_ERR(rq)) {
>   		i915_vma_unpin(vma);
>   		return PTR_ERR(rq);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index cd6590e01dec..e5166f9b04c1 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -308,7 +308,6 @@ static int igt_partial_tiling(void *arg)
>   	const unsigned int nreal = 1 << 12; /* largest tile row x2 */
>   	struct drm_i915_private *i915 = arg;
>   	struct drm_i915_gem_object *obj;
> -	intel_wakeref_t wakeref;
>   	int tiling;
>   	int err;
>   
> @@ -333,8 +332,8 @@ static int igt_partial_tiling(void *arg)
>   		goto out;
>   	}
>   
> +	i915_gem_unpark(i915);
>   	mutex_lock(&i915->drm.struct_mutex);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (1) {
>   		IGT_TIMEOUT(end);
> @@ -445,8 +444,8 @@ next_tiling: ;
>   	}
>   
>   out_unlock:
> -	intel_runtime_pm_put(i915, wakeref);
>   	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_gem_park(i915);
>   	i915_gem_object_unpin_pages(obj);
>   out:
>   	i915_gem_object_put(obj);
> @@ -468,7 +467,9 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
>   	if (err)
>   		return err;
>   
> -	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
> +	i915_gem_unpark(i915);
> +	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
> +	i915_gem_park(i915);
>   	if (IS_ERR(rq)) {
>   		i915_vma_unpin(vma);
>   		return PTR_ERR(rq);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
> index 665cafa82390..03bea3caaafe 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -550,8 +550,7 @@ static int live_nop_request(void *arg)
>   			times[1] = ktime_get_raw();
>   
>   			for (n = 0; n < prime; n++) {
> -				request = i915_request_alloc(engine,
> -							     i915->kernel_context);
> +				request = i915_request_create(engine->kernel_context);
>   				if (IS_ERR(request)) {
>   					err = PTR_ERR(request);
>   					goto out_unlock;
> @@ -648,7 +647,7 @@ empty_request(struct intel_engine_cs *engine,
>   	struct i915_request *request;
>   	int err;
>   
> -	request = i915_request_alloc(engine, engine->i915->kernel_context);
> +	request = i915_request_create(engine->kernel_context);
>   	if (IS_ERR(request))
>   		return request;
>   
> @@ -850,7 +849,7 @@ static int live_all_engines(void *arg)
>   	}
>   
>   	for_each_engine(engine, i915, id) {
> -		request[id] = i915_request_alloc(engine, i915->kernel_context);
> +		request[id] = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(request[id])) {
>   			err = PTR_ERR(request[id]);
>   			pr_err("%s: Request allocation failed with err=%d\n",
> @@ -958,7 +957,7 @@ static int live_sequential_engines(void *arg)
>   			goto out_unlock;
>   		}
>   
> -		request[id] = i915_request_alloc(engine, i915->kernel_context);
> +		request[id] = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(request[id])) {
>   			err = PTR_ERR(request[id]);
>   			pr_err("%s: Request allocation failed for %s with err=%d\n",
> diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
> index b04969ea74d3..77f6c8c66568 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
> @@ -455,7 +455,7 @@ tl_write(struct i915_timeline *tl, struct intel_engine_cs *engine, u32 value)
>   		goto out;
>   	}
>   
> -	rq = i915_request_alloc(engine, engine->i915->kernel_context);
> +	rq = i915_request_create(engine->kernel_context);
>   	if (IS_ERR(rq))
>   		goto out_unpin;
>   
> @@ -676,7 +676,7 @@ static int live_hwsp_wrap(void *arg)
>   		if (!intel_engine_can_store_dword(engine))
>   			continue;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			goto out;
> 

No other complaints.

Regards,

Tvrtko

* Re: [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf
  2019-03-25  9:03 ` [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
@ 2019-04-02 16:09   ` Tvrtko Ursulin
  2019-04-10 19:55     ` Chris Wilson
  0 siblings, 1 reply; 43+ messages in thread
From: Tvrtko Ursulin @ 2019-04-02 16:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 25/03/2019 09:03, Chris Wilson wrote:
> In order to separate the reservation phase of building a request from
> its emission phase, we need to pull some of the request alloc activities
> from deep inside i915_request to the surface, GEM_EXECBUFFER.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 106 +++++++++++++--------
>   drivers/gpu/drm/i915/i915_request.c        |   9 --
>   2 files changed, 67 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3d672c9edb94..8754bb02c6ec 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -36,6 +36,7 @@
>   
>   #include "i915_drv.h"
>   #include "i915_gem_clflush.h"
> +#include "i915_gem_pm.h"
>   #include "i915_trace.h"
>   #include "intel_drv.h"
>   #include "intel_frontbuffer.h"
> @@ -236,7 +237,8 @@ struct i915_execbuffer {
>   	unsigned int *flags;
>   
>   	struct intel_engine_cs *engine; /** engine to queue the request to */
> -	struct i915_gem_context *ctx; /** context for building the request */
> +	struct intel_context *context; /* logical state for the request */
> +	struct i915_gem_context *gem_context; /** caller's context */
>   	struct i915_address_space *vm; /** GTT and vma for the request */
>   
>   	struct i915_request *request; /** our request to build */
> @@ -738,7 +740,7 @@ static int eb_select_context(struct i915_execbuffer *eb)
>   	if (unlikely(!ctx))
>   		return -ENOENT;
>   
> -	eb->ctx = ctx;
> +	eb->gem_context = ctx;
>   	if (ctx->ppgtt) {
>   		eb->vm = &ctx->ppgtt->vm;
>   		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
> @@ -761,7 +763,7 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
>   	 * Completely unscientific finger-in-the-air estimates for suitable
>   	 * maximum user request size (to avoid blocking) and then backoff.
>   	 */
> -	if (intel_ring_update_space(ring) >= PAGE_SIZE)
> +	if (!ring || intel_ring_update_space(ring) >= PAGE_SIZE)

Move the comment explaining why ring can be NULL while moving the code?

>   		return NULL;
>   
>   	/*
> @@ -784,7 +786,6 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
>   
>   static int eb_wait_for_ring(const struct i915_execbuffer *eb)
>   {
> -	const struct intel_context *ce;
>   	struct i915_request *rq;
>   	int ret = 0;
>   
> @@ -794,11 +795,7 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
>   	 * keeping all of their resources pinned.
>   	 */
>   
> -	ce = intel_context_lookup(eb->ctx, eb->engine);
> -	if (!ce || !ce->ring) /* first use, assume empty! */
> -		return 0;
> -
> -	rq = __eb_wait_for_ring(ce->ring);
> +	rq = __eb_wait_for_ring(eb->context->ring);
>   	if (rq) {
>   		mutex_unlock(&eb->i915->drm.struct_mutex);
>   
> @@ -817,15 +814,15 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
>   
>   static int eb_lookup_vmas(struct i915_execbuffer *eb)
>   {
> -	struct radix_tree_root *handles_vma = &eb->ctx->handles_vma;
> +	struct radix_tree_root *handles_vma = &eb->gem_context->handles_vma;
>   	struct drm_i915_gem_object *obj;
>   	unsigned int i, batch;
>   	int err;
>   
> -	if (unlikely(i915_gem_context_is_closed(eb->ctx)))
> +	if (unlikely(i915_gem_context_is_closed(eb->gem_context)))
>   		return -ENOENT;
>   
> -	if (unlikely(i915_gem_context_is_banned(eb->ctx)))
> +	if (unlikely(i915_gem_context_is_banned(eb->gem_context)))
>   		return -EIO;
>   
>   	INIT_LIST_HEAD(&eb->relocs);
> @@ -870,8 +867,8 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
>   		if (!vma->open_count++)
>   			i915_vma_reopen(vma);
>   		list_add(&lut->obj_link, &obj->lut_list);
> -		list_add(&lut->ctx_link, &eb->ctx->handles_list);
> -		lut->ctx = eb->ctx;
> +		list_add(&lut->ctx_link, &eb->gem_context->handles_list);
> +		lut->ctx = eb->gem_context;
>   		lut->handle = handle;
>   
>   add_vma:
> @@ -1227,7 +1224,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>   	if (err)
>   		goto err_unmap;
>   
> -	rq = i915_request_alloc(eb->engine, eb->ctx);
> +	rq = i915_request_create(eb->context);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_unpin;
> @@ -2088,8 +2085,41 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
>   	[I915_EXEC_VEBOX]	= VECS0
>   };
>   
> -static struct intel_engine_cs *
> -eb_select_engine(struct drm_i915_private *dev_priv,
> +static int eb_pin_context(struct i915_execbuffer *eb,
> +			  struct intel_engine_cs *engine)
> +{
> +	struct intel_context *ce;
> +	int err;
> +
> +	/*
> +	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> +	 * EIO if the GPU is already wedged.
> +	 */
> +	err = i915_terminally_wedged(eb->i915);
> +	if (err)
> +		return err;
> +
> +	/*
> +	 * Pinning the contexts may generate requests in order to acquire
> +	 * GGTT space, so do this first before we reserve a seqno for
> +	 * ourselves.
> +	 */
> +	ce = intel_context_pin(eb->gem_context, engine);
> +	if (IS_ERR(ce))
> +		return PTR_ERR(ce);
> +
> +	eb->engine = engine;
> +	eb->context = ce;
> +	return 0;
> +}
> +
> +static void eb_unpin_context(struct i915_execbuffer *eb)
> +{
> +	intel_context_unpin(eb->context);
> +}
> +
> +static int
> +eb_select_engine(struct i915_execbuffer *eb,
>   		 struct drm_file *file,
>   		 struct drm_i915_gem_execbuffer2 *args)
>   {
> @@ -2098,21 +2128,21 @@ eb_select_engine(struct drm_i915_private *dev_priv,
>   
>   	if (user_ring_id > I915_USER_RINGS) {
>   		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
> -		return NULL;
> +		return -EINVAL;
>   	}
>   
>   	if ((user_ring_id != I915_EXEC_BSD) &&
>   	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
>   		DRM_DEBUG("execbuf with non bsd ring but with invalid "
>   			  "bsd dispatch flags: %d\n", (int)(args->flags));
> -		return NULL;
> +		return -EINVAL;
>   	}
>   
> -	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(dev_priv, VCS1)) {
> +	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
>   		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
>   
>   		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
> -			bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
> +			bsd_idx = gen8_dispatch_bsd_engine(eb->i915, file);
>   		} else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
>   			   bsd_idx <= I915_EXEC_BSD_RING2) {
>   			bsd_idx >>= I915_EXEC_BSD_SHIFT;
> @@ -2120,20 +2150,20 @@ eb_select_engine(struct drm_i915_private *dev_priv,
>   		} else {
>   			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
>   				  bsd_idx);
> -			return NULL;
> +			return -EINVAL;
>   		}
>   
> -		engine = dev_priv->engine[_VCS(bsd_idx)];
> +		engine = eb->i915->engine[_VCS(bsd_idx)];
>   	} else {
> -		engine = dev_priv->engine[user_ring_map[user_ring_id]];
> +		engine = eb->i915->engine[user_ring_map[user_ring_id]];

Cache dev_priv/i915 in a local?

>   	}
>   
>   	if (!engine) {
>   		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
> -		return NULL;
> +		return -EINVAL;
>   	}
>   
> -	return engine;
> +	return eb_pin_context(eb, engine);
>   }
>   
>   static void
> @@ -2275,7 +2305,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	struct i915_execbuffer eb;
>   	struct dma_fence *in_fence = NULL;
>   	struct sync_file *out_fence = NULL;
> -	intel_wakeref_t wakeref;
>   	int out_fence_fd = -1;
>   	int err;
>   
> @@ -2335,12 +2364,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	if (unlikely(err))
>   		goto err_destroy;
>   
> -	eb.engine = eb_select_engine(eb.i915, file, args);
> -	if (!eb.engine) {
> -		err = -EINVAL;
> -		goto err_engine;
> -	}
> -
>   	/*
>   	 * Take a local wakeref for preparing to dispatch the execbuf as
>   	 * we expect to access the hardware fairly frequently in the
> @@ -2348,16 +2371,20 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	 * wakeref that we hold until the GPU has been idle for at least
>   	 * 100ms.
>   	 */
> -	wakeref = intel_runtime_pm_get(eb.i915);
> +	i915_gem_unpark(eb.i915);
>   
>   	err = i915_mutex_lock_interruptible(dev);
>   	if (err)
>   		goto err_rpm;
>   
> -	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
> +	err = eb_select_engine(&eb, file, args);
>   	if (unlikely(err))
>   		goto err_unlock;
>   
> +	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
> +	if (unlikely(err))
> +		goto err_engine;
> +
>   	err = eb_relocate(&eb);
>   	if (err) {
>   		/*
> @@ -2441,7 +2468,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	GEM_BUG_ON(eb.reloc_cache.rq);
>   
>   	/* Allocate a request for this batch buffer nice and early. */
> -	eb.request = i915_request_alloc(eb.engine, eb.ctx);
> +	eb.request = i915_request_create(eb.context);
>   	if (IS_ERR(eb.request)) {
>   		err = PTR_ERR(eb.request);
>   		goto err_batch_unpin;
> @@ -2502,12 +2529,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   err_vma:
>   	if (eb.exec)
>   		eb_release_vmas(&eb);
> +err_engine:
> +	eb_unpin_context(&eb);
>   err_unlock:
>   	mutex_unlock(&dev->struct_mutex);
>   err_rpm:
> -	intel_runtime_pm_put(eb.i915, wakeref);
> -err_engine:
> -	i915_gem_context_put(eb.ctx);
> +	i915_gem_park(eb.i915);
> +	i915_gem_context_put(eb.gem_context);
>   err_destroy:
>   	eb_destroy(&eb);
>   err_out_fence:
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index fd24f576ca61..10edeb285870 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -741,7 +741,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	struct drm_i915_private *i915 = engine->i915;
>   	struct intel_context *ce;
>   	struct i915_request *rq;
> -	int ret;
>   
>   	/*
>   	 * Preempt contexts are reserved for exclusive use to inject a
> @@ -750,14 +749,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 */
>   	GEM_BUG_ON(ctx == i915->preempt_context);
>   
> -	/*
> -	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> -	 * EIO if the GPU is already wedged.
> -	 */
> -	ret = i915_terminally_wedged(i915);
> -	if (ret)
> -		return ERR_PTR(ret);
> -
>   	/*
>   	 * Pinning the contexts may generate requests in order to acquire
>   	 * GGTT space, so do this first before we reserve a seqno for
> 

With the comment preserved:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create()
  2019-04-02 13:17   ` Tvrtko Ursulin
@ 2019-04-03  7:54     ` Chris Wilson
  2019-04-03  7:57       ` Chris Wilson
  0 siblings, 1 reply; 43+ messages in thread
From: Chris Wilson @ 2019-04-03  7:54 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-02 14:17:30)
> 
> On 25/03/2019 09:03, Chris Wilson wrote:
> > @@ -727,17 +695,19 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> >       if (ret)
> >               goto err_unwind;
> >   
> > -     ret = engine->request_alloc(rq);
> > +     ret = rq->engine->request_alloc(rq);
> >       if (ret)
> >               goto err_unwind;
> >   
> > +     rq->infix = rq->ring->emit; /* end of header; start of user payload */
> > +
> >       /* Keep a second pin for the dual retirement along engine and ring */
> >       __intel_context_pin(ce);
> > -
> > -     rq->infix = rq->ring->emit; /* end of header; start of user payload */
> > +     atomic_inc(&i915->gt.active_requests);
> 
> Oh I hate this..
> 
> >   
> >       /* Check that we didn't interrupt ourselves with a new request */
> >       GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
> > +     lockdep_assert_held(&tl->mutex);
> >       return rq;

[snip]

> > +     /*
> > +      * Pinning the contexts may generate requests in order to acquire
> > +      * GGTT space, so do this first before we reserve a seqno for
> > +      * ourselves.
> > +      */
> > +     ce = intel_context_pin(ctx, engine);
> > +     if (IS_ERR(ce))
> > +             return ERR_CAST(ce);
> > +
> > +     i915_gem_unpark(i915);
> > +
> > +     rq = i915_request_create(ce);
> > +
> > +     i915_gem_park(i915);
> 
> ... because it is so anti-self-documenting.
> 
> Maybe we could have something like:
> 
> __i915_gem_unpark(...)
> {
>         GEM_BUG_ON(!atomic_read(active_requests));
>         atomic_inc(active_requests);
> }
> 
> Similar to __intel_context_pin, just so it is more obvious what the code 
> is doing.

But not here though, since this is the path that has to wake up the
device itself, pin the context (eventually run retirement, handle
allocation fallback) and not presume the caller already has (or will).

Inside i915_request_create we do have the assert that the device is
awake, which should really be a check that we hold a GT wakeref instead.
(Note, not a GEM wakeref at this point; that's the troubling bit.)

Post snip: Saw the connected argument.
-Chris

* Re: [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create()
  2019-04-03  7:54     ` Chris Wilson
@ 2019-04-03  7:57       ` Chris Wilson
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-04-03  7:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-04-03 08:54:41)
> Quoting Tvrtko Ursulin (2019-04-02 14:17:30)
> > 
> > On 25/03/2019 09:03, Chris Wilson wrote:
> > > @@ -727,17 +695,19 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> > >       if (ret)
> > >               goto err_unwind;
> > >   
> > > -     ret = engine->request_alloc(rq);
> > > +     ret = rq->engine->request_alloc(rq);
> > >       if (ret)
> > >               goto err_unwind;
> > >   
> > > +     rq->infix = rq->ring->emit; /* end of header; start of user payload */
> > > +
> > >       /* Keep a second pin for the dual retirement along engine and ring */
> > >       __intel_context_pin(ce);
> > > -
> > > -     rq->infix = rq->ring->emit; /* end of header; start of user payload */
> > > +     atomic_inc(&i915->gt.active_requests);
> > 
> > Oh I hate this..
> > 
> > >   
> > >       /* Check that we didn't interrupt ourselves with a new request */
> > >       GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
> > > +     lockdep_assert_held(&tl->mutex);
> > >       return rq;
> 
> [snip]
> 
> > > +     /*
> > > +      * Pinning the contexts may generate requests in order to acquire
> > > +      * GGTT space, so do this first before we reserve a seqno for
> > > +      * ourselves.
> > > +      */
> > > +     ce = intel_context_pin(ctx, engine);
> > > +     if (IS_ERR(ce))
> > > +             return ERR_CAST(ce);
> > > +
> > > +     i915_gem_unpark(i915);
> > > +
> > > +     rq = i915_request_create(ce);
> > > +
> > > +     i915_gem_park(i915);
> > 
> > ... because it is so anti-self-documenting.
> > 
> > Maybe we could have something like:
> > 
> > __i915_gem_unpark(...)
> > {
> >         GEM_BUG_ON(!atomic_read(active_requests));
> >         atomic_inc(active_requests);
> > }
> > 
> > Similar to __intel_context_pin, just so it is more obvious what the code 
> > is doing.
> 
> But not here though, since this is the path that has to wake up the
> device itself, pin the context (eventually run retirement, handle
> allocation fallback) and not presume the caller already has (or will).
> 
> Inside i915_request_create we do have the assert that the device is
> awake, which should really be a check that we hold a GT wakeref instead.
> (Note, not a GEM wakeref at this point; that's the troubling bit.)
> 
> Post snip: Saw the connected argument.

The problem with that is: no, we are not putting a __i915_gem_unpark()
into i915_request_create. At this point, we care about the gt.wakeref.

Maybe I should take another shot at splitting the GEM wakeref from GT.

i915_gem_runtime_pm_get()
  i915_gt_unpark()
  i915_gt_park()
i915_gem_runtime_pm_put()
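
i.e. the outer pair would own the device wakeref while the inner pair
does only the GT bookkeeping. Very roughly (sketch; the split and the
stashed wakeref field are hypothetical):

	void i915_gem_runtime_pm_get(struct drm_i915_private *i915)
	{
		/* hypothetical: hold rpm across the whole GEM section */
		i915->gt.gem_wakeref = intel_runtime_pm_get(i915);
	}

	void i915_gem_runtime_pm_put(struct drm_i915_private *i915)
	{
		intel_runtime_pm_put(i915, i915->gt.gem_wakeref);
	}

with i915_gt_unpark()/i915_gt_park() then only managing active_requests
and the engine state underneath.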

-Chris

* Re: [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf
  2019-04-02 16:09   ` Tvrtko Ursulin
@ 2019-04-10 19:55     ` Chris Wilson
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wilson @ 2019-04-10 19:55 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-02 17:09:18)
> 
> On 25/03/2019 09:03, Chris Wilson wrote:
> > @@ -2120,20 +2150,20 @@ eb_select_engine(struct drm_i915_private *dev_priv,
> >               } else {
> >                       DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
> >                                 bsd_idx);
> > -                     return NULL;
> > +                     return -EINVAL;
> >               }
> >   
> > -             engine = dev_priv->engine[_VCS(bsd_idx)];
> > +             engine = eb->i915->engine[_VCS(bsd_idx)];
> >       } else {
> > -             engine = dev_priv->engine[user_ring_map[user_ring_id]];
> > +             engine = eb->i915->engine[user_ring_map[user_ring_id]];
> 
> Cache dev_priv/i915 in a local?

i915_gem_do_execbuffer                      4742    4740      -2

Barely seems worth it this time.
-Chris

Thread overview: 43+ messages
2019-03-25  9:03 ctx->engines[rcs0, rcs0] Chris Wilson
2019-03-25  9:03 ` [PATCH 01/22] drm/i915: Report the correct errno from i915_gem_context_open() Chris Wilson
2019-03-25  9:24   ` Tvrtko Ursulin
2019-03-25  9:03 ` [PATCH 02/22] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
2019-03-25  9:03 ` [PATCH 03/22] drm/i915: Pull the GEM powermangement coupling into its own file Chris Wilson
2019-04-01 14:56   ` Tvrtko Ursulin
2019-04-01 15:39   ` Lucas De Marchi
2019-04-01 15:52     ` Chris Wilson
2019-03-25  9:03 ` [PATCH 04/22] drm/i915: Guard unpark/park with the gt.active_mutex Chris Wilson
2019-04-01 15:22   ` Tvrtko Ursulin
2019-04-01 15:37     ` Chris Wilson
2019-04-01 15:54       ` Tvrtko Ursulin
2019-04-01 16:38         ` Chris Wilson
2019-03-25  9:03 ` [PATCH 05/22] drm/i915/selftests: Take GEM runtime wakeref Chris Wilson
2019-04-01 15:34   ` Tvrtko Ursulin
2019-04-01 15:44     ` Chris Wilson
2019-03-25  9:03 ` [PATCH 06/22] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
2019-04-02 13:17   ` Tvrtko Ursulin
2019-04-03  7:54     ` Chris Wilson
2019-04-03  7:57       ` Chris Wilson
2019-03-25  9:03 ` [PATCH 07/22] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
2019-03-25  9:03 ` [PATCH 08/22] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
2019-04-02 16:09   ` Tvrtko Ursulin
2019-04-10 19:55     ` Chris Wilson
2019-03-25  9:04 ` [PATCH 09/22] drm/i915: Export intel_context_instance() Chris Wilson
2019-03-25  9:04 ` [PATCH 10/22] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
2019-03-25  9:04 ` [PATCH 11/22] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
2019-03-25  9:04 ` [PATCH 12/22] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
2019-03-25  9:04 ` [PATCH 13/22] drm/i915: Split engine setup/init into two phases Chris Wilson
2019-03-25  9:04 ` [PATCH 14/22] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
2019-03-25  9:04 ` [PATCH 15/22] drm/i915: Move i915_request_alloc into selftests/ Chris Wilson
2019-03-25  9:04 ` [PATCH 16/22] drm/i915: Allow a context to define its set of engines Chris Wilson
2019-03-25  9:04 ` [PATCH 17/22] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
2019-03-25  9:04 ` [PATCH 18/22] drm/i915: Load balancing across a virtual engine Chris Wilson
2019-03-25  9:04 ` [PATCH 19/22] drm/i915: Extend execution fence to support a callback Chris Wilson
2019-03-25  9:04 ` [PATCH 20/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h Chris Wilson
2019-03-25  9:04 ` [PATCH 21/22] drm/i915/execlists: Virtual engine bonding Chris Wilson
2019-03-25  9:04 ` [PATCH 22/22] drm/i915: Allow specification of parallel execbuf Chris Wilson
2019-03-25  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Report the correct errno from i915_gem_context_open() Patchwork
2019-03-25  9:28 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-03-25  9:39 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-03-25 10:52 ` ctx->engines[rcs0, rcs0] Tvrtko Ursulin
2019-03-26  9:34   ` Chris Wilson
