* [PATCH 01/29] drm/i915: Mark up ips for RCU protection
@ 2019-04-08  9:17 Chris Wilson
  2019-04-08  9:17 ` [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR Chris Wilson
                   ` (31 more replies)
  0 siblings, 32 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

drivers/gpu/drm/i915/intel_pm.c:8352:9: error: incompatible types in comparison expression (different address spaces)
drivers/gpu/drm/i915/intel_pm.c:8359:9: error: incompatible types in comparison expression (different address spaces)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index bba477e62a12..1a6f36e08a60 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -8191,14 +8191,14 @@ unsigned long i915_gfx_val(struct drm_i915_private *dev_priv)
 	return val;
 }
 
-static struct drm_i915_private *i915_mch_dev;
+static struct drm_i915_private __rcu *i915_mch_dev;
 
 static struct drm_i915_private *mchdev_get(void)
 {
 	struct drm_i915_private *i915;
 
 	rcu_read_lock();
-	i915 = i915_mch_dev;
+	i915 = rcu_dereference(i915_mch_dev);
 	if (!kref_get_unless_zero(&i915->drm.ref))
 		i915 = NULL;
 	rcu_read_unlock();
-- 
2.20.1
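
For reference, sparse raises "incompatible types in comparison expression
(different address spaces)" because a pointer annotated __rcu is placed in a
separate address space and may only be touched through the RCU accessors. A
minimal sketch of the contract adopted above, using hypothetical names
(struct foo, foo_ptr) rather than anything from this patch:

	#include <linux/kref.h>
	#include <linux/rcupdate.h>

	struct foo {
		struct kref ref;
	};

	static struct foo __rcu *foo_ptr;

	/* Writer: publish with rcu_assign_pointer(), never a plain store. */
	static void foo_publish(struct foo *f)
	{
		rcu_assign_pointer(foo_ptr, f);
	}

	/*
	 * Reader: dereference only inside an RCU read-side section and take
	 * a reference before leaving it, echoing mchdev_get() above (with an
	 * extra NULL check for safety in this standalone sketch).
	 */
	static struct foo *foo_get(void)
	{
		struct foo *f;

		rcu_read_lock();
		f = rcu_dereference(foo_ptr); /* strips __rcu for sparse */
		if (f && !kref_get_unless_zero(&f->ref))
			f = NULL;
		rcu_read_unlock();

		return f;
	}

A plain assignment or comparison of foo_ptr would reproduce the
address-space errors quoted in the commit message.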


* [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 14:26   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 03/29] drm/i915: Use static allocation for i915_globals_park() Chris Wilson
                   ` (30 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Replace the WARN with a simple if() + error message to squelch the sparse
warning that the entire wait_for() macro was being stringified:

drivers/gpu/drm/i915/intel_guc_submission.c:658:9: error: too long token expansion

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index c58922226d47..42fcd622d7a3 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -650,9 +650,10 @@ static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
 	struct guc_ctx_report *report =
 		&data->preempt_ctx_report[engine->guc_id];
 
-	WARN_ON(wait_for_atomic(report->report_return_status ==
-				INTEL_GUC_REPORT_STATUS_COMPLETE,
-				GUC_PREEMPT_POSTPROCESS_DELAY_MS));
+	if (wait_for_atomic(report->report_return_status ==
+			    INTEL_GUC_REPORT_STATUS_COMPLETE,
+			    GUC_PREEMPT_POSTPROCESS_DELAY_MS))
+		DRM_ERROR("Timed out waiting for GuC preemption report\n");
 	/*
 	 * GuC is expecting that we're also going to clear the affected context
 	 * counter, let's also reset the return status to not depend on GuC
-- 
2.20.1
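
For context, the "too long token expansion" arises because i915 has
historically overridden WARN_ON() so that the message carries the
stringified condition, along the lines of (a paraphrase, not the exact
driver definition):

	#define WARN_ON(x) WARN((x), "%s", "WARN_ON(" __stringify(x) ")")

wait_for_atomic() is itself a sizeable macro (a busy-wait loop with a
timeout), so stringifying it turns the entire expanded loop into one
enormous string token, which sparse rejects. Open-coding the check as
if () + DRM_ERROR keeps the condition out of any stringification while
preserving the error report.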


* [PATCH 03/29] drm/i915: Use static allocation for i915_globals_park()
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
  2019-04-08  9:17 ` [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 14:31   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 04/29] drm/i915: Consolidate the timeline->barrier Chris Wilson
                   ` (29 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

In order to avoid the malloc inside i915_globals_park() occurring
underneath a lock connected to the shrinker (thus causing the circular
lockdep warning below, as the GFP_KERNEL allocation may enter direct
reclaim and from there our own shrinker), move the rcu_work into a
static global.

<4> [39.085073] ======================================================
<4> [39.085273] WARNING: possible circular locking dependency detected
<4> [39.085552] 5.1.0-rc3-CI-Trybot_4088+ #1 Tainted: G     U
<4> [39.085752] ------------------------------------------------------
<4> [39.085949] kswapd0/32 is trying to acquire lock:
<4> [39.086121] 00000000004b5f91 (wakeref#3){+.+.}, at: intel_engine_pm_put+0x1b/0x40 [i915]
<4> [39.086493]
but task is already holding lock:
<4> [39.086682] 00000000dd009a9a (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
<4> [39.086910]
which lock already depends on the new lock.

<4> [39.087139]
the existing dependency chain (in reverse order) is:
<4> [39.087356]
-> #2 (fs_reclaim){+.+.}:
<4> [39.087604]        fs_reclaim_acquire.part.24+0x24/0x30
<4> [39.087785]        kmem_cache_alloc_trace+0x2a/0x290
<4> [39.087998]        i915_globals_park+0x22/0xa0 [i915]
<4> [39.088478]        idle_work_handler+0x1df/0x220 [i915]
<4> [39.089016]        process_one_work+0x245/0x610
<4> [39.089447]        worker_thread+0x37/0x380
<4> [39.089956]        kthread+0x119/0x130
<4> [39.090374]        ret_from_fork+0x3a/0x50
<4> [39.090868]
-> #1 (wakeref#4){+.+.}:
<4> [39.091569]        __mutex_lock+0x8c/0x960
<4> [39.092054]        atomic_dec_and_mutex_lock+0x33/0x50
<4> [39.092521]        intel_gt_pm_put+0x1b/0x40 [i915]
<4> [39.093047]        intel_engine_park+0xeb/0x1d0 [i915]
<4> [39.093514]        __intel_wakeref_put_once+0x10/0x30 [i915]
<4> [39.094062]        i915_request_retire+0x477/0xaf0 [i915]
<4> [39.094547]        ring_retire_requests+0x86/0x160 [i915]
<4> [39.095110]        i915_retire_requests+0x58/0xc0 [i915]
<4> [39.095587]        i915_gem_wait_for_idle.part.22+0xb2/0xf0 [i915]
<4> [39.096142]        switch_to_kernel_context_sync+0x2a/0x70 [i915]
<4> [39.096633]        i915_gem_init+0x59c/0x9c0 [i915]
<4> [39.097174]        i915_driver_load+0xd96/0x1880 [i915]
<4> [39.097640]        i915_pci_probe+0x29/0xa0 [i915]
<4> [39.098145]        pci_device_probe+0xa1/0x120
<4> [39.098607]        really_probe+0xf3/0x3e0
<4> [39.099031]        driver_probe_device+0x10a/0x120
<4> [39.099599]        device_driver_attach+0x4b/0x50
<4> [39.100033]        __driver_attach+0x97/0x130
<4> [39.100525]        bus_for_each_dev+0x74/0xc0
<4> [39.100954]        bus_add_driver+0x13f/0x210
<4> [39.101441]        driver_register+0x56/0xe0
<4> [39.101891]        do_one_initcall+0x58/0x2e0
<4> [39.102319]        do_init_module+0x56/0x1ea
<4> [39.102805]        load_module+0x2701/0x29e0
<4> [39.103231]        __se_sys_finit_module+0xd3/0xf0
<4> [39.103727]        do_syscall_64+0x55/0x190
<4> [39.104153]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [39.104736]
-> #0 (wakeref#3){+.+.}:
<4> [39.105437]        lock_acquire+0xa6/0x1c0
<4> [39.105923]        __mutex_lock+0x8c/0x960
<4> [39.106345]        atomic_dec_and_mutex_lock+0x33/0x50
<4> [39.106897]        intel_engine_pm_put+0x1b/0x40 [i915]
<4> [39.107375]        i915_request_retire+0x477/0xaf0 [i915]
<4> [39.107930]        ring_retire_requests+0x86/0x160 [i915]
<4> [39.108412]        i915_retire_requests+0x58/0xc0 [i915]
<4> [39.108934]        i915_gem_shrink+0xd8/0x5b0 [i915]
<4> [39.109431]        i915_gem_shrinker_scan+0x59/0x130 [i915]
<4> [39.109884]        do_shrink_slab+0x131/0x3e0
<4> [39.110380]        shrink_slab+0x228/0x2c0
<4> [39.110810]        shrink_node+0x177/0x460
<4> [39.111317]        balance_pgdat+0x239/0x580
<4> [39.111743]        kswapd+0x186/0x570
<4> [39.112221]        kthread+0x119/0x130
<4> [39.112641]        ret_from_fork+0x3a/0x50

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_globals.c | 74 +++++++++++++----------------
 1 file changed, 32 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c
index 2f5c72e2a9d1..81e5c2ce336b 100644
--- a/drivers/gpu/drm/i915/i915_globals.c
+++ b/drivers/gpu/drm/i915/i915_globals.c
@@ -17,6 +17,33 @@
 
 static LIST_HEAD(globals);
 
+static atomic_t active;
+static atomic_t epoch;
+static struct park_work {
+	struct rcu_work work;
+	int epoch;
+} park;
+
+static void i915_globals_shrink(void)
+{
+	struct i915_global *global;
+
+	/*
+	 * kmem_cache_shrink() discards empty slabs and reorders partially
+	 * filled slabs to prioritise allocating from the mostly full slabs,
+	 * with the aim of reducing fragmentation.
+	 */
+	list_for_each_entry(global, &globals, link)
+		global->shrink();
+}
+
+static void __i915_globals_park(struct work_struct *work)
+{
+	/* Confirm nothing woke up in the last grace period */
+	if (park.epoch == atomic_read(&epoch))
+		i915_globals_shrink();
+}
+
 void __init i915_global_register(struct i915_global *global)
 {
 	GEM_BUG_ON(!global->shrink);
@@ -57,44 +84,12 @@ int __init i915_globals_init(void)
 		}
 	}
 
+	INIT_RCU_WORK(&park.work, __i915_globals_park);
 	return 0;
 }
 
-static void i915_globals_shrink(void)
-{
-	struct i915_global *global;
-
-	/*
-	 * kmem_cache_shrink() discards empty slabs and reorders partially
-	 * filled slabs to prioritise allocating from the mostly full slabs,
-	 * with the aim of reducing fragmentation.
-	 */
-	list_for_each_entry(global, &globals, link)
-		global->shrink();
-}
-
-static atomic_t active;
-static atomic_t epoch;
-struct park_work {
-	struct rcu_work work;
-	int epoch;
-};
-
-static void __i915_globals_park(struct work_struct *work)
-{
-	struct park_work *wrk = container_of(work, typeof(*wrk), work.work);
-
-	/* Confirm nothing woke up in the last grace period */
-	if (wrk->epoch == atomic_read(&epoch))
-		i915_globals_shrink();
-
-	kfree(wrk);
-}
-
 void i915_globals_park(void)
 {
-	struct park_work *wrk;
-
 	/*
 	 * Defer shrinking the global slab caches (and other work) until
 	 * after a RCU grace period has completed with no activity. This
@@ -107,13 +102,8 @@ void i915_globals_park(void)
 	if (!atomic_dec_and_test(&active))
 		return;
 
-	wrk = kmalloc(sizeof(*wrk), GFP_KERNEL);
-	if (!wrk)
-		return;
-
-	wrk->epoch = atomic_inc_return(&epoch);
-	INIT_RCU_WORK(&wrk->work, __i915_globals_park);
-	queue_rcu_work(system_wq, &wrk->work);
+	park.epoch = atomic_inc_return(&epoch);
+	queue_rcu_work(system_wq, &park.work);
 }
 
 void i915_globals_unpark(void)
@@ -125,8 +115,8 @@ void i915_globals_unpark(void)
 void __exit i915_globals_exit(void)
 {
 	/* Flush any residual park_work */
-	rcu_barrier();
-	flush_scheduled_work();
+	atomic_inc(&epoch);
+	flush_rcu_work(&park.work);
 
 	__i915_globals_cleanup();
 
-- 
2.20.1
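
The mechanism relied upon here is rcu_work: a work item whose handler runs
only after an RCU grace period has elapsed from the point of queuing. A
minimal illustration of the API (a generic sketch with made-up names, not
driver code):

	#include <linux/workqueue.h>

	static struct rcu_work deferred;

	static void deferred_fn(struct work_struct *work)
	{
		/*
		 * Runs in process context, one RCU grace period after
		 * queue_rcu_work().
		 */
	}

	static void setup(void)
	{
		INIT_RCU_WORK(&deferred, deferred_fn);
	}

	static void arm(void)
	{
		queue_rcu_work(system_wq, &deferred);
	}

Because park_work is now a single static object, queue_rcu_work() simply
returns false if the work is already pending, the epoch check in
__i915_globals_park() discards any execution that raced with an unpark,
and flush_rcu_work() in the exit path waits for both the grace period and
the handler, replacing the previous rcu_barrier() + flush_scheduled_work()
pair.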


* [PATCH 04/29] drm/i915: Consolidate the timeline->barrier
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
  2019-04-08  9:17 ` [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR Chris Wilson
  2019-04-08  9:17 ` [PATCH 03/29] drm/i915: Use static allocation for i915_globals_park() Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 14:42   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 05/29] drm/i915: Store the default sseu setup on the engine Chris Wilson
                   ` (28 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

The timeline is strictly ordered, so inserting the timeline->barrier
request into timeline->last_request naturally provides the same barrier.
Consolidate the pair of barriers into one, as they serve the same
purpose.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c        | 13 ++-----------
 drivers/gpu/drm/i915/i915_request.c            |  9 ---------
 drivers/gpu/drm/i915/i915_timeline.c           |  2 --
 drivers/gpu/drm/i915/i915_timeline.h           | 15 ---------------
 drivers/gpu/drm/i915/i915_timeline_types.h     | 10 ----------
 drivers/gpu/drm/i915/selftests/mock_timeline.c |  1 -
 6 files changed, 2 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 66b6852cb4d2..7fc34ab6df87 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1167,7 +1167,7 @@ static int
 gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 {
 	struct drm_i915_private *i915 = ce->engine->i915;
-	struct i915_request *rq, *prev;
+	struct i915_request *rq;
 	intel_wakeref_t wakeref;
 	int ret;
 
@@ -1192,16 +1192,7 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 	}
 
 	/* Queue this switch after all other activity by this context. */
-	prev = i915_active_request_raw(&ce->ring->timeline->last_request,
-				       &i915->drm.struct_mutex);
-	if (prev && !i915_request_completed(prev)) {
-		ret = i915_request_await_dma_fence(rq, &prev->fence);
-		if (ret < 0)
-			goto out_add;
-	}
-
-	/* Order all following requests to be after. */
-	ret = i915_timeline_set_barrier(ce->ring->timeline, rq);
+	ret = i915_active_request_set(&ce->ring->timeline->last_request, rq);
 	if (ret)
 		goto out_add;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 2da0d6436a1a..96a9e8bcd805 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -584,11 +584,6 @@ i915_request_alloc_slow(struct intel_context *ce)
 	return kmem_cache_alloc(global.slab_requests, GFP_KERNEL);
 }
 
-static int add_timeline_barrier(struct i915_request *rq)
-{
-	return i915_request_await_active_request(rq, &rq->timeline->barrier);
-}
-
 /**
  * i915_request_alloc - allocate a request structure
  *
@@ -738,10 +733,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	rq->head = rq->ring->emit;
 
-	ret = add_timeline_barrier(rq);
-	if (ret)
-		goto err_unwind;
-
 	ret = engine->request_alloc(rq);
 	if (ret)
 		goto err_unwind;
diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index 2f4907364920..5fbea0892f33 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -253,7 +253,6 @@ int i915_timeline_init(struct drm_i915_private *i915,
 	spin_lock_init(&timeline->lock);
 	mutex_init(&timeline->mutex);
 
-	INIT_ACTIVE_REQUEST(&timeline->barrier);
 	INIT_ACTIVE_REQUEST(&timeline->last_request);
 	INIT_LIST_HEAD(&timeline->requests);
 
@@ -326,7 +325,6 @@ void i915_timeline_fini(struct i915_timeline *timeline)
 {
 	GEM_BUG_ON(timeline->pin_count);
 	GEM_BUG_ON(!list_empty(&timeline->requests));
-	GEM_BUG_ON(i915_active_request_isset(&timeline->barrier));
 
 	i915_syncmap_free(&timeline->sync);
 
diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
index 4ca7f80bdf6d..27668a1a69a3 100644
--- a/drivers/gpu/drm/i915/i915_timeline.h
+++ b/drivers/gpu/drm/i915/i915_timeline.h
@@ -110,19 +110,4 @@ void i915_timelines_init(struct drm_i915_private *i915);
 void i915_timelines_park(struct drm_i915_private *i915);
 void i915_timelines_fini(struct drm_i915_private *i915);
 
-/**
- * i915_timeline_set_barrier - orders submission between different timelines
- * @timeline: timeline to set the barrier on
- * @rq: request after which new submissions can proceed
- *
- * Sets the passed in request as the serialization point for all subsequent
- * submissions on @timeline. Subsequent requests will not be submitted to GPU
- * until the barrier has been completed.
- */
-static inline int
-i915_timeline_set_barrier(struct i915_timeline *tl, struct i915_request *rq)
-{
-	return i915_active_request_set(&tl->barrier, rq);
-}
-
 #endif
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index 1f5b55d9ffb5..5256a0b5c5f7 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -61,16 +61,6 @@ struct i915_timeline {
 	 */
 	struct i915_syncmap *sync;
 
-	/**
-	 * Barrier provides the ability to serialize ordering between different
-	 * timelines.
-	 *
-	 * Users can call i915_timeline_set_barrier which will make all
-	 * subsequent submissions to this timeline be executed only after the
-	 * barrier has been completed.
-	 */
-	struct i915_active_request barrier;
-
 	struct list_head link;
 	struct drm_i915_private *i915;
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c
index 416d85233263..e084476469ef 100644
--- a/drivers/gpu/drm/i915/selftests/mock_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c
@@ -16,7 +16,6 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context)
 	spin_lock_init(&timeline->lock);
 	mutex_init(&timeline->mutex);
 
-	INIT_ACTIVE_REQUEST(&timeline->barrier);
 	INIT_ACTIVE_REQUEST(&timeline->last_request);
 	INIT_LIST_HEAD(&timeline->requests);
 
-- 
2.20.1
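
To make the ordering argument concrete, here is a rough trace of the
timeline invariant (explanatory commentary only, not driver code):

	/*
	 * A timeline is FIFO: each new request awaits the current
	 * timeline->last_request before replacing it.
	 *
	 *   rq_A:           last_request = A
	 *   rq_B awaits A:  last_request = B        A -> B
	 *   rq_C awaits B:  last_request = C        A -> B -> C
	 *
	 * i915_active_request_set() performs that same await-then-replace
	 * step, so installing the RPCS-switch request as last_request runs
	 * it after all prior activity and before every later submission on
	 * the timeline, which is exactly the guarantee the separate
	 * timeline->barrier used to provide.
	 */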


* [PATCH 05/29] drm/i915: Store the default sseu setup on the engine
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (2 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 04/29] drm/i915: Consolidate the timeline->barrier Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 14:54   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 06/29] drm/i915: Move GraphicsTechnology files under gt/ Chris Wilson
                   ` (27 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

As we push for better compartmentalisation, it is more convenient to
copy the default sseu configuration from the engine into the derived
logical context than it is to dig it out from i915->runtime_info.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/Makefile.header-test     |   1 +
 drivers/gpu/drm/i915/i915_drv.h               |  14 --
 drivers/gpu/drm/i915/i915_gem_context.c       |   2 +-
 drivers/gpu/drm/i915/i915_perf.c              |   2 +-
 drivers/gpu/drm/i915/intel_context.c          |   4 +-
 drivers/gpu/drm/i915/intel_context_types.h    |  11 +-
 drivers/gpu/drm/i915/intel_device_info.h      |  28 +---
 drivers/gpu/drm/i915/intel_engine_cs.c        |   4 +
 drivers/gpu/drm/i915/intel_engine_types.h     |   3 +
 drivers/gpu/drm/i915/intel_lrc.c              | 134 +----------------
 drivers/gpu/drm/i915/intel_lrc.h              |   2 -
 drivers/gpu/drm/i915/intel_sseu.c             | 142 ++++++++++++++++++
 drivers/gpu/drm/i915/intel_sseu.h             |  67 +++++++++
 .../gpu/drm/i915/selftests/i915_gem_context.c |   5 +-
 15 files changed, 226 insertions(+), 194 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_sseu.c
 create mode 100644 drivers/gpu/drm/i915/intel_sseu.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index fbcb0904f4a8..53ff209b91bb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -95,6 +95,7 @@ i915-y += \
 	  intel_lrc.o \
 	  intel_mocs.o \
 	  intel_ringbuffer.o \
+	  intel_sseu.o \
 	  intel_uncore.o \
 	  intel_wopcm.o
 
diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
index c1c391816fa7..5bcc78d7ac96 100644
--- a/drivers/gpu/drm/i915/Makefile.header-test
+++ b/drivers/gpu/drm/i915/Makefile.header-test
@@ -33,6 +33,7 @@ header_test := \
 	intel_psr.h \
 	intel_sdvo.h \
 	intel_sprite.h \
+	intel_sseu.h \
 	intel_tv.h \
 	intel_workarounds_types.h
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63eca3061d10..6e46fbc7b350 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3388,20 +3388,6 @@ mkwrite_device_info(struct drm_i915_private *dev_priv)
 	return (struct intel_device_info *)INTEL_INFO(dev_priv);
 }
 
-static inline struct intel_sseu
-intel_device_default_sseu(struct drm_i915_private *i915)
-{
-	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
-	struct intel_sseu value = {
-		.slice_mask = sseu->slice_mask,
-		.subslice_mask = sseu->subslice_mask[0],
-		.min_eus_per_subslice = sseu->max_eus_per_subslice,
-		.max_eus_per_subslice = sseu->max_eus_per_subslice,
-	};
-
-	return value;
-}
-
 /* modesetting */
 extern void intel_modeset_init_hw(struct drm_device *dev);
 extern int intel_modeset_init(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 7fc34ab6df87..0883a879d3e7 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1156,7 +1156,7 @@ static int gen8_emit_rpcs_config(struct i915_request *rq,
 	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
 	*cs++ = lower_32_bits(offset);
 	*cs++ = upper_32_bits(offset);
-	*cs++ = gen8_make_rpcs(rq->i915, &sseu);
+	*cs++ = intel_sseu_make_rpcs(rq->i915, &sseu);
 
 	intel_ring_advance(rq, cs);
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 39a4804091d7..56da457bed21 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1679,7 +1679,7 @@ gen8_update_reg_state_unlocked(struct intel_context *ce,
 
 	CTX_REG(reg_state,
 		CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
-		gen8_make_rpcs(i915, &ce->sseu));
+		intel_sseu_make_rpcs(i915, &ce->sseu));
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
index 8931e0fee873..961d1445833d 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/intel_context.c
@@ -230,15 +230,13 @@ intel_context_init(struct intel_context *ce,
 	ce->gem_context = ctx;
 	ce->engine = engine;
 	ce->ops = engine->cops;
+	ce->sseu = engine->sseu;
 
 	INIT_LIST_HEAD(&ce->signal_link);
 	INIT_LIST_HEAD(&ce->signals);
 
 	mutex_init(&ce->pin_mutex);
 
-	/* Use the whole device by default */
-	ce->sseu = intel_device_default_sseu(ctx->i915);
-
 	i915_active_request_init(&ce->active_tracker,
 				 NULL, intel_context_retire);
 }
diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/intel_context_types.h
index 624729a35875..2109610e21e8 100644
--- a/drivers/gpu/drm/i915/intel_context_types.h
+++ b/drivers/gpu/drm/i915/intel_context_types.h
@@ -14,6 +14,7 @@
 #include <linux/types.h>
 
 #include "i915_active_types.h"
+#include "intel_sseu.h"
 
 struct i915_gem_context;
 struct i915_vma;
@@ -27,16 +28,6 @@ struct intel_context_ops {
 	void (*destroy)(struct kref *kref);
 };
 
-/*
- * Powergating configuration for a particular (context,engine).
- */
-struct intel_sseu {
-	u8 slice_mask;
-	u8 subslice_mask;
-	u8 min_eus_per_subslice;
-	u8 max_eus_per_subslice;
-};
-
 struct intel_context {
 	struct kref ref;
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 0e579f158016..3045e0dee2a1 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -29,6 +29,7 @@
 
 #include "intel_engine_types.h"
 #include "intel_display.h"
+#include "intel_sseu.h"
 
 struct drm_printer;
 struct drm_i915_private;
@@ -139,33 +140,6 @@ enum intel_ppgtt_type {
 	func(overlay_needs_physical); \
 	func(supports_tv);
 
-#define GEN_MAX_SLICES		(6) /* CNL upper bound */
-#define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
-
-struct sseu_dev_info {
-	u8 slice_mask;
-	u8 subslice_mask[GEN_MAX_SLICES];
-	u16 eu_total;
-	u8 eu_per_subslice;
-	u8 min_eu_in_pool;
-	/* For each slice, which subslice(s) has(have) 7 EUs (bitfield)? */
-	u8 subslice_7eu[3];
-	u8 has_slice_pg:1;
-	u8 has_subslice_pg:1;
-	u8 has_eu_pg:1;
-
-	/* Topology fields */
-	u8 max_slices;
-	u8 max_subslices;
-	u8 max_eus_per_subslice;
-
-	/* We don't have more than 8 eus per subslice at the moment and as we
-	 * store eus enabled using bits, no need to multiply by eus per
-	 * subslice.
-	 */
-	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
-};
-
 struct intel_device_info {
 	u16 gen_mask;
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index d0427c2e3997..62f4b7cfe0df 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -588,6 +588,10 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_batch_pool(engine);
 	intel_engine_init_cmd_parser(engine);
 
+	/* Use the whole device by default */
+	engine->sseu =
+		intel_device_default_sseu(&RUNTIME_INFO(engine->i915)->sseu);
+
 	return 0;
 
 err_hwsp:
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 6c6d8a9aca94..60cd1299a8f2 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -17,6 +17,7 @@
 #include "i915_priolist_types.h"
 #include "i915_selftest.h"
 #include "i915_timeline_types.h"
+#include "intel_sseu.h"
 #include "intel_workarounds_types.h"
 
 #include "i915_gem_batch_pool.h"
@@ -273,6 +274,8 @@ struct intel_engine_cs {
 	u32 context_size;
 	u32 mmio_base;
 
+	struct intel_sseu sseu;
+
 	struct intel_ring *buffer;
 
 	struct i915_timeline timeline;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6931dbb2888c..fac3968d7f1b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1321,7 +1321,7 @@ __execlists_update_reg_state(struct intel_context *ce,
 	/* RPCS */
 	if (engine->class == RENDER_CLASS)
 		regs[CTX_R_PWR_CLK_STATE + 1] =
-			gen8_make_rpcs(engine->i915, &ce->sseu);
+			intel_sseu_make_rpcs(engine->i915, &ce->sseu);
 }
 
 static int
@@ -2501,138 +2501,6 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine)
 	return logical_ring_init(engine);
 }
 
-u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu)
-{
-	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
-	bool subslice_pg = sseu->has_subslice_pg;
-	struct intel_sseu ctx_sseu;
-	u8 slices, subslices;
-	u32 rpcs = 0;
-
-	/*
-	 * No explicit RPCS request is needed to ensure full
-	 * slice/subslice/EU enablement prior to Gen9.
-	*/
-	if (INTEL_GEN(i915) < 9)
-		return 0;
-
-	/*
-	 * If i915/perf is active, we want a stable powergating configuration
-	 * on the system.
-	 *
-	 * We could choose full enablement, but on ICL we know there are use
-	 * cases which disable slices for functional, apart for performance
-	 * reasons. So in this case we select a known stable subset.
-	 */
-	if (!i915->perf.oa.exclusive_stream) {
-		ctx_sseu = *req_sseu;
-	} else {
-		ctx_sseu = intel_device_default_sseu(i915);
-
-		if (IS_GEN(i915, 11)) {
-			/*
-			 * We only need subslice count so it doesn't matter
-			 * which ones we select - just turn off low bits in the
-			 * amount of half of all available subslices per slice.
-			 */
-			ctx_sseu.subslice_mask =
-				~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2));
-			ctx_sseu.slice_mask = 0x1;
-		}
-	}
-
-	slices = hweight8(ctx_sseu.slice_mask);
-	subslices = hweight8(ctx_sseu.subslice_mask);
-
-	/*
-	 * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits
-	 * wide and Icelake has up to eight subslices, specfial programming is
-	 * needed in order to correctly enable all subslices.
-	 *
-	 * According to documentation software must consider the configuration
-	 * as 2x4x8 and hardware will translate this to 1x8x8.
-	 *
-	 * Furthemore, even though SScount is three bits, maximum documented
-	 * value for it is four. From this some rules/restrictions follow:
-	 *
-	 * 1.
-	 * If enabled subslice count is greater than four, two whole slices must
-	 * be enabled instead.
-	 *
-	 * 2.
-	 * When more than one slice is enabled, hardware ignores the subslice
-	 * count altogether.
-	 *
-	 * From these restrictions it follows that it is not possible to enable
-	 * a count of subslices between the SScount maximum of four restriction,
-	 * and the maximum available number on a particular SKU. Either all
-	 * subslices are enabled, or a count between one and four on the first
-	 * slice.
-	 */
-	if (IS_GEN(i915, 11) &&
-	    slices == 1 &&
-	    subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {
-		GEM_BUG_ON(subslices & 1);
-
-		subslice_pg = false;
-		slices *= 2;
-	}
-
-	/*
-	 * Starting in Gen9, render power gating can leave
-	 * slice/subslice/EU in a partially enabled state. We
-	 * must make an explicit request through RPCS for full
-	 * enablement.
-	*/
-	if (sseu->has_slice_pg) {
-		u32 mask, val = slices;
-
-		if (INTEL_GEN(i915) >= 11) {
-			mask = GEN11_RPCS_S_CNT_MASK;
-			val <<= GEN11_RPCS_S_CNT_SHIFT;
-		} else {
-			mask = GEN8_RPCS_S_CNT_MASK;
-			val <<= GEN8_RPCS_S_CNT_SHIFT;
-		}
-
-		GEM_BUG_ON(val & ~mask);
-		val &= mask;
-
-		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE | val;
-	}
-
-	if (subslice_pg) {
-		u32 val = subslices;
-
-		val <<= GEN8_RPCS_SS_CNT_SHIFT;
-
-		GEM_BUG_ON(val & ~GEN8_RPCS_SS_CNT_MASK);
-		val &= GEN8_RPCS_SS_CNT_MASK;
-
-		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val;
-	}
-
-	if (sseu->has_eu_pg) {
-		u32 val;
-
-		val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT;
-		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK);
-		val &= GEN8_RPCS_EU_MIN_MASK;
-
-		rpcs |= val;
-
-		val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT;
-		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK);
-		val &= GEN8_RPCS_EU_MAX_MASK;
-
-		rpcs |= val;
-
-		rpcs |= GEN8_RPCS_ENABLE;
-	}
-
-	return rpcs;
-}
-
 static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
 {
 	u32 indirect_ctx_offset;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 92642ab91472..72488e293d77 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -112,6 +112,4 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 							const char *prefix),
 				   unsigned int max);
 
-u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
-
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_sseu.c b/drivers/gpu/drm/i915/intel_sseu.c
new file mode 100644
index 000000000000..cfc80813f662
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_sseu.c
@@ -0,0 +1,142 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_lrc_reg.h"
+#include "intel_sseu.h"
+
+u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
+			 const struct intel_sseu *req_sseu)
+{
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
+	bool subslice_pg = sseu->has_subslice_pg;
+	struct intel_sseu ctx_sseu;
+	u8 slices, subslices;
+	u32 rpcs = 0;
+
+	/*
+	 * No explicit RPCS request is needed to ensure full
+	 * slice/subslice/EU enablement prior to Gen9.
+	 */
+	if (INTEL_GEN(i915) < 9)
+		return 0;
+
+	/*
+	 * If i915/perf is active, we want a stable powergating configuration
+	 * on the system.
+	 *
+	 * We could choose full enablement, but on ICL we know there are use
+	 * cases which disable slices for functional, apart for performance
+	 * reasons. So in this case we select a known stable subset.
+	 */
+	if (!i915->perf.oa.exclusive_stream) {
+		ctx_sseu = *req_sseu;
+	} else {
+		ctx_sseu = intel_device_default_sseu(sseu);
+
+		if (IS_GEN(i915, 11)) {
+			/*
+			 * We only need subslice count so it doesn't matter
+			 * which ones we select - just turn off low bits in the
+			 * amount of half of all available subslices per slice.
+			 */
+			ctx_sseu.subslice_mask =
+				~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2));
+			ctx_sseu.slice_mask = 0x1;
+		}
+	}
+
+	slices = hweight8(ctx_sseu.slice_mask);
+	subslices = hweight8(ctx_sseu.subslice_mask);
+
+	/*
+	 * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits
+	 * wide and Icelake has up to eight subslices, specfial programming is
+	 * needed in order to correctly enable all subslices.
+	 *
+	 * According to documentation software must consider the configuration
+	 * as 2x4x8 and hardware will translate this to 1x8x8.
+	 *
+	 * Furthemore, even though SScount is three bits, maximum documented
+	 * value for it is four. From this some rules/restrictions follow:
+	 *
+	 * 1.
+	 * If enabled subslice count is greater than four, two whole slices must
+	 * be enabled instead.
+	 *
+	 * 2.
+	 * When more than one slice is enabled, hardware ignores the subslice
+	 * count altogether.
+	 *
+	 * From these restrictions it follows that it is not possible to enable
+	 * a count of subslices between the SScount maximum of four restriction,
+	 * and the maximum available number on a particular SKU. Either all
+	 * subslices are enabled, or a count between one and four on the first
+	 * slice.
+	 */
+	if (IS_GEN(i915, 11) &&
+	    slices == 1 &&
+	    subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {
+		GEM_BUG_ON(subslices & 1);
+
+		subslice_pg = false;
+		slices *= 2;
+	}
+
+	/*
+	 * Starting in Gen9, render power gating can leave
+	 * slice/subslice/EU in a partially enabled state. We
+	 * must make an explicit request through RPCS for full
+	 * enablement.
+	 */
+	if (sseu->has_slice_pg) {
+		u32 mask, val = slices;
+
+		if (INTEL_GEN(i915) >= 11) {
+			mask = GEN11_RPCS_S_CNT_MASK;
+			val <<= GEN11_RPCS_S_CNT_SHIFT;
+		} else {
+			mask = GEN8_RPCS_S_CNT_MASK;
+			val <<= GEN8_RPCS_S_CNT_SHIFT;
+		}
+
+		GEM_BUG_ON(val & ~mask);
+		val &= mask;
+
+		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE | val;
+	}
+
+	if (subslice_pg) {
+		u32 val = subslices;
+
+		val <<= GEN8_RPCS_SS_CNT_SHIFT;
+
+		GEM_BUG_ON(val & ~GEN8_RPCS_SS_CNT_MASK);
+		val &= GEN8_RPCS_SS_CNT_MASK;
+
+		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val;
+	}
+
+	if (sseu->has_eu_pg) {
+		u32 val;
+
+		val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT;
+		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK);
+		val &= GEN8_RPCS_EU_MIN_MASK;
+
+		rpcs |= val;
+
+		val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT;
+		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK);
+		val &= GEN8_RPCS_EU_MAX_MASK;
+
+		rpcs |= val;
+
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	return rpcs;
+}
diff --git a/drivers/gpu/drm/i915/intel_sseu.h b/drivers/gpu/drm/i915/intel_sseu.h
new file mode 100644
index 000000000000..bf6fa019fd00
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_sseu.h
@@ -0,0 +1,67 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __INTEL_SSEU_H__
+#define __INTEL_SSEU_H__
+
+#include <linux/types.h>
+
+struct drm_i915_private;
+
+#define GEN_MAX_SLICES		(6) /* CNL upper bound */
+#define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
+
+struct sseu_dev_info {
+	u8 slice_mask;
+	u8 subslice_mask[GEN_MAX_SLICES];
+	u16 eu_total;
+	u8 eu_per_subslice;
+	u8 min_eu_in_pool;
+	/* For each slice, which subslice(s) has(have) 7 EUs (bitfield)? */
+	u8 subslice_7eu[3];
+	u8 has_slice_pg:1;
+	u8 has_subslice_pg:1;
+	u8 has_eu_pg:1;
+
+	/* Topology fields */
+	u8 max_slices;
+	u8 max_subslices;
+	u8 max_eus_per_subslice;
+
+	/* We don't have more than 8 eus per subslice at the moment and as we
+	 * store eus enabled using bits, no need to multiply by eus per
+	 * subslice.
+	 */
+	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
+};
+
+/*
+ * Powergating configuration for a particular (context,engine).
+ */
+struct intel_sseu {
+	u8 slice_mask;
+	u8 subslice_mask;
+	u8 min_eus_per_subslice;
+	u8 max_eus_per_subslice;
+};
+
+static inline struct intel_sseu
+intel_device_default_sseu(const struct sseu_dev_info *sseu)
+{
+	struct intel_sseu value = {
+		.slice_mask = sseu->slice_mask,
+		.subslice_mask = sseu->subslice_mask[0],
+		.min_eus_per_subslice = sseu->max_eus_per_subslice,
+		.max_eus_per_subslice = sseu->max_eus_per_subslice,
+	};
+
+	return value;
+}
+
+u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
+			 const struct intel_sseu *req_sseu);
+
+#endif /* __INTEL_SSEU_H__ */
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 4e1b6efc6b22..e1cb22f03e8e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -962,8 +962,7 @@ __sseu_finish(struct drm_i915_private *i915,
 	      unsigned int expected,
 	      struct igt_spinner *spin)
 {
-	unsigned int slices =
-		hweight32(intel_device_default_sseu(i915).slice_mask);
+	unsigned int slices = hweight32(engine->sseu.slice_mask);
 	u32 rpcs = 0;
 	int ret = 0;
 
@@ -1047,8 +1046,8 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 	       const char *name,
 	       unsigned int flags)
 {
-	struct intel_sseu default_sseu = intel_device_default_sseu(i915);
 	struct intel_engine_cs *engine = i915->engine[RCS0];
+	struct intel_sseu default_sseu = engine->sseu;
 	struct drm_i915_gem_object *obj;
 	struct i915_gem_context *ctx;
 	struct intel_sseu pg_sseu;
-- 
2.20.1
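
As a worked example of the Icelake SScount restriction described in the
comment above, assume a hypothetical 1x8x8 part (slice_mask = 0x1,
subslice_mask[0] = 0xff), no active perf stream, and the full default
sseu requested:

	/*
	 *   slices    = hweight8(0x1)  = 1
	 *   subslices = hweight8(0xff) = 8
	 *
	 * subslices (8) > min(4, hweight8(0xff) / 2) = 4, so the special
	 * case fires: subslice_pg is dropped and slices is doubled to 2.
	 * The request is therefore programmed as 2x4x8, which the hardware
	 * translates back into the real 1x8x8 enablement.
	 */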


* [PATCH 06/29] drm/i915: Move GraphicsTechnology files under gt/
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (3 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 05/29] drm/i915: Store the default sseu setup on the engine Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08  9:17 ` [PATCH 07/29] drm/i915: Only reset the pinned kernel contexts on resume Chris Wilson
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device-facing code under gt/.

One casualty is s/intel_ringbuffer.h/intel_engine.h/, with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple code motion so that git can fix up in-flight patches with
little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                 | 46 ++++++++++++-------
 drivers/gpu/drm/i915/Makefile.header-test     |  6 +--
 drivers/gpu/drm/i915/gt/Makefile              |  2 +
 drivers/gpu/drm/i915/gt/Makefile.header-test  | 16 +++++++
 .../gpu/drm/i915/{ => gt}/intel_breadcrumbs.c |  0
 drivers/gpu/drm/i915/{ => gt}/intel_context.c |  3 +-
 drivers/gpu/drm/i915/{ => gt}/intel_context.h |  0
 .../drm/i915/{ => gt}/intel_context_types.h   |  0
 .../{intel_ringbuffer.h => gt/intel_engine.h} |  0
 .../gpu/drm/i915/{ => gt}/intel_engine_cs.c   |  8 ++--
 .../drm/i915/{ => gt}/intel_engine_types.h    |  5 +-
 .../drm/i915/{ => gt}/intel_gpu_commands.h    |  0
 .../gpu/drm/i915/{ => gt}/intel_hangcheck.c   |  4 +-
 drivers/gpu/drm/i915/{ => gt}/intel_lrc.c     |  5 +-
 drivers/gpu/drm/i915/{ => gt}/intel_lrc.h     |  4 +-
 drivers/gpu/drm/i915/{ => gt}/intel_lrc_reg.h |  0
 drivers/gpu/drm/i915/{ => gt}/intel_mocs.c    |  4 +-
 drivers/gpu/drm/i915/{ => gt}/intel_mocs.h    |  4 +-
 .../i915/{i915_reset.c => gt/intel_reset.c}   |  2 +-
 .../i915/{i915_reset.h => gt/intel_reset.h}   |  2 +-
 .../gpu/drm/i915/{ => gt}/intel_ringbuffer.c  |  3 +-
 drivers/gpu/drm/i915/{ => gt}/intel_sseu.c    |  0
 drivers/gpu/drm/i915/{ => gt}/intel_sseu.h    |  0
 .../gpu/drm/i915/{ => gt}/intel_workarounds.c |  2 +-
 .../gpu/drm/i915/{ => gt}/intel_workarounds.h |  8 +++-
 .../i915/{ => gt}/intel_workarounds_types.h   |  0
 .../drm/i915/{selftests => gt}/mock_engine.c  | 10 ++--
 .../drm/i915/{selftests => gt}/mock_engine.h  |  2 +-
 .../selftest_engine_cs.c}                     |  0
 .../selftest_hangcheck.c}                     | 16 +++----
 .../intel_lrc.c => gt/selftest_lrc.c}         | 16 +++----
 .../selftest_workarounds.c}                   | 18 ++++----
 drivers/gpu/drm/i915/i915_cmd_parser.c        |  3 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  3 +-
 drivers/gpu/drm/i915/i915_drv.c               |  5 +-
 drivers/gpu/drm/i915/i915_drv.h               |  7 +--
 drivers/gpu/drm/i915/i915_gem.c               |  7 +--
 drivers/gpu/drm/i915/i915_gem_context.c       |  7 ++-
 drivers/gpu/drm/i915/i915_gem_context.h       |  3 +-
 drivers/gpu/drm/i915/i915_gem_context_types.h |  3 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c           |  1 -
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.h         |  3 +-
 drivers/gpu/drm/i915/i915_perf.c              |  3 +-
 drivers/gpu/drm/i915/i915_pmu.c               |  4 +-
 drivers/gpu/drm/i915/i915_request.c           |  1 -
 drivers/gpu/drm/i915/i915_scheduler_types.h   |  2 +-
 drivers/gpu/drm/i915/i915_trace.h             |  3 +-
 drivers/gpu/drm/i915/i915_vma.c               |  3 +-
 drivers/gpu/drm/i915/intel_device_info.h      |  6 ++-
 drivers/gpu/drm/i915/intel_display.c          |  1 -
 drivers/gpu/drm/i915/intel_guc_submission.c   |  3 +-
 drivers/gpu/drm/i915/intel_guc_submission.h   |  3 +-
 drivers/gpu/drm/i915/intel_uc.c               |  2 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c |  5 +-
 drivers/gpu/drm/i915/selftests/igt_reset.c    |  3 +-
 drivers/gpu/drm/i915/selftests/igt_spinner.h  |  3 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 +-
 drivers/gpu/drm/i915/selftests/mock_request.c |  3 +-
 59 files changed, 166 insertions(+), 112 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/Makefile
 create mode 100644 drivers/gpu/drm/i915/gt/Makefile.header-test
 rename drivers/gpu/drm/i915/{ => gt}/intel_breadcrumbs.c (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_context.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_context.h (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_context_types.h (100%)
 rename drivers/gpu/drm/i915/{intel_ringbuffer.h => gt/intel_engine.h} (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_engine_cs.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_engine_types.h (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_gpu_commands.h (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_hangcheck.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_lrc.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_lrc.h (98%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_lrc_reg.h (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_mocs.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_mocs.h (97%)
 rename drivers/gpu/drm/i915/{i915_reset.c => gt/intel_reset.c} (99%)
 rename drivers/gpu/drm/i915/{i915_reset.h => gt/intel_reset.h} (98%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_ringbuffer.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_sseu.c (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_sseu.h (100%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_workarounds.c (99%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_workarounds.h (87%)
 rename drivers/gpu/drm/i915/{ => gt}/intel_workarounds_types.h (100%)
 rename drivers/gpu/drm/i915/{selftests => gt}/mock_engine.c (97%)
 rename drivers/gpu/drm/i915/{selftests => gt}/mock_engine.h (98%)
 rename drivers/gpu/drm/i915/{selftests/intel_engine_cs.c => gt/selftest_engine_cs.c} (100%)
 rename drivers/gpu/drm/i915/{selftests/intel_hangcheck.c => gt/selftest_hangcheck.c} (99%)
 rename drivers/gpu/drm/i915/{selftests/intel_lrc.c => gt/selftest_lrc.c} (99%)
 rename drivers/gpu/drm/i915/{selftests/intel_workarounds.c => gt/selftest_workarounds.c} (98%)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 53ff209b91bb..40130cf5c003 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,32 +35,53 @@ subdir-ccflags-y += \
 # Extra header tests
 include $(src)/Makefile.header-test
 
+subdir-ccflags-y += -I$(src)
+
 # Please keep these build lists sorted!
 
 # core driver code
 i915-y += i915_drv.o \
 	  i915_irq.o \
-	  i915_memcpy.o \
-	  i915_mm.o \
 	  i915_params.o \
 	  i915_pci.o \
-	  i915_reset.o \
 	  i915_suspend.o \
-	  i915_sw_fence.o \
-	  i915_syncmap.o \
 	  i915_sysfs.o \
-	  i915_user_extensions.o \
 	  intel_csr.o \
 	  intel_device_info.o \
 	  intel_pm.o \
 	  intel_runtime_pm.o \
-	  intel_workarounds.o
+	  intel_uncore.o
+
+# core library code
+i915-y += \
+	i915_memcpy.o \
+	i915_mm.o \
+	i915_sw_fence.o \
+	i915_syncmap.o \
+	i915_user_extensions.o
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
-# GEM code
+# "Graphics Technology" (aka we talk to the gpu)
+obj-y += gt/
+gt-y += \
+	gt/intel_breadcrumbs.o \
+	gt/intel_context.o \
+	gt/intel_engine_cs.o \
+	gt/intel_hangcheck.o \
+	gt/intel_lrc.o \
+	gt/intel_reset.o \
+	gt/intel_ringbuffer.o \
+	gt/intel_mocs.o \
+	gt/intel_sseu.o \
+	gt/intel_workarounds.o
+gt-$(CONFIG_DRM_I915_SELFTEST) += \
+	gt/mock_engine.o
+i915-y += $(gt-y)
+
+# GEM (Graphics Execution Management) code
 i915-y += \
 	  i915_active.o \
 	  i915_cmd_parser.o \
@@ -88,15 +109,6 @@ i915-y += \
 	  i915_timeline.o \
 	  i915_trace_points.o \
 	  i915_vma.o \
-	  intel_breadcrumbs.o \
-	  intel_context.o \
-	  intel_engine_cs.o \
-	  intel_hangcheck.o \
-	  intel_lrc.o \
-	  intel_mocs.o \
-	  intel_ringbuffer.o \
-	  intel_sseu.o \
-	  intel_uncore.o \
 	  intel_wopcm.o
 
 # general-purpose microcontroller (GuC) support
diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
index 5bcc78d7ac96..96a5d90629ec 100644
--- a/drivers/gpu/drm/i915/Makefile.header-test
+++ b/drivers/gpu/drm/i915/Makefile.header-test
@@ -13,13 +13,11 @@ header_test := \
 	intel_cdclk.h \
 	intel_color.h \
 	intel_connector.h \
-	intel_context_types.h \
 	intel_crt.h \
 	intel_csr.h \
 	intel_ddi.h \
 	intel_dp.h \
 	intel_dvo.h \
-	intel_engine_types.h \
 	intel_fbc.h \
 	intel_fbdev.h \
 	intel_frontbuffer.h \
@@ -33,9 +31,7 @@ header_test := \
 	intel_psr.h \
 	intel_sdvo.h \
 	intel_sprite.h \
-	intel_sseu.h \
-	intel_tv.h \
-	intel_workarounds_types.h
+	intel_tv.h
 
 quiet_cmd_header_test = HDRTEST $@
       cmd_header_test = echo "\#include \"$(<F)\"" > $@
diff --git a/drivers/gpu/drm/i915/gt/Makefile b/drivers/gpu/drm/i915/gt/Makefile
new file mode 100644
index 000000000000..1c75b5c9790c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/Makefile
@@ -0,0 +1,2 @@
+# Extra header tests
+include $(src)/Makefile.header-test
diff --git a/drivers/gpu/drm/i915/gt/Makefile.header-test b/drivers/gpu/drm/i915/gt/Makefile.header-test
new file mode 100644
index 000000000000..61e06cbb4b32
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/Makefile.header-test
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: MIT
+# Copyright © 2019 Intel Corporation
+
+# Test the headers are compilable as standalone units
+header_test := $(notdir $(wildcard $(src)/*.h))
+
+quiet_cmd_header_test = HDRTEST $@
+      cmd_header_test = echo "\#include \"$(<F)\"" > $@
+
+header_test_%.c: %.h
+	$(call cmd,header_test)
+
+extra-$(CONFIG_DRM_I915_WERROR) += \
+	$(foreach h,$(header_test),$(patsubst %.h,header_test_%.o,$(h)))
+
+clean-files += $(foreach h,$(header_test),$(patsubst %.h,header_test_%.c,$(h)))
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
similarity index 100%
rename from drivers/gpu/drm/i915/intel_breadcrumbs.c
rename to drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_context.c
rename to drivers/gpu/drm/i915/gt/intel_context.c
index 961d1445833d..ebd1e5919a4a 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -7,8 +7,9 @@
 #include "i915_drv.h"
 #include "i915_gem_context.h"
 #include "i915_globals.h"
+
 #include "intel_context.h"
-#include "intel_ringbuffer.h"
+#include "intel_engine.h"
 
 static struct i915_global_context {
 	struct i915_global base;
diff --git a/drivers/gpu/drm/i915/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_context.h
rename to drivers/gpu/drm/i915/gt/intel_context.h
diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_context_types.h
rename to drivers/gpu/drm/i915/gt/intel_context_types.h
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/gt/intel_engine.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_ringbuffer.h
rename to drivers/gpu/drm/i915/gt/intel_engine.h
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_engine_cs.c
rename to drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 62f4b7cfe0df..68eebf292314 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -25,9 +25,10 @@
 #include <drm/drm_print.h>
 
 #include "i915_drv.h"
-#include "i915_reset.h"
-#include "intel_ringbuffer.h"
+
+#include "intel_engine.h"
 #include "intel_lrc.h"
+#include "intel_reset.h"
 
 /* Haswell does have the CXT_SIZE register however it does not appear to be
  * valid. Now, docs explain in dwords what is in the context object. The full
@@ -1739,6 +1740,5 @@ intel_engine_find_active_request(struct intel_engine_cs *engine)
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/mock_engine.c"
-#include "selftests/intel_engine_cs.c"
+#include "selftest_engine_cs.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
similarity index 99%
rename from drivers/gpu/drm/i915/intel_engine_types.h
rename to drivers/gpu/drm/i915/gt/intel_engine_types.h
index 60cd1299a8f2..51faf6d40ba5 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -14,15 +14,14 @@
 #include <linux/types.h>
 
 #include "i915_gem.h"
+#include "i915_gem_batch_pool.h"
+#include "i915_pmu.h"
 #include "i915_priolist_types.h"
 #include "i915_selftest.h"
 #include "i915_timeline_types.h"
 #include "intel_sseu.h"
 #include "intel_workarounds_types.h"
 
-#include "i915_gem_batch_pool.h"
-#include "i915_pmu.h"
-
 #define I915_MAX_SLICES	3
 #define I915_MAX_SUBSLICES 8
 
diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_gpu_commands.h
rename to drivers/gpu/drm/i915/gt/intel_gpu_commands.h
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_hangcheck.c
rename to drivers/gpu/drm/i915/gt/intel_hangcheck.c
index 3d51ed1428d4..3053a706a561 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
@@ -22,8 +22,8 @@
  *
  */
 
+#include "intel_reset.h"
 #include "i915_drv.h"
-#include "i915_reset.h"
 
 struct hangcheck {
 	u64 acthd;
@@ -330,5 +330,5 @@ void intel_hangcheck_init(struct drm_i915_private *i915)
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/intel_hangcheck.c"
+#include "selftest_hangcheck.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_lrc.c
rename to drivers/gpu/drm/i915/gt/intel_lrc.c
index fac3968d7f1b..85bdae1db76c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -133,13 +133,12 @@
  */
 #include <linux/interrupt.h>
 
-#include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_gem_render_state.h"
-#include "i915_reset.h"
 #include "i915_vgpu.h"
 #include "intel_lrc_reg.h"
 #include "intel_mocs.h"
+#include "intel_reset.h"
 #include "intel_workarounds.h"
 
 #define RING_EXECLIST_QFULL		(1 << 0x2)
@@ -2849,5 +2848,5 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/intel_lrc.c"
+#include "selftest_lrc.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
similarity index 98%
rename from drivers/gpu/drm/i915/intel_lrc.h
rename to drivers/gpu/drm/i915/gt/intel_lrc.h
index 72488e293d77..b3274e707423 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -24,8 +24,7 @@
 #ifndef _INTEL_LRC_H_
 #define _INTEL_LRC_H_
 
-#include "intel_ringbuffer.h"
-#include "i915_gem_context.h"
+#include "intel_engine.h"
 
 /* Execlists regs */
 #define RING_ELSP(base)				_MMIO((base) + 0x230)
@@ -100,7 +99,6 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine);
 struct drm_printer;
 
 struct drm_i915_private;
-struct i915_gem_context;
 
 void intel_lr_context_resume(struct drm_i915_private *dev_priv);
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_lrc_reg.h
rename to drivers/gpu/drm/i915/gt/intel_lrc_reg.h
diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_mocs.c
rename to drivers/gpu/drm/i915/gt/intel_mocs.c
index 274ba78500c0..79df66022d3a 100644
--- a/drivers/gpu/drm/i915/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -20,9 +20,11 @@
  * SOFTWARE.
  */
 
+#include "i915_drv.h"
+
+#include "intel_engine.h"
 #include "intel_mocs.h"
 #include "intel_lrc.h"
-#include "intel_ringbuffer.h"
 
 /* structures required */
 struct drm_i915_mocs_entry {
diff --git a/drivers/gpu/drm/i915/intel_mocs.h b/drivers/gpu/drm/i915/gt/intel_mocs.h
similarity index 97%
rename from drivers/gpu/drm/i915/intel_mocs.h
rename to drivers/gpu/drm/i915/gt/intel_mocs.h
index 3d99d1271b2b..0913704a1af2 100644
--- a/drivers/gpu/drm/i915/intel_mocs.h
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.h
@@ -49,7 +49,9 @@
  * context handling keep the MOCS in step.
  */
 
-#include "i915_drv.h"
+struct drm_i915_private;
+struct i915_request;
+struct intel_engine_cs;
 
 int intel_rcs_context_init_mocs(struct i915_request *rq);
 void intel_mocs_init_l3cc_table(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
similarity index 99%
rename from drivers/gpu/drm/i915/i915_reset.c
rename to drivers/gpu/drm/i915/gt/intel_reset.c
index 68875ba43b8d..dce49b8b1118 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -9,7 +9,7 @@
 
 #include "i915_drv.h"
 #include "i915_gpu_error.h"
-#include "i915_reset.h"
+#include "intel_reset.h"
 
 #include "intel_guc.h"
 
diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/gt/intel_reset.h
similarity index 98%
rename from drivers/gpu/drm/i915/i915_reset.h
rename to drivers/gpu/drm/i915/gt/intel_reset.h
index 3c0450289b8f..8e662bb43a9b 100644
--- a/drivers/gpu/drm/i915/i915_reset.h
+++ b/drivers/gpu/drm/i915/gt/intel_reset.h
@@ -11,7 +11,7 @@
 #include <linux/types.h>
 #include <linux/srcu.h>
 
-#include "intel_engine_types.h"
+#include "gt/intel_engine_types.h"
 
 struct drm_i915_private;
 struct i915_request;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_ringbuffer.c
rename to drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index 8a19eee9c5d4..be83cb8167f3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -33,9 +33,8 @@
 
 #include "i915_drv.h"
 #include "i915_gem_render_state.h"
-#include "i915_reset.h"
 #include "i915_trace.h"
-#include "intel_drv.h"
+#include "intel_reset.h"
 #include "intel_workarounds.h"
 
 /* Rough estimate of the typical request size, performing a flush,
diff --git a/drivers/gpu/drm/i915/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
similarity index 100%
rename from drivers/gpu/drm/i915/intel_sseu.c
rename to drivers/gpu/drm/i915/gt/intel_sseu.c
diff --git a/drivers/gpu/drm/i915/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_sseu.h
rename to drivers/gpu/drm/i915/gt/intel_sseu.h
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
similarity index 99%
rename from drivers/gpu/drm/i915/intel_workarounds.c
rename to drivers/gpu/drm/i915/gt/intel_workarounds.c
index a04dbc58ec1c..2d6d17ee3601 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1257,5 +1257,5 @@ void intel_engine_apply_workarounds(struct intel_engine_cs *engine)
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/intel_workarounds.c"
+#include "selftest_workarounds.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_workarounds.h b/drivers/gpu/drm/i915/gt/intel_workarounds.h
similarity index 87%
rename from drivers/gpu/drm/i915/intel_workarounds.h
rename to drivers/gpu/drm/i915/gt/intel_workarounds.h
index a1bf51c611a9..82882f43b2ec 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.h
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.h
@@ -4,13 +4,17 @@
  * Copyright © 2014-2018 Intel Corporation
  */
 
-#ifndef _I915_WORKAROUNDS_H_
-#define _I915_WORKAROUNDS_H_
+#ifndef _INTEL_WORKAROUNDS_H_
+#define _INTEL_WORKAROUNDS_H_
 
 #include <linux/slab.h>
 
 #include "intel_workarounds_types.h"
 
+struct drm_i915_private;
+struct i915_request;
+struct intel_engine_cs;
+
 static inline void intel_wa_list_free(struct i915_wa_list *wal)
 {
 	kfree(wal->list);
diff --git a/drivers/gpu/drm/i915/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
similarity index 100%
rename from drivers/gpu/drm/i915/intel_workarounds_types.h
rename to drivers/gpu/drm/i915/gt/intel_workarounds_types.h
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
similarity index 97%
rename from drivers/gpu/drm/i915/selftests/mock_engine.c
rename to drivers/gpu/drm/i915/gt/mock_engine.c
index 61a8206ed677..414afd2f27fe 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -22,8 +22,11 @@
  *
  */
 
+#include "i915_drv.h"
+#include "intel_context.h"
+
 #include "mock_engine.h"
-#include "mock_request.h"
+#include "selftests/mock_request.h"
 
 struct mock_ring {
 	struct intel_ring base;
@@ -268,8 +271,9 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	timer_setup(&engine->hw_delay, hw_delay_complete, 0);
 	INIT_LIST_HEAD(&engine->hw_queue);
 
-	if (pin_context(i915->kernel_context, &engine->base,
-			&engine->base.kernel_context))
+	engine->base.kernel_context =
+		intel_context_pin(i915->kernel_context, &engine->base);
+	if (IS_ERR(engine->base.kernel_context))
 		goto err_breadcrumbs;
 
 	return &engine->base;
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.h b/drivers/gpu/drm/i915/gt/mock_engine.h
similarity index 98%
rename from drivers/gpu/drm/i915/selftests/mock_engine.h
rename to drivers/gpu/drm/i915/gt/mock_engine.h
index b9cc3a245f16..44b35a85e9d1 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.h
+++ b/drivers/gpu/drm/i915/gt/mock_engine.h
@@ -29,7 +29,7 @@
 #include <linux/spinlock.h>
 #include <linux/timer.h>
 
-#include "../intel_ringbuffer.h"
+#include "gt/intel_engine.h"
 
 struct mock_engine {
 	struct intel_engine_cs base;
diff --git a/drivers/gpu/drm/i915/selftests/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
similarity index 100%
rename from drivers/gpu/drm/i915/selftests/intel_engine_cs.c
rename to drivers/gpu/drm/i915/gt/selftest_engine_cs.c
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
similarity index 99%
rename from drivers/gpu/drm/i915/selftests/intel_hangcheck.c
rename to drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 050bd1e19e02..87c26920212f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -24,14 +24,14 @@
 
 #include <linux/kthread.h>
 
-#include "../i915_selftest.h"
-#include "i915_random.h"
-#include "igt_flush_test.h"
-#include "igt_reset.h"
-#include "igt_wedge_me.h"
-
-#include "mock_context.h"
-#include "mock_drm.h"
+#include "i915_selftest.h"
+#include "selftests/i915_random.h"
+#include "selftests/igt_flush_test.h"
+#include "selftests/igt_reset.h"
+#include "selftests/igt_wedge_me.h"
+
+#include "selftests/mock_context.h"
+#include "selftests/mock_drm.h"
 
 #define IGT_IDLE_TIMEOUT 50 /* ms; time to wait after flushing between tests */
 
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
similarity index 99%
rename from drivers/gpu/drm/i915/selftests/intel_lrc.c
rename to drivers/gpu/drm/i915/gt/selftest_lrc.c
index fbee030db940..cd0551f97c2f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -6,15 +6,13 @@
 
 #include <linux/prime_numbers.h>
 
-#include "../i915_reset.h"
-
-#include "../i915_selftest.h"
-#include "igt_flush_test.h"
-#include "igt_live_test.h"
-#include "igt_spinner.h"
-#include "i915_random.h"
-
-#include "mock_context.h"
+#include "gt/intel_reset.h"
+#include "i915_selftest.h"
+#include "selftests/i915_random.h"
+#include "selftests/igt_flush_test.h"
+#include "selftests/igt_live_test.h"
+#include "selftests/igt_spinner.h"
+#include "selftests/mock_context.h"
 
 static int live_sanitycheck(void *arg)
 {
diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
similarity index 98%
rename from drivers/gpu/drm/i915/selftests/intel_workarounds.c
rename to drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 3baed59008d7..9b2fd24ea950 100644
--- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -4,15 +4,15 @@
  * Copyright © 2018 Intel Corporation
  */
 
-#include "../i915_selftest.h"
-#include "../i915_reset.h"
-
-#include "igt_flush_test.h"
-#include "igt_reset.h"
-#include "igt_spinner.h"
-#include "igt_wedge_me.h"
-#include "mock_context.h"
-#include "mock_drm.h"
+#include "i915_selftest.h"
+#include "intel_reset.h"
+
+#include "selftests/igt_flush_test.h"
+#include "selftests/igt_reset.h"
+#include "selftests/igt_spinner.h"
+#include "selftests/igt_wedge_me.h"
+#include "selftests/mock_context.h"
+#include "selftests/mock_drm.h"
 
 static const struct wo_register {
 	enum intel_platform platform;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 503d548a55f7..e9fadcb4d592 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -25,8 +25,9 @@
  *
  */
 
+#include "gt/intel_engine.h"
+
 #include "i915_drv.h"
-#include "intel_ringbuffer.h"
 
 /**
  * DOC: batch buffer command parser
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 527b995a4d7a..eac4095278b2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -32,7 +32,8 @@
 #include <drm/drm_debugfs.h>
 #include <drm/drm_fourcc.h>
 
-#include "i915_reset.h"
+#include "gt/intel_reset.h"
+
 #include "intel_dp.h"
 #include "intel_drv.h"
 #include "intel_fbc.h"
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1ad88e6d7c04..98b997526daa 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -47,10 +47,12 @@
 #include <drm/drm_probe_helper.h>
 #include <drm/i915_drm.h>
 
+#include "gt/intel_workarounds.h"
+#include "gt/intel_reset.h"
+
 #include "i915_drv.h"
 #include "i915_pmu.h"
 #include "i915_query.h"
-#include "i915_reset.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 #include "intel_audio.h"
@@ -62,7 +64,6 @@
 #include "intel_pm.h"
 #include "intel_sprite.h"
 #include "intel_uc.h"
-#include "intel_workarounds.h"
 
 static struct drm_driver driver;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6e46fbc7b350..312278723712 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -62,18 +62,19 @@
 #include "i915_reg.h"
 #include "i915_utils.h"
 
+#include "gt/intel_lrc.h"
+#include "gt/intel_engine.h"
+#include "gt/intel_workarounds.h"
+
 #include "intel_bios.h"
 #include "intel_device_info.h"
 #include "intel_display.h"
 #include "intel_dpll_mgr.h"
 #include "intel_frontbuffer.h"
-#include "intel_lrc.h"
 #include "intel_opregion.h"
-#include "intel_ringbuffer.h"
 #include "intel_uc.h"
 #include "intel_uncore.h"
 #include "intel_wopcm.h"
-#include "intel_workarounds.h"
 
 #include "i915_gem.h"
 #include "i915_gem_context.h"
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bf3d12f94365..0aa28d622d2a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -39,19 +39,20 @@
 #include <linux/dma-buf.h>
 #include <linux/mman.h>
 
+#include "gt/intel_mocs.h"
+#include "gt/intel_reset.h"
+#include "gt/intel_workarounds.h"
+
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gemfs.h"
 #include "i915_globals.h"
-#include "i915_reset.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
-#include "intel_mocs.h"
 #include "intel_pm.h"
-#include "intel_workarounds.h"
 
 static void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 0883a879d3e7..170500d49049 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,13 +86,16 @@
  */
 
 #include <linux/log2.h>
+
 #include <drm/i915_drm.h>
+
+#include "gt/intel_lrc_reg.h"
+#include "gt/intel_workarounds.h"
+
 #include "i915_drv.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
-#include "intel_lrc_reg.h"
-#include "intel_workarounds.h"
 
 #define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)
 #define I915_CONTEXT_PARAM_VM 0x9
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 23dcb01bfd82..cec278ab04e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -27,9 +27,10 @@
 
 #include "i915_gem_context_types.h"
 
+#include "gt/intel_context.h"
+
 #include "i915_gem.h"
 #include "i915_scheduler.h"
-#include "intel_context.h"
 #include "intel_device_info.h"
 
 struct drm_device;
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index e2ec58b10fb2..d282a6ab3b9f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -17,8 +17,9 @@
 #include <linux/rcupdate.h>
 #include <linux/types.h>
 
+#include "gt/intel_context_types.h"
+
 #include "i915_scheduler.h"
-#include "intel_context_types.h"
 
 struct pid;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 736c845eb77f..dde60a0f0eb6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -37,7 +37,6 @@
 
 #include "i915_drv.h"
 #include "i915_vgpu.h"
-#include "i915_reset.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f597f35b109b..c8d96e91f3dc 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -38,8 +38,8 @@
 #include <linux/mm.h>
 #include <linux/pagevec.h>
 
+#include "gt/intel_reset.h"
 #include "i915_request.h"
-#include "i915_reset.h"
 #include "i915_selftest.h"
 #include "i915_timeline.h"
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 5dc761e85d9d..b419d0f59275 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -13,8 +13,9 @@
 
 #include <drm/drm_mm.h>
 
+#include "gt/intel_engine.h"
+
 #include "intel_device_info.h"
-#include "intel_ringbuffer.h"
 #include "intel_uc_fw.h"
 
 #include "i915_gem.h"
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 56da457bed21..a87f790335c1 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -195,6 +195,8 @@
 #include <linux/sizes.h>
 #include <linux/uuid.h>
 
+#include "gt/intel_lrc_reg.h"
+
 #include "i915_drv.h"
 #include "i915_oa_hsw.h"
 #include "i915_oa_bdw.h"
@@ -210,7 +212,6 @@
 #include "i915_oa_cflgt3.h"
 #include "i915_oa_cnl.h"
 #include "i915_oa_icl.h"
-#include "intel_lrc_reg.h"
 
 /* HW requires this to be a power of two, between 128k and 16M, though driver
  * is currently generally designed assuming the largest 16M size is used such
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 46a52da3db29..35e502481f29 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -6,8 +6,10 @@
 
 #include <linux/irq.h>
 #include <linux/pm_runtime.h>
+
+#include "gt/intel_engine.h"
+
 #include "i915_pmu.h"
-#include "intel_ringbuffer.h"
 #include "i915_drv.h"
 
 /* Frequency for the sampling timer for events which need it. */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 96a9e8bcd805..f2b9e0e1910b 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -32,7 +32,6 @@
 #include "i915_active.h"
 #include "i915_drv.h"
 #include "i915_globals.h"
-#include "i915_reset.h"
 #include "intel_pm.h"
 
 struct execute_cb {
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
index f1af3916a808..166a457884b2 100644
--- a/drivers/gpu/drm/i915/i915_scheduler_types.h
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -9,8 +9,8 @@
 
 #include <linux/list.h>
 
+#include "gt/intel_engine_types.h"
 #include "i915_priolist_types.h"
-#include "intel_engine_types.h"
 
 struct drm_i915_private;
 struct i915_request;
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 12893304c8f8..b5286f3d8146 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -8,9 +8,10 @@
 
 #include <drm/drm_drv.h>
 
+#include "gt/intel_engine.h"
+
 #include "i915_drv.h"
 #include "intel_drv.h"
-#include "intel_ringbuffer.h"
 
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM i915
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 36726392e737..d4d308b6d1d8 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -22,11 +22,12 @@
  *
  */
 
+#include "gt/intel_engine.h"
+
 #include "i915_vma.h"
 
 #include "i915_drv.h"
 #include "i915_globals.h"
-#include "intel_ringbuffer.h"
 #include "intel_frontbuffer.h"
 
 #include <drm/drm_gem.h>
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 3045e0dee2a1..aa89a9adeffb 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -27,9 +27,11 @@
 
 #include <uapi/drm/i915_drm.h>
 
-#include "intel_engine_types.h"
+#include "gt/intel_engine_types.h"
+#include "gt/intel_context_types.h"
+#include "gt/intel_sseu.h"
+
 #include "intel_display.h"
-#include "intel_sseu.h"
 
 struct drm_printer;
 struct drm_i915_private;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index cb7f99618f02..fa7765af9c9d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -46,7 +46,6 @@
 
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
-#include "i915_reset.h"
 #include "i915_trace.h"
 #include "intel_atomic_plane.h"
 #include "intel_color.h"
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 42fcd622d7a3..4361d052a571 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -25,8 +25,9 @@
 #include <linux/circ_buf.h>
 #include <trace/events/dma_fence.h>
 
+#include "gt/intel_lrc_reg.h"
+
 #include "intel_guc_submission.h"
-#include "intel_lrc_reg.h"
 #include "i915_drv.h"
 
 #define GUC_PREEMPT_FINISHED		0x1
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.h b/drivers/gpu/drm/i915/intel_guc_submission.h
index aa5e6749c925..7d823a513b9c 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.h
+++ b/drivers/gpu/drm/i915/intel_guc_submission.h
@@ -27,9 +27,10 @@
 
 #include <linux/spinlock.h>
 
+#include "gt/intel_engine_types.h"
+
 #include "i915_gem.h"
 #include "i915_selftest.h"
-#include "intel_engine_types.h"
 
 struct drm_i915_private;
 
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 25b80ffe71ad..13f823ff8083 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -22,11 +22,11 @@
  *
  */
 
+#include "gt/intel_reset.h"
 #include "intel_uc.h"
 #include "intel_guc_submission.h"
 #include "intel_guc.h"
 #include "i915_drv.h"
-#include "i915_reset.h"
 
 static void guc_free_load_err_log(struct intel_guc *guc);
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index e1cb22f03e8e..6f52ca881173 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -24,8 +24,9 @@
 
 #include <linux/prime_numbers.h>
 
-#include "../i915_reset.h"
-#include "../i915_selftest.h"
+#include "gt/intel_reset.h"
+#include "i915_selftest.h"
+
 #include "i915_random.h"
 #include "igt_flush_test.h"
 #include "igt_live_test.h"
diff --git a/drivers/gpu/drm/i915/selftests/igt_reset.c b/drivers/gpu/drm/i915/selftests/igt_reset.c
index 208a966da8ca..4f31b137c428 100644
--- a/drivers/gpu/drm/i915/selftests/igt_reset.c
+++ b/drivers/gpu/drm/i915/selftests/igt_reset.c
@@ -6,8 +6,9 @@
 
 #include "igt_reset.h"
 
+#include "gt/intel_engine.h"
+
 #include "../i915_drv.h"
-#include "../intel_ringbuffer.h"
 
 void igt_global_reset_lock(struct drm_i915_private *i915)
 {
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.h b/drivers/gpu/drm/i915/selftests/igt_spinner.h
index 391777c76dc7..d312e7cdab68 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.h
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.h
@@ -9,9 +9,10 @@
 
 #include "../i915_selftest.h"
 
+#include "gt/intel_engine.h"
+
 #include "../i915_drv.h"
 #include "../i915_request.h"
-#include "../intel_ringbuffer.h"
 #include "../i915_gem_context.h"
 
 struct igt_spinner {
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 60bbf8b4df40..f444ee5add27 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -25,7 +25,8 @@
 #include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
 
-#include "mock_engine.h"
+#include "gt/mock_engine.h"
+
 #include "mock_context.h"
 #include "mock_request.h"
 #include "mock_gem_device.h"
diff --git a/drivers/gpu/drm/i915/selftests/mock_request.c b/drivers/gpu/drm/i915/selftests/mock_request.c
index d1a7c9608712..f739ba63057f 100644
--- a/drivers/gpu/drm/i915/selftests/mock_request.c
+++ b/drivers/gpu/drm/i915/selftests/mock_request.c
@@ -22,7 +22,8 @@
  *
  */
 
-#include "mock_engine.h"
+#include "gt/mock_engine.h"
+
 #include "mock_request.h"
 
 struct i915_request *
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 07/29] drm/i915: Only reset the pinned kernel contexts on resume
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (4 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 06/29] drm/i915: Move GraphicsTechnology files under gt/ Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10  9:39   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 08/29] drm/i915: Introduce struct intel_wakeref Chris Wilson
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

On resume, we know that the only pinned contexts in danger of seeing
corruption are the kernel contexts, and since we track those on each
engine we do not need to walk the list of all GEM contexts.
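
In outline, restating the diff below rather than adding new code, the
resume walk changes from every GEM context to just the per-engine
kernel contexts:

	/* before: intel_lr_context_resume() visited every context */
	list_for_each_entry(ctx, &i915->contexts.list, link)
		list_for_each_entry(ce, &ctx->active_engines, active_link)
			intel_ring_reset(ce->ring, 0);

	/* after: intel_gt_resume() pokes only the pinned kernel contexts */
	for_each_engine(engine, i915, id) {
		struct intel_context *ce = engine->kernel_context;

		ce->ops->reset(ce);
	}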

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
 drivers/gpu/drm/i915/gt/intel_engine.h        |  3 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 18 +++++++
 drivers/gpu/drm/i915/gt/intel_lrc.c           | 49 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_lrc.h           |  1 -
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    | 17 +++----
 drivers/gpu/drm/i915/i915_drv.h               |  1 -
 drivers/gpu/drm/i915/i915_gem.c               |  9 ++--
 8 files changed, 54 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 2109610e21e8..9ec4f787c908 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -25,6 +25,7 @@ struct intel_context_ops {
 	int (*pin)(struct intel_context *ce);
 	void (*unpin)(struct intel_context *ce);
 
+	void (*reset)(struct intel_context *ce);
 	void (*destroy)(struct kref *kref);
 };
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index e58d6f04177b..0dea6c7fd438 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -268,8 +268,6 @@ static inline void intel_ring_put(struct intel_ring *ring)
 void intel_engine_stop(struct intel_engine_cs *engine);
 void intel_engine_cleanup(struct intel_engine_cs *engine);
 
-void intel_legacy_submission_resume(struct drm_i915_private *dev_priv);
-
 int __must_check intel_ring_cacheline_align(struct i915_request *rq);
 
 u32 __must_check *intel_ring_begin(struct i915_request *rq, unsigned int n);
@@ -463,6 +461,7 @@ static inline void intel_engine_reset(struct intel_engine_cs *engine,
 }
 
 void intel_engines_sanitize(struct drm_i915_private *i915, bool force);
+void intel_gt_resume(struct drm_i915_private *i915);
 
 bool intel_engine_is_idle(struct intel_engine_cs *engine);
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 68eebf292314..fdfce11e4944 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -758,6 +758,24 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 	return ret;
 }
 
+void intel_gt_resume(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	/*
+	 * After resume, we may need to poke into the pinned kernel
+	 * contexts to paper over any damage caused by the sudden suspend.
+	 * Only the kernel contexts should remain pinned over suspend,
+	 * allowing us to fixup the user contexts on their first pin.
+	 */
+	for_each_engine(engine, i915, id) {
+		struct intel_context *ce = engine->kernel_context;
+
+		ce->ops->reset(ce);
+	}
+}
+
 /**
  * intel_engines_cleanup_common - cleans up the engine state created by
 *                                the common initializers.
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 85bdae1db76c..1565b11b15f6 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1378,9 +1378,33 @@ static int execlists_context_pin(struct intel_context *ce)
 	return __execlists_context_pin(ce, ce->engine);
 }
 
+static void execlists_context_reset(struct intel_context *ce)
+{
+	/*
+	 * Because we emit WA_TAIL_DWORDS there may be a disparity
+	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
+	 * that stored in context. As we only write new commands from
+	 * ce->ring->tail onwards, everything before that is junk. If the GPU
+	 * starts reading from its RING_HEAD from the context, it may try to
+	 * execute that junk and die.
+	 *
+	 * The contexts that are still pinned on resume belong to the
+	 * kernel, and are local to each engine. All other contexts will
+	 * have their head/tail sanitized upon pinning before use, so they
+	 * will never see garbage.
+	 *
+	 * So to avoid that we reset the context images upon resume. For
+	 * simplicity, we just zero everything out.
+	 */
+	intel_ring_reset(ce->ring, 0);
+	__execlists_update_reg_state(ce, ce->engine);
+}
+
 static const struct intel_context_ops execlists_context_ops = {
 	.pin = execlists_context_pin,
 	.unpin = execlists_context_unpin,
+
+	.reset = execlists_context_reset,
 	.destroy = execlists_context_destroy,
 };
 
@@ -2762,31 +2786,6 @@ static int execlists_context_deferred_alloc(struct intel_context *ce,
 	return ret;
 }
 
-void intel_lr_context_resume(struct drm_i915_private *i915)
-{
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
-
-	/*
-	 * Because we emit WA_TAIL_DWORDS there may be a disparity
-	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
-	 * that stored in context. As we only write new commands from
-	 * ce->ring->tail onwards, everything before that is junk. If the GPU
-	 * starts reading from its RING_HEAD from the context, it may try to
-	 * execute that junk and die.
-	 *
-	 * So to avoid that we reset the context images upon resume. For
-	 * simplicity, we just zero everything out.
-	 */
-	list_for_each_entry(ctx, &i915->contexts.list, link) {
-		list_for_each_entry(ce, &ctx->active_engines, active_link) {
-			GEM_BUG_ON(!ce->ring);
-			intel_ring_reset(ce->ring, 0);
-			__execlists_update_reg_state(ce, ce->engine);
-		}
-	}
-}
-
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index b3274e707423..a598f2d56de3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -100,7 +100,6 @@ struct drm_printer;
 
 struct drm_i915_private;
 
-void intel_lr_context_resume(struct drm_i915_private *dev_priv);
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
 
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index be83cb8167f3..bcc842b8bbe7 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -1507,9 +1507,16 @@ static int ring_context_pin(struct intel_context *ce)
 	return err;
 }
 
+static void ring_context_reset(struct intel_context *ce)
+{
+	intel_ring_reset(ce->ring, 0);
+}
+
 static const struct intel_context_ops ring_context_ops = {
 	.pin = ring_context_pin,
 	.unpin = ring_context_unpin,
+
+	.reset = ring_context_reset,
 	.destroy = ring_context_destroy,
 };
 
@@ -1580,16 +1587,6 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
 	kfree(engine);
 }
 
-void intel_legacy_submission_resume(struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	/* Restart from the beginning of the rings for convenience */
-	for_each_engine(engine, dev_priv, id)
-		intel_ring_reset(engine->buffer, 0);
-}
-
 static int load_pd_dir(struct i915_request *rq,
 		       const struct i915_hw_ppgtt *ppgtt)
 {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 312278723712..fad5306f07da 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1996,7 +1996,6 @@ struct drm_i915_private {
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
 	struct {
-		void (*resume)(struct drm_i915_private *);
 		void (*cleanup_engine)(struct intel_engine_cs *engine);
 
 		struct i915_gt_timelines {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0aa28d622d2a..a8b0aa05288b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4514,7 +4514,7 @@ void i915_gem_resume(struct drm_i915_private *i915)
 	 * guarantee that the context image is complete. So let's just reset
 	 * it and start again.
 	 */
-	i915->gt.resume(i915);
+	intel_gt_resume(i915);
 
 	if (i915_gem_init_hw(i915))
 		goto err_wedged;
@@ -4854,13 +4854,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
 	dev_priv->mm.unordered_timeline = dma_fence_context_alloc(1);
 
-	if (HAS_LOGICAL_RING_CONTEXTS(dev_priv)) {
-		dev_priv->gt.resume = intel_lr_context_resume;
+	if (HAS_LOGICAL_RING_CONTEXTS(dev_priv))
 		dev_priv->gt.cleanup_engine = intel_logical_ring_cleanup;
-	} else {
-		dev_priv->gt.resume = intel_legacy_submission_resume;
+	else
 		dev_priv->gt.cleanup_engine = intel_engine_cleanup;
-	}
 
 	i915_timelines_init(dev_priv);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 08/29] drm/i915: Introduce struct intel_wakeref
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (5 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 07/29] drm/i915: Only reset the pinned kernel contexts on resume Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10  9:49   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 09/29] drm/i915: Pull the GEM powermanagement coupling into its own file Chris Wilson
                   ` (24 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

For controlling runtime pm of the GT and engines, we would like a
callback that does extra work when the first wakeref is acquired and
when the last wakeref is dropped. These first/last transitions need
serialisation, so we pair a mutex with the regular intel_wakeref_t
tracker.
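
As a usage sketch of the new API (the foo structure and its wake/sleep
callbacks are hypothetical; only the intel_wakeref_* calls come from
this patch):

	struct foo {
		struct intel_wakeref_count wf;
	};

	static bool foo_wake(struct intel_wakeref_count *wc)
	{
		/* 0 -> 1 transition, run under wc->mutex; returning
		 * false aborts the get and releases the runtime pm
		 * wakeref taken for us
		 */
		return true; /* e.g. power up here */
	}

	static bool foo_sleep(struct intel_wakeref_count *wc)
	{
		/* 1 -> 0 transition, run under wc->mutex; returning
		 * true drops the runtime pm wakeref from the first get
		 */
		return true; /* e.g. park here */
	}

	intel_wakeref_init(&foo->wf);

	intel_wakeref_get_once(i915, &foo->wf, foo_wake);
	/* ... device held awake while the count stays elevated ... */
	intel_wakeref_put_once(i915, &foo->wf, foo_sleep);

Once the count is nonzero, intel_wakeref_get_once() takes only the
atomic_inc_not_zero() fast path and never touches the mutex.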

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile             |  1 +
 drivers/gpu/drm/i915/Makefile.header-test |  3 +-
 drivers/gpu/drm/i915/i915_drv.h           |  3 +-
 drivers/gpu/drm/i915/intel_wakeref.c      | 51 +++++++++++++++
 drivers/gpu/drm/i915/intel_wakeref.h      | 77 +++++++++++++++++++++++
 5 files changed, 132 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref.c
 create mode 100644 drivers/gpu/drm/i915/intel_wakeref.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 40130cf5c003..233bad5e361f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -50,6 +50,7 @@ i915-y += i915_drv.o \
 	  intel_device_info.o \
 	  intel_pm.o \
 	  intel_runtime_pm.o \
+	  intel_wakeref.o \
 	  intel_uncore.o
 
 # core library code
diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
index 96a5d90629ec..e6b3e7588860 100644
--- a/drivers/gpu/drm/i915/Makefile.header-test
+++ b/drivers/gpu/drm/i915/Makefile.header-test
@@ -31,7 +31,8 @@ header_test := \
 	intel_psr.h \
 	intel_sdvo.h \
 	intel_sprite.h \
-	intel_tv.h
+	intel_tv.h \
+	intel_wakeref.h
 
 quiet_cmd_header_test = HDRTEST $@
       cmd_header_test = echo "\#include \"$(<F)\"" > $@
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fad5306f07da..62a7e91acd7f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -74,6 +74,7 @@
 #include "intel_opregion.h"
 #include "intel_uc.h"
 #include "intel_uncore.h"
+#include "intel_wakeref.h"
 #include "intel_wopcm.h"
 
 #include "i915_gem.h"
@@ -134,8 +135,6 @@ bool i915_error_injected(void);
 	__i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
 		      fmt, ##__VA_ARGS__)
 
-typedef depot_stack_handle_t intel_wakeref_t;
-
 enum hpd_pin {
 	HPD_NONE = 0,
 	HPD_TV = HPD_NONE,     /* TV is known to be unreliable */
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
new file mode 100644
index 000000000000..c133c91cf277
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -0,0 +1,51 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "intel_drv.h"
+#include "intel_wakeref.h"
+
+void __intel_wakeref_get_once(struct drm_i915_private *i915,
+			      struct intel_wakeref_count *wc,
+			      bool (*fn)(struct intel_wakeref_count *wc))
+{
+	/*
+	 * Treat get/put as different subclasses, as we may need to run
+	 * the put callback from under the shrinker and do not want to
+	 * cross-contaminate that callback with any extra work performed
+	 * upon acquiring the wakeref.
+	 */
+	mutex_lock_nested(&wc->mutex, SINGLE_DEPTH_NESTING);
+	if (!atomic_read(&wc->count)) {
+		wc->wakeref = intel_runtime_pm_get(i915);
+		if (unlikely(!fn(wc))) {
+			intel_runtime_pm_put(i915, wc->wakeref);
+			mutex_unlock(&wc->mutex);
+			return;
+		}
+
+		smp_mb__before_atomic(); /* release wc->count */
+	}
+	atomic_inc(&wc->count);
+	mutex_unlock(&wc->mutex);
+}
+
+void __intel_wakeref_put_once(struct drm_i915_private *i915,
+			      struct intel_wakeref_count *wc,
+			      bool (*fn)(struct intel_wakeref_count *wc))
+{
+	if (likely(fn(wc)))
+		intel_runtime_pm_put(i915, wc->wakeref);
+	else
+		atomic_inc(&wc->count);
+	mutex_unlock(&wc->mutex);
+}
+
+void __intel_wakeref_init(struct intel_wakeref_count *wc,
+			  struct lock_class_key *key)
+{
+	__mutex_init(&wc->mutex, "wakeref", key);
+	atomic_set(&wc->count, 0);
+}
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
new file mode 100644
index 000000000000..cd4d8f57e502
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -0,0 +1,77 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef INTEL_WAKEREF_H
+#define INTEL_WAKEREF_H
+
+#include <linux/atomic.h>
+#include <linux/mutex.h>
+#include <linux/stackdepot.h>
+
+struct drm_i915_private;
+
+typedef depot_stack_handle_t intel_wakeref_t;
+
+struct intel_wakeref_count {
+	atomic_t count;
+	struct mutex mutex;
+	intel_wakeref_t wakeref;
+};
+
+void __intel_wakeref_init(struct intel_wakeref_count *wc,
+			  struct lock_class_key *key);
+#define intel_wakeref_init(wc) do {					\
+	static struct lock_class_key __key;				\
+									\
+	__intel_wakeref_init((wc), &__key);				\
+} while (0)
+
+void __intel_wakeref_get_once(struct drm_i915_private *i915,
+			      struct intel_wakeref_count *wc,
+			      bool (*fn)(struct intel_wakeref_count *wc));
+void __intel_wakeref_put_once(struct drm_i915_private *i915,
+			      struct intel_wakeref_count *wc,
+			      bool (*fn)(struct intel_wakeref_count *wc));
+
+static inline void
+intel_wakeref_get_once(struct drm_i915_private *i915,
+		       struct intel_wakeref_count *wc,
+		       bool (*fn)(struct intel_wakeref_count *wc))
+{
+	if (unlikely(!atomic_inc_not_zero(&wc->count)))
+		__intel_wakeref_get_once(i915, wc, fn);
+}
+
+static inline void
+intel_wakeref_put_once(struct drm_i915_private *i915,
+		       struct intel_wakeref_count *wc,
+		       bool (*fn)(struct intel_wakeref_count *wc))
+{
+	if (atomic_dec_and_mutex_lock(&wc->count, &wc->mutex))
+		__intel_wakeref_put_once(i915, wc, fn);
+}
+
+static inline void
+intel_wakeref_lock(struct intel_wakeref_count *wc)
+	__acquires(wc->mutex)
+{
+	mutex_lock(&wc->mutex);
+}
+
+static inline void
+intel_wakeref_unlock(struct intel_wakeref_count *wc)
+	__releases(wc->mutex)
+{
+	mutex_unlock(&wc->mutex);
+}
+
+static inline bool
+intel_wakeref_active(struct intel_wakeref_count *wc)
+{
+	return atomic_read(&wc->count);
+}
+
+#endif /* INTEL_WAKEREF_H */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 09/29] drm/i915: Pull the GEM powermanagement coupling into its own file
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (6 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 08/29] drm/i915: Introduce struct intel_wakeref Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 14:56   ` Tvrtko Ursulin
  2019-04-10  9:57   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit() Chris Wilson
                   ` (23 subsequent siblings)
  31 siblings, 2 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Split out the powermanagement portion (GT wakeref, suspend/resume) of
GEM from i915_gem.c into its own file.
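
For reference, the interface i915_gem_pm.h has to export, inferred
from the definitions added to i915_gem_pm.c below (a sketch, not the
verbatim header):

	struct drm_i915_private;

	void i915_gem_init__pm(struct drm_i915_private *i915);
	bool i915_gem_load_power_context(struct drm_i915_private *i915);

	void i915_gem_park(struct drm_i915_private *i915);
	void i915_gem_unpark(struct drm_i915_private *i915);

	void i915_gem_suspend(struct drm_i915_private *i915);
	void i915_gem_suspend_late(struct drm_i915_private *i915);
	void i915_gem_resume(struct drm_i915_private *i915);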

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/Makefile.header-test     |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c           |   4 +-
 drivers/gpu/drm/i915/i915_drv.h               |  12 +-
 drivers/gpu/drm/i915/i915_gem.c               | 363 +----------------
 drivers/gpu/drm/i915/i915_gem_pm.c            | 365 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_pm.h            |  28 ++
 .../gpu/drm/i915/selftests/i915_gem_context.c |   2 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  |   8 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  10 +-
 10 files changed, 418 insertions(+), 376 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 233bad5e361f..858642c7bc40 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -97,6 +97,7 @@ i915-y += \
 	  i915_gem_internal.o \
 	  i915_gem.o \
 	  i915_gem_object.o \
+	  i915_gem_pm.o \
 	  i915_gem_render_state.o \
 	  i915_gem_shrinker.o \
 	  i915_gem_stolen.o \
diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
index e6b3e7588860..702e3a7ade4c 100644
--- a/drivers/gpu/drm/i915/Makefile.header-test
+++ b/drivers/gpu/drm/i915/Makefile.header-test
@@ -5,6 +5,7 @@
 header_test := \
 	i915_active_types.h \
 	i915_gem_context_types.h \
+	i915_gem_pm.h \
 	i915_priolist_types.h \
 	i915_scheduler_types.h \
 	i915_timeline_types.h \
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index eac4095278b2..e3eca88614b2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3941,8 +3941,8 @@ i915_drop_caches_set(void *data, u64 val)
 	if (val & DROP_IDLE) {
 		do {
 			if (READ_ONCE(i915->gt.active_requests))
-				flush_delayed_work(&i915->gt.retire_work);
-			drain_delayed_work(&i915->gt.idle_work);
+				flush_delayed_work(&i915->gem.retire_work);
+			drain_delayed_work(&i915->gem.idle_work);
 		} while (READ_ONCE(i915->gt.awake));
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 62a7e91acd7f..cbae9be052e0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2020,6 +2020,12 @@ struct drm_i915_private {
 		 */
 		intel_wakeref_t awake;
 
+		ktime_t last_init_time;
+
+		struct i915_vma *scratch;
+	} gt;
+
+	struct {
 		/**
 		 * We leave the user IRQ off as much as possible,
 		 * but this means that requests will finish and never
@@ -2037,11 +2043,7 @@ struct drm_i915_private {
 		 * off the idle_work.
 		 */
 		struct delayed_work idle_work;
-
-		ktime_t last_init_time;
-
-		struct i915_vma *scratch;
-	} gt;
+	} gem;
 
 	/* For i945gm vblank irq vs. C3 workaround */
 	struct {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a8b0aa05288b..c56633e87c32 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -46,7 +46,7 @@
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gemfs.h"
-#include "i915_globals.h"
+#include "i915_gem_pm.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
@@ -103,105 +103,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
 	spin_unlock(&dev_priv->mm.object_stat_lock);
 }
 
-static void __i915_gem_park(struct drm_i915_private *i915)
-{
-	intel_wakeref_t wakeref;
-
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
-	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
-
-	if (!i915->gt.awake)
-		return;
-
-	/*
-	 * Be paranoid and flush a concurrent interrupt to make sure
-	 * we don't reactivate any irq tasklets after parking.
-	 *
-	 * FIXME: Note that even though we have waited for execlists to be idle,
-	 * there may still be an in-flight interrupt even though the CSB
-	 * is now empty. synchronize_irq() makes sure that a residual interrupt
-	 * is completed before we continue, but it doesn't prevent the HW from
-	 * raising a spurious interrupt later. To complete the shield we should
-	 * coordinate disabling the CS irq with flushing the interrupts.
-	 */
-	synchronize_irq(i915->drm.irq);
-
-	intel_engines_park(i915);
-	i915_timelines_park(i915);
-
-	i915_pmu_gt_parked(i915);
-	i915_vma_parked(i915);
-
-	wakeref = fetch_and_zero(&i915->gt.awake);
-	GEM_BUG_ON(!wakeref);
-
-	if (INTEL_GEN(i915) >= 6)
-		gen6_rps_idle(i915);
-
-	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
-
-	i915_globals_park();
-}
-
-void i915_gem_park(struct drm_i915_private *i915)
-{
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
-
-	if (!i915->gt.awake)
-		return;
-
-	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
-	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
-}
-
-void i915_gem_unpark(struct drm_i915_private *i915)
-{
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(!i915->gt.active_requests);
-	assert_rpm_wakelock_held(i915);
-
-	if (i915->gt.awake)
-		return;
-
-	/*
-	 * It seems that the DMC likes to transition between the DC states a lot
-	 * when there are no connected displays (no active power domains) during
-	 * command submission.
-	 *
-	 * This activity has negative impact on the performance of the chip with
-	 * huge latencies observed in the interrupt handler and elsewhere.
-	 *
-	 * Work around it by grabbing a GT IRQ power domain whilst there is any
-	 * GT activity, preventing any DC state transitions.
-	 */
-	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
-	GEM_BUG_ON(!i915->gt.awake);
-
-	i915_globals_unpark();
-
-	intel_enable_gt_powersave(i915);
-	i915_update_gfx_val(i915);
-	if (INTEL_GEN(i915) >= 6)
-		gen6_rps_busy(i915);
-	i915_pmu_gt_unparked(i915);
-
-	intel_engines_unpark(i915);
-
-	i915_queue_hangcheck(i915);
-
-	queue_delayed_work(i915->wq,
-			   &i915->gt.retire_work,
-			   round_jiffies_up_relative(HZ));
-}
-
 int
 i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
@@ -2088,7 +1989,7 @@ static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
 		if (!err)
 			break;
 
-	} while (flush_delayed_work(&dev_priv->gt.retire_work));
+	} while (flush_delayed_work(&dev_priv->gem.retire_work));
 
 	return err;
 }
@@ -2871,132 +2772,6 @@ i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-static void
-i915_gem_retire_work_handler(struct work_struct *work)
-{
-	struct drm_i915_private *dev_priv =
-		container_of(work, typeof(*dev_priv), gt.retire_work.work);
-	struct drm_device *dev = &dev_priv->drm;
-
-	/* Come back later if the device is busy... */
-	if (mutex_trylock(&dev->struct_mutex)) {
-		i915_retire_requests(dev_priv);
-		mutex_unlock(&dev->struct_mutex);
-	}
-
-	/*
-	 * Keep the retire handler running until we are finally idle.
-	 * We do not need to do this test under locking as in the worst-case
-	 * we queue the retire worker once too often.
-	 */
-	if (READ_ONCE(dev_priv->gt.awake))
-		queue_delayed_work(dev_priv->wq,
-				   &dev_priv->gt.retire_work,
-				   round_jiffies_up_relative(HZ));
-}
-
-static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
-					  unsigned long mask)
-{
-	bool result = true;
-
-	/*
-	 * Even if we fail to switch, give whatever is running a small chance
-	 * to save itself before we report the failure. Yes, this may be a
-	 * false positive due to e.g. ENOMEM, caveat emptor!
-	 */
-	if (i915_gem_switch_to_kernel_context(i915, mask))
-		result = false;
-
-	if (i915_gem_wait_for_idle(i915,
-				   I915_WAIT_LOCKED |
-				   I915_WAIT_FOR_IDLE_BOOST,
-				   I915_GEM_IDLE_TIMEOUT))
-		result = false;
-
-	if (!result) {
-		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
-			dev_err(i915->drm.dev,
-				"Failed to idle engines, declaring wedged!\n");
-			GEM_TRACE_DUMP();
-		}
-
-		/* Forcibly cancel outstanding work and leave the gpu quiet. */
-		i915_gem_set_wedged(i915);
-	}
-
-	i915_retire_requests(i915); /* ensure we flush after wedging */
-	return result;
-}
-
-static bool load_power_context(struct drm_i915_private *i915)
-{
-	/* Force loading the kernel context on all engines */
-	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
-		return false;
-
-	/*
-	 * Immediately park the GPU so that we enable powersaving and
-	 * treat it as idle. The next time we issue a request, we will
-	 * unpark and start using the engine->pinned_default_state, otherwise
-	 * it is in limbo and an early reset may fail.
-	 */
-	__i915_gem_park(i915);
-
-	return true;
-}
-
-static void
-i915_gem_idle_work_handler(struct work_struct *work)
-{
-	struct drm_i915_private *i915 =
-		container_of(work, typeof(*i915), gt.idle_work.work);
-	bool rearm_hangcheck;
-
-	if (!READ_ONCE(i915->gt.awake))
-		return;
-
-	if (READ_ONCE(i915->gt.active_requests))
-		return;
-
-	rearm_hangcheck =
-		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
-
-	if (!mutex_trylock(&i915->drm.struct_mutex)) {
-		/* Currently busy, come back later */
-		mod_delayed_work(i915->wq,
-				 &i915->gt.idle_work,
-				 msecs_to_jiffies(50));
-		goto out_rearm;
-	}
-
-	/*
-	 * Flush out the last user context, leaving only the pinned
-	 * kernel context resident. Should anything unfortunate happen
-	 * while we are idle (such as the GPU being power cycled), no users
-	 * will be harmed.
-	 */
-	if (!work_pending(&i915->gt.idle_work.work) &&
-	    !i915->gt.active_requests) {
-		++i915->gt.active_requests; /* don't requeue idle */
-
-		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
-
-		if (!--i915->gt.active_requests) {
-			__i915_gem_park(i915);
-			rearm_hangcheck = false;
-		}
-	}
-
-	mutex_unlock(&i915->drm.struct_mutex);
-
-out_rearm:
-	if (rearm_hangcheck) {
-		GEM_BUG_ON(!i915->gt.awake);
-		i915_queue_hangcheck(i915);
-	}
-}
-
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
 {
 	struct drm_i915_private *i915 = to_i915(gem->dev);
@@ -4412,133 +4187,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
 	mutex_unlock(&i915->drm.struct_mutex);
 }
 
-void i915_gem_suspend(struct drm_i915_private *i915)
-{
-	intel_wakeref_t wakeref;
-
-	GEM_TRACE("\n");
-
-	wakeref = intel_runtime_pm_get(i915);
-
-	flush_workqueue(i915->wq);
-
-	mutex_lock(&i915->drm.struct_mutex);
-
-	/*
-	 * We have to flush all the executing contexts to main memory so
-	 * that they can saved in the hibernation image. To ensure the last
-	 * context image is coherent, we have to switch away from it. That
-	 * leaves the i915->kernel_context still active when
-	 * we actually suspend, and its image in memory may not match the GPU
-	 * state. Fortunately, the kernel_context is disposable and we do
-	 * not rely on its state.
-	 */
-	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
-
-	mutex_unlock(&i915->drm.struct_mutex);
-	i915_reset_flush(i915);
-
-	drain_delayed_work(&i915->gt.retire_work);
-
-	/*
-	 * As the idle_work is rearming if it detects a race, play safe and
-	 * repeat the flush until it is definitely idle.
-	 */
-	drain_delayed_work(&i915->gt.idle_work);
-
-	/*
-	 * Assert that we successfully flushed all the work and
-	 * reset the GPU back to its idle, low power state.
-	 */
-	GEM_BUG_ON(i915->gt.awake);
-
-	intel_uc_suspend(i915);
-
-	intel_runtime_pm_put(i915, wakeref);
-}
-
-void i915_gem_suspend_late(struct drm_i915_private *i915)
-{
-	struct drm_i915_gem_object *obj;
-	struct list_head *phases[] = {
-		&i915->mm.unbound_list,
-		&i915->mm.bound_list,
-		NULL
-	}, **phase;
-
-	/*
-	 * Neither the BIOS, ourselves or any other kernel
-	 * expects the system to be in execlists mode on startup,
-	 * so we need to reset the GPU back to legacy mode. And the only
-	 * known way to disable logical contexts is through a GPU reset.
-	 *
-	 * So in order to leave the system in a known default configuration,
-	 * always reset the GPU upon unload and suspend. Afterwards we then
-	 * clean up the GEM state tracking, flushing off the requests and
-	 * leaving the system in a known idle state.
-	 *
-	 * Note that is of the upmost importance that the GPU is idle and
-	 * all stray writes are flushed *before* we dismantle the backing
-	 * storage for the pinned objects.
-	 *
-	 * However, since we are uncertain that resetting the GPU on older
-	 * machines is a good idea, we don't - just in case it leaves the
-	 * machine in an unusable condition.
-	 */
-
-	mutex_lock(&i915->drm.struct_mutex);
-	for (phase = phases; *phase; phase++) {
-		list_for_each_entry(obj, *phase, mm.link)
-			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
-	}
-	mutex_unlock(&i915->drm.struct_mutex);
-
-	intel_uc_sanitize(i915);
-	i915_gem_sanitize(i915);
-}
-
-void i915_gem_resume(struct drm_i915_private *i915)
-{
-	GEM_TRACE("\n");
-
-	WARN_ON(i915->gt.awake);
-
-	mutex_lock(&i915->drm.struct_mutex);
-	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
-
-	i915_gem_restore_gtt_mappings(i915);
-	i915_gem_restore_fences(i915);
-
-	/*
-	 * As we didn't flush the kernel context before suspend, we cannot
-	 * guarantee that the context image is complete. So let's just reset
-	 * it and start again.
-	 */
-	intel_gt_resume(i915);
-
-	if (i915_gem_init_hw(i915))
-		goto err_wedged;
-
-	intel_uc_resume(i915);
-
-	/* Always reload a context for powersaving. */
-	if (!load_power_context(i915))
-		goto err_wedged;
-
-out_unlock:
-	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
-	mutex_unlock(&i915->drm.struct_mutex);
-	return;
-
-err_wedged:
-	if (!i915_reset_failed(i915)) {
-		dev_err(i915->drm.dev,
-			"Failed to re-initialize GPU, declaring it wedged!\n");
-		i915_gem_set_wedged(i915);
-	}
-	goto out_unlock;
-}
-
 void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
 {
 	if (INTEL_GEN(dev_priv) < 5 ||
@@ -4721,7 +4369,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 	/* Flush the default context image to memory, and enable powersaving. */
-	if (!load_power_context(i915)) {
+	if (!i915_gem_load_power_context(i915)) {
 		err = -EIO;
 		goto err_active;
 	}
@@ -5115,11 +4763,8 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
 
 	i915_gem_init__mm(dev_priv);
+	i915_gem_init__pm(dev_priv);
 
-	INIT_DELAYED_WORK(&dev_priv->gt.retire_work,
-			  i915_gem_retire_work_handler);
-	INIT_DELAYED_WORK(&dev_priv->gt.idle_work,
-			  i915_gem_idle_work_handler);
 	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
 	mutex_init(&dev_priv->gpu_error.wedge_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
new file mode 100644
index 000000000000..bc0663cac7b6
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -0,0 +1,365 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "i915_gem_pm.h"
+#include "i915_globals.h"
+#include "intel_pm.h"
+
+static void __i915_gem_park(struct drm_i915_private *i915)
+{
+	intel_wakeref_t wakeref;
+
+	GEM_TRACE("\n");
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(i915->gt.active_requests);
+	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
+
+	if (!i915->gt.awake)
+		return;
+
+	/*
+	 * Be paranoid and flush a concurrent interrupt to make sure
+	 * we don't reactivate any irq tasklets after parking.
+	 *
+	 * FIXME: Note that even though we have waited for execlists to be idle,
+	 * there may still be an in-flight interrupt even though the CSB
+	 * is now empty. synchronize_irq() makes sure that a residual interrupt
+	 * is completed before we continue, but it doesn't prevent the HW from
+	 * raising a spurious interrupt later. To complete the shield we should
+	 * coordinate disabling the CS irq with flushing the interrupts.
+	 */
+	synchronize_irq(i915->drm.irq);
+
+	intel_engines_park(i915);
+	i915_timelines_park(i915);
+
+	i915_pmu_gt_parked(i915);
+	i915_vma_parked(i915);
+
+	wakeref = fetch_and_zero(&i915->gt.awake);
+	GEM_BUG_ON(!wakeref);
+
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_idle(i915);
+
+	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
+
+	i915_globals_park();
+}
+
+static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
+					  unsigned long mask)
+{
+	bool result = true;
+
+	/*
+	 * Even if we fail to switch, give whatever is running a small chance
+	 * to save itself before we report the failure. Yes, this may be a
+	 * false positive due to e.g. ENOMEM, caveat emptor!
+	 */
+	if (i915_gem_switch_to_kernel_context(i915, mask))
+		result = false;
+
+	if (i915_gem_wait_for_idle(i915,
+				   I915_WAIT_LOCKED |
+				   I915_WAIT_FOR_IDLE_BOOST,
+				   I915_GEM_IDLE_TIMEOUT))
+		result = false;
+
+	if (!result) {
+		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
+			dev_err(i915->drm.dev,
+				"Failed to idle engines, declaring wedged!\n");
+			GEM_TRACE_DUMP();
+		}
+
+		/* Forcibly cancel outstanding work and leave the gpu quiet. */
+		i915_gem_set_wedged(i915);
+	}
+
+	i915_retire_requests(i915); /* ensure we flush after wedging */
+	return result;
+}
+
+static void idle_work_handler(struct work_struct *work)
+{
+	struct drm_i915_private *i915 =
+		container_of(work, typeof(*i915), gem.idle_work.work);
+	bool rearm_hangcheck;
+
+	if (!READ_ONCE(i915->gt.awake))
+		return;
+
+	if (READ_ONCE(i915->gt.active_requests))
+		return;
+
+	rearm_hangcheck =
+		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
+
+	if (!mutex_trylock(&i915->drm.struct_mutex)) {
+		/* Currently busy, come back later */
+		mod_delayed_work(i915->wq,
+				 &i915->gem.idle_work,
+				 msecs_to_jiffies(50));
+		goto out_rearm;
+	}
+
+	/*
+	 * Flush out the last user context, leaving only the pinned
+	 * kernel context resident. Should anything unfortunate happen
+	 * while we are idle (such as the GPU being power cycled), no users
+	 * will be harmed.
+	 */
+	if (!work_pending(&i915->gem.idle_work.work) &&
+	    !i915->gt.active_requests) {
+		++i915->gt.active_requests; /* don't requeue idle */
+
+		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
+
+		if (!--i915->gt.active_requests) {
+			__i915_gem_park(i915);
+			rearm_hangcheck = false;
+		}
+	}
+
+	mutex_unlock(&i915->drm.struct_mutex);
+
+out_rearm:
+	if (rearm_hangcheck) {
+		GEM_BUG_ON(!i915->gt.awake);
+		i915_queue_hangcheck(i915);
+	}
+}
+
+static void retire_work_handler(struct work_struct *work)
+{
+	struct drm_i915_private *i915 =
+		container_of(work, typeof(*i915), gem.retire_work.work);
+
+	/* Come back later if the device is busy... */
+	if (mutex_trylock(&i915->drm.struct_mutex)) {
+		i915_retire_requests(i915);
+		mutex_unlock(&i915->drm.struct_mutex);
+	}
+
+	/*
+	 * Keep the retire handler running until we are finally idle.
+	 * We do not need to do this test under locking as in the worst-case
+	 * we queue the retire worker once too often.
+	 */
+	if (READ_ONCE(i915->gt.awake))
+		queue_delayed_work(i915->wq,
+				   &i915->gem.retire_work,
+				   round_jiffies_up_relative(HZ));
+}
+
+void i915_gem_park(struct drm_i915_private *i915)
+{
+	GEM_TRACE("\n");
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(i915->gt.active_requests);
+
+	if (!i915->gt.awake)
+		return;
+
+	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
+	mod_delayed_work(i915->wq, &i915->gem.idle_work, msecs_to_jiffies(100));
+}
+
+void i915_gem_unpark(struct drm_i915_private *i915)
+{
+	GEM_TRACE("\n");
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(!i915->gt.active_requests);
+	assert_rpm_wakelock_held(i915);
+
+	if (i915->gt.awake)
+		return;
+
+	/*
+	 * It seems that the DMC likes to transition between the DC states a lot
+	 * when there are no connected displays (no active power domains) during
+	 * command submission.
+	 *
+	 * This activity has negative impact on the performance of the chip with
+	 * huge latencies observed in the interrupt handler and elsewhere.
+	 *
+	 * Work around it by grabbing a GT IRQ power domain whilst there is any
+	 * GT activity, preventing any DC state transitions.
+	 */
+	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+	GEM_BUG_ON(!i915->gt.awake);
+
+	i915_globals_unpark();
+
+	intel_enable_gt_powersave(i915);
+	i915_update_gfx_val(i915);
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_busy(i915);
+	i915_pmu_gt_unparked(i915);
+
+	intel_engines_unpark(i915);
+
+	i915_queue_hangcheck(i915);
+
+	queue_delayed_work(i915->wq,
+			   &i915->gem.retire_work,
+			   round_jiffies_up_relative(HZ));
+}
+
+bool i915_gem_load_power_context(struct drm_i915_private *i915)
+{
+	/* Force loading the kernel context on all engines */
+	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
+		return false;
+
+	/*
+	 * Immediately park the GPU so that we enable powersaving and
+	 * treat it as idle. The next time we issue a request, we will
+	 * unpark and start using the engine->pinned_default_state, otherwise
+	 * it is in limbo and an early reset may fail.
+	 */
+	__i915_gem_park(i915);
+
+	return true;
+}
+
+void i915_gem_suspend(struct drm_i915_private *i915)
+{
+	intel_wakeref_t wakeref;
+
+	GEM_TRACE("\n");
+
+	wakeref = intel_runtime_pm_get(i915);
+
+	flush_workqueue(i915->wq);
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	/*
+	 * We have to flush all the executing contexts to main memory so
+	 * that they can be saved in the hibernation image. To ensure the last
+	 * context image is coherent, we have to switch away from it. That
+	 * leaves the i915->kernel_context still active when
+	 * we actually suspend, and its image in memory may not match the GPU
+	 * state. Fortunately, the kernel_context is disposable and we do
+	 * not rely on its state.
+	 */
+	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
+
+	mutex_unlock(&i915->drm.struct_mutex);
+	i915_reset_flush(i915);
+
+	drain_delayed_work(&i915->gem.retire_work);
+
+	/*
+	 * As the idle_work will re-arm itself if it detects a race, play safe and
+	 * repeat the flush until it is definitely idle.
+	 */
+	drain_delayed_work(&i915->gem.idle_work);
+
+	/*
+	 * Assert that we successfully flushed all the work and
+	 * reset the GPU back to its idle, low power state.
+	 */
+	GEM_BUG_ON(i915->gt.awake);
+
+	intel_uc_suspend(i915);
+
+	intel_runtime_pm_put(i915, wakeref);
+}
+
+void i915_gem_suspend_late(struct drm_i915_private *i915)
+{
+	struct drm_i915_gem_object *obj;
+	struct list_head *phases[] = {
+		&i915->mm.unbound_list,
+		&i915->mm.bound_list,
+		NULL
+	}, **phase;
+
+	/*
+	 * Neither the BIOS, ourselves nor any other kernel
+	 * expects the system to be in execlists mode on startup,
+	 * so we need to reset the GPU back to legacy mode. And the only
+	 * known way to disable logical contexts is through a GPU reset.
+	 *
+	 * So in order to leave the system in a known default configuration,
+	 * always reset the GPU upon unload and suspend. Afterwards we
+	 * clean up the GEM state tracking, flushing off the requests and
+	 * leaving the system in a known idle state.
+	 *
+	 * Note that it is of the utmost importance that the GPU is idle and
+	 * all stray writes are flushed *before* we dismantle the backing
+	 * storage for the pinned objects.
+	 *
+	 * However, since we are uncertain that resetting the GPU on older
+	 * machines is a good idea, we don't - just in case it leaves the
+	 * machine in an unusable condition.
+	 */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	for (phase = phases; *phase; phase++) {
+		list_for_each_entry(obj, *phase, mm.link)
+			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
+	}
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	intel_uc_sanitize(i915);
+	i915_gem_sanitize(i915);
+}
+
+void i915_gem_resume(struct drm_i915_private *i915)
+{
+	GEM_TRACE("\n");
+
+	WARN_ON(i915->gt.awake);
+
+	mutex_lock(&i915->drm.struct_mutex);
+	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
+
+	i915_gem_restore_gtt_mappings(i915);
+	i915_gem_restore_fences(i915);
+
+	/*
+	 * As we didn't flush the kernel context before suspend, we cannot
+	 * guarantee that the context image is complete. So let's just reset
+	 * it and start again.
+	 */
+	intel_gt_resume(i915);
+
+	if (i915_gem_init_hw(i915))
+		goto err_wedged;
+
+	intel_uc_resume(i915);
+
+	/* Always reload a context for powersaving. */
+	if (!i915_gem_load_power_context(i915))
+		goto err_wedged;
+
+out_unlock:
+	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return;
+
+err_wedged:
+	if (!i915_reset_failed(i915)) {
+		dev_err(i915->drm.dev,
+			"Failed to re-initialize GPU, declaring it wedged!\n");
+		i915_gem_set_wedged(i915);
+	}
+	goto out_unlock;
+}
+
+void i915_gem_init__pm(struct drm_i915_private *i915)
+{
+	INIT_DELAYED_WORK(&i915->gem.idle_work, idle_work_handler);
+	INIT_DELAYED_WORK(&i915->gem.retire_work, retire_work_handler);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
new file mode 100644
index 000000000000..52f65e3f06b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_pm.h
@@ -0,0 +1,28 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_GEM_PM_H__
+#define __I915_GEM_PM_H__
+
+#include <linux/types.h>
+
+struct drm_i915_private;
+struct work_struct;
+
+void i915_gem_init__pm(struct drm_i915_private *i915);
+
+bool i915_gem_load_power_context(struct drm_i915_private *i915);
+void i915_gem_resume(struct drm_i915_private *i915);
+
+void i915_gem_unpark(struct drm_i915_private *i915);
+void i915_gem_park(struct drm_i915_private *i915);
+
+void i915_gem_idle_work_handler(struct work_struct *work);
+
+void i915_gem_suspend(struct drm_i915_private *i915);
+void i915_gem_suspend_late(struct drm_i915_private *i915);
+
+#endif /* __I915_GEM_PM_H__ */
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 6f52ca881173..9d646fa1b74e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1658,7 +1658,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 		/* XXX Bonus points for proving we are the kernel context! */
 
 		mutex_unlock(&i915->drm.struct_mutex);
-		drain_delayed_work(&i915->gt.idle_work);
+		drain_delayed_work(&i915->gem.idle_work);
 		mutex_lock(&i915->drm.struct_mutex);
 	}
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 971148fbe6f5..12fc53c694a6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -514,8 +514,8 @@ static void disable_retire_worker(struct drm_i915_private *i915)
 	}
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	cancel_delayed_work_sync(&i915->gt.retire_work);
-	cancel_delayed_work_sync(&i915->gt.idle_work);
+	cancel_delayed_work_sync(&i915->gem.retire_work);
+	cancel_delayed_work_sync(&i915->gem.idle_work);
 }
 
 static int igt_mmap_offset_exhaustion(void *arg)
@@ -617,9 +617,9 @@ static int igt_mmap_offset_exhaustion(void *arg)
 out_park:
 	mutex_lock(&i915->drm.struct_mutex);
 	if (--i915->gt.active_requests)
-		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
+		queue_delayed_work(i915->wq, &i915->gem.retire_work, 0);
 	else
-		queue_delayed_work(i915->wq, &i915->gt.idle_work, 0);
+		queue_delayed_work(i915->wq, &i915->gem.idle_work, 0);
 	mutex_unlock(&i915->drm.struct_mutex);
 	i915_gem_shrinker_register(i915);
 	return err;
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index f444ee5add27..fb677b4019a0 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -59,8 +59,8 @@ static void mock_device_release(struct drm_device *dev)
 	i915_gem_contexts_lost(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	drain_delayed_work(&i915->gt.retire_work);
-	drain_delayed_work(&i915->gt.idle_work);
+	drain_delayed_work(&i915->gem.retire_work);
+	drain_delayed_work(&i915->gem.idle_work);
 	i915_gem_drain_workqueue(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
@@ -111,7 +111,7 @@ static void mock_retire_work_handler(struct work_struct *work)
 static void mock_idle_work_handler(struct work_struct *work)
 {
 	struct drm_i915_private *i915 =
-		container_of(work, typeof(*i915), gt.idle_work.work);
+		container_of(work, typeof(*i915), gem.idle_work.work);
 
 	i915->gt.active_engines = 0;
 }
@@ -197,8 +197,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mock_init_contexts(i915);
 
-	INIT_DELAYED_WORK(&i915->gt.retire_work, mock_retire_work_handler);
-	INIT_DELAYED_WORK(&i915->gt.idle_work, mock_idle_work_handler);
+	INIT_DELAYED_WORK(&i915->gem.retire_work, mock_retire_work_handler);
+	INIT_DELAYED_WORK(&i915->gem.idle_work, mock_idle_work_handler);
 
 	i915->gt.awake = true;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit()
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (7 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 09/29] drm/i915: Pull the GEM powermangement coupling into its own file Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 10:05   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 11/29] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
                   ` (22 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

We wish to start segregating the power management into different control
domains, both with respect to the hardware and the user interface. The
first step is that, at the lowest level of the request flow, we want to
process a context event (and not a global GEM operation). In this patch,
we introduce the context callbacks that in future patches will be
redirected to per-engine interfaces leading to global operations as
required.

The intent is that this will be guarded by the timeline->mutex, except
that retiring has not quite finished transitioning over from being
guarded by struct_mutex. So at the moment it is protected by
struct_mutex with a reminder to switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c       | 17 ++++++++++++++
 drivers/gpu/drm/i915/gt/intel_context.h       | 20 +++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  5 +++++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  3 +++
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |  3 +++
 drivers/gpu/drm/i915/gt/mock_engine.c         |  3 +++
 drivers/gpu/drm/i915/i915_request.c           | 22 ++++---------------
 7 files changed, 55 insertions(+), 18 deletions(-)
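
Not part of the patch, purely illustrative: a tiny userspace model of the
active_count scheme introduced below, showing that ops->enter() and
ops->exit() fire only on the 0 -> 1 and 1 -> 0 transitions. All names in
this sketch are invented stand-ins for intel_context/intel_context_ops.

#include <assert.h>
#include <stdio.h>

struct ctx_ops {
	void (*enter)(void); /* first user: e.g. wake the GT */
	void (*exit)(void);  /* last user: e.g. park the GT */
};

struct ctx {
	unsigned int active_count; /* notionally timeline->mutex */
	const struct ctx_ops *ops;
};

static void ctx_enter(struct ctx *ce)
{
	if (!ce->active_count++) /* only the first user... */
		ce->ops->enter();
}

static void ctx_exit(struct ctx *ce)
{
	assert(ce->active_count);
	if (!--ce->active_count) /* ...and the last user hit the ops */
		ce->ops->exit();
}

static void unpark(void) { puts("enter: unpark"); }
static void park(void) { puts("exit: park"); }

int main(void)
{
	static const struct ctx_ops ops = { unpark, park };
	struct ctx ce = { 0, &ops };

	ctx_enter(&ce); /* prints "enter: unpark" */
	ctx_enter(&ce); /* nested use, no callback */
	ctx_exit(&ce);
	ctx_exit(&ce);  /* prints "exit: park" */
	return 0;
}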

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index ebd1e5919a4a..bf16f00a10a4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -266,3 +266,20 @@ int __init i915_global_context_init(void)
 	i915_global_register(&global.base);
 	return 0;
 }
+
+void __intel_context_enter(struct intel_context *ce)
+{
+	struct drm_i915_private *i915 = ce->gem_context->i915;
+
+	if (!i915->gt.active_requests++)
+		i915_gem_unpark(i915);
+}
+
+void __intel_context_exit(struct intel_context *ce)
+{
+	struct drm_i915_private *i915 = ce->gem_context->i915;
+
+	GEM_BUG_ON(!i915->gt.active_requests);
+	if (!--i915->gt.active_requests)
+		i915_gem_park(i915);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index ebc861b1a49e..95b1fdc5826a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -73,6 +73,26 @@ static inline void __intel_context_pin(struct intel_context *ce)
 
 void intel_context_unpin(struct intel_context *ce);
 
+void __intel_context_enter(struct intel_context *ce);
+void __intel_context_exit(struct intel_context *ce);
+
+static inline void intel_context_enter(struct intel_context *ce)
+{
+	if (!ce->active_count++)
+		ce->ops->enter(ce);
+}
+
+static inline void intel_context_mark_active(struct intel_context *ce)
+{
+	++ce->active_count;
+}
+
+static inline void intel_context_exit(struct intel_context *ce)
+{
+	if (!--ce->active_count)
+		ce->ops->exit(ce);
+}
+
 static inline struct intel_context *intel_context_get(struct intel_context *ce)
 {
 	kref_get(&ce->ref);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 9ec4f787c908..f02d27734e3b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -25,6 +25,9 @@ struct intel_context_ops {
 	int (*pin)(struct intel_context *ce);
 	void (*unpin)(struct intel_context *ce);
 
+	void (*enter)(struct intel_context *ce);
+	void (*exit)(struct intel_context *ce);
+
 	void (*reset)(struct intel_context *ce);
 	void (*destroy)(struct kref *kref);
 };
@@ -46,6 +49,8 @@ struct intel_context {
 	u32 *lrc_reg_state;
 	u64 lrc_desc;
 
+	unsigned int active_count; /* notionally protected by timeline->mutex */
+
 	atomic_t pin_count;
 	struct mutex pin_mutex; /* guards pinning and associated on-gpuing */
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1565b11b15f6..7da5e49dd385 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1404,6 +1404,9 @@ static const struct intel_context_ops execlists_context_ops = {
 	.pin = execlists_context_pin,
 	.unpin = execlists_context_unpin,
 
+	.enter = __intel_context_enter,
+	.exit = __intel_context_exit,
+
 	.reset = execlists_context_reset,
 	.destroy = execlists_context_destroy,
 };
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index bcc842b8bbe7..5788e5ab818f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -1516,6 +1516,9 @@ static const struct intel_context_ops ring_context_ops = {
 	.pin = ring_context_pin,
 	.unpin = ring_context_unpin,
 
+	.enter = __intel_context_enter,
+	.exit = __intel_context_exit,
+
 	.reset = ring_context_reset,
 	.destroy = ring_context_destroy,
 };
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 414afd2f27fe..2e84ffb46489 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -157,6 +157,9 @@ static const struct intel_context_ops mock_context_ops = {
 	.pin = mock_context_pin,
 	.unpin = mock_context_unpin,
 
+	.enter = __intel_context_enter,
+	.exit = __intel_context_exit,
+
 	.destroy = mock_context_destroy,
 };
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f2b9e0e1910b..e72e4087bbfe 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -130,19 +130,6 @@ i915_request_remove_from_client(struct i915_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static void reserve_gt(struct drm_i915_private *i915)
-{
-	if (!i915->gt.active_requests++)
-		i915_gem_unpark(i915);
-}
-
-static void unreserve_gt(struct drm_i915_private *i915)
-{
-	GEM_BUG_ON(!i915->gt.active_requests);
-	if (!--i915->gt.active_requests)
-		i915_gem_park(i915);
-}
-
 static void advance_ring(struct i915_request *request)
 {
 	struct intel_ring *ring = request->ring;
@@ -300,11 +287,10 @@ static void i915_request_retire(struct i915_request *request)
 
 	i915_request_remove_from_client(request);
 
-	intel_context_unpin(request->hw_context);
-
 	__retire_engine_upto(request->engine, request);
 
-	unreserve_gt(request->i915);
+	intel_context_unpin(request->hw_context);
+	intel_context_exit(request->hw_context);
 
 	i915_sched_node_fini(&request->sched);
 	i915_request_put(request);
@@ -628,8 +614,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
-	reserve_gt(i915);
 	mutex_lock(&ce->ring->timeline->mutex);
+	intel_context_enter(ce);
 
 	/* Move our oldest request to the slab-cache (if not in use!) */
 	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
@@ -759,8 +745,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 err_free:
 	kmem_cache_free(global.slab_requests, rq);
 err_unreserve:
+	intel_context_exit(ce);
 	mutex_unlock(&ce->ring->timeline->mutex);
-	unreserve_gt(i915);
 	intel_context_unpin(ce);
 	return ERR_PTR(ret);
 }
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 11/29] drm/i915: Pass intel_context to i915_request_create()
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (8 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit() Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 10:38   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 12/29] drm/i915: Invert the GEM wakeref hierarchy Chris Wilson
                   ` (21 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Start acquiring the logical intel_context and using that as our primary
means for request allocation. This is the initial step to allow us to
avoid requiring struct_mutex for request allocation along the
perma-pinned kernel context, but it also provides a foundation for
breaking up the complex request allocation to handle different scenarios
inside execbuf.

For the purpose of emitting a request from inside retirement (see the
next patch for engine power management), we also need to lift control
over the timeline mutex to the caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.h       |  12 +
 drivers/gpu/drm/i915/gt/intel_reset.c         |   2 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |   3 -
 drivers/gpu/drm/i915/i915_gem_context.c       |   4 +-
 drivers/gpu/drm/i915/i915_perf.c              |   2 +-
 drivers/gpu/drm/i915/i915_request.c           | 245 ++++++++++--------
 drivers/gpu/drm/i915/i915_request.h           |   7 +
 drivers/gpu/drm/i915/intel_overlay.c          |   5 +-
 drivers/gpu/drm/i915/selftests/i915_active.c  |   2 +-
 .../drm/i915/selftests/i915_gem_coherency.c   |   2 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  |   2 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |   9 +-
 .../gpu/drm/i915/selftests/i915_timeline.c    |   4 +-
 13 files changed, 175 insertions(+), 124 deletions(-)
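
For illustration only (not part of the patch): the calling convention
after this change, mirroring conversions such as restart_work() below.
The helper name emit_kernel_request() is invented; the locking comments
describe the new i915_request_create()/i915_request_add() pair, which
now bracket the timeline mutex for the caller.

static int emit_kernel_request(struct intel_engine_cs *engine)
{
	struct i915_request *rq;

	/* Takes and returns holding ce->ring->timeline->mutex */
	rq = i915_request_create(engine->kernel_context);
	if (IS_ERR(rq))
		return PTR_ERR(rq);

	/* ... emit commands, e.g. via intel_ring_begin() ... */

	i915_request_add(rq); /* commits the request, drops the mutex */
	return 0;
}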

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 95b1fdc5826a..9aeef88176b9 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -104,4 +104,16 @@ static inline void intel_context_put(struct intel_context *ce)
 	kref_put(&ce->ref, ce->ops->destroy);
 }
 
+static inline void intel_context_timeline_lock(struct intel_context *ce)
+	__acquires(&ce->ring->timeline->mutex)
+{
+	mutex_lock(&ce->ring->timeline->mutex);
+}
+
+static inline void intel_context_timeline_unlock(struct intel_context *ce)
+	__releases(&ce->ring->timeline->mutex)
+{
+	mutex_unlock(&ce->ring->timeline->mutex);
+}
+
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index dce49b8b1118..a5501b136f0c 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -770,7 +770,7 @@ static void restart_work(struct work_struct *work)
 		if (!intel_engine_is_idle(engine))
 			continue;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (!IS_ERR(rq))
 			i915_request_add(rq);
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index 5788e5ab818f..63bf092d737b 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -1778,7 +1778,6 @@ static int switch_context(struct i915_request *rq)
 	u32 hw_flags = 0;
 	int ret, i;
 
-	lockdep_assert_held(&rq->i915->drm.struct_mutex);
 	GEM_BUG_ON(HAS_EXECLISTS(rq->i915));
 
 	if (ppgtt) {
@@ -1908,8 +1907,6 @@ static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes)
 	struct i915_request *target;
 	long timeout;
 
-	lockdep_assert_held(&ring->vma->vm->i915->drm.struct_mutex);
-
 	if (intel_ring_update_space(ring) >= bytes)
 		return 0;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 170500d49049..de3404479bc3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -942,7 +942,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
 		struct intel_ring *ring;
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (IS_ERR(rq))
 			return PTR_ERR(rq);
 
@@ -1188,7 +1188,7 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 	/* Submitting requests etc needs the hw awake. */
 	wakeref = intel_runtime_pm_get(i915);
 
-	rq = i915_request_alloc(ce->engine, i915->kernel_context);
+	rq = i915_request_create(ce->engine->kernel_context);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 		goto out_put;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index a87f790335c1..328a740e72cb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1762,7 +1762,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	 * Apply the configuration by doing one context restore of the edited
 	 * context image.
 	 */
-	rq = i915_request_alloc(engine, dev_priv->kernel_context);
+	rq = i915_request_create(engine->kernel_context);
 	if (IS_ERR(rq))
 		return PTR_ERR(rq);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e72e4087bbfe..ca4c200ee7ec 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -550,7 +550,7 @@ static void ring_retire_requests(struct intel_ring *ring)
 }
 
 static noinline struct i915_request *
-i915_request_alloc_slow(struct intel_context *ce)
+request_alloc_slow(struct intel_context *ce, gfp_t gfp)
 {
 	struct intel_ring *ring = ce->ring;
 	struct i915_request *rq;
@@ -558,6 +558,9 @@ i915_request_alloc_slow(struct intel_context *ce)
 	if (list_empty(&ring->request_list))
 		goto out;
 
+	if (!gfpflags_allow_blocking(gfp))
+		goto out;
+
 	/* Ratelimit ourselves to prevent oom from malicious clients */
 	rq = list_last_entry(&ring->request_list, typeof(*rq), ring_link);
 	cond_synchronize_rcu(rq->rcustate);
@@ -566,62 +569,21 @@ i915_request_alloc_slow(struct intel_context *ce)
 	ring_retire_requests(ring);
 
 out:
-	return kmem_cache_alloc(global.slab_requests, GFP_KERNEL);
+	return kmem_cache_alloc(global.slab_requests, gfp);
 }
 
-/**
- * i915_request_alloc - allocate a request structure
- *
- * @engine: engine that we wish to issue the request on.
- * @ctx: context that the request will be associated with.
- *
- * Returns a pointer to the allocated request if successful,
- * or an error code if not.
- */
 struct i915_request *
-i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
+__i915_request_create(struct intel_context *ce, gfp_t gfp)
 {
-	struct drm_i915_private *i915 = engine->i915;
-	struct intel_context *ce;
-	struct i915_timeline *tl;
+	struct i915_timeline *tl = ce->ring->timeline;
 	struct i915_request *rq;
 	u32 seqno;
 	int ret;
 
-	lockdep_assert_held(&i915->drm.struct_mutex);
-
-	/*
-	 * Preempt contexts are reserved for exclusive use to inject a
-	 * preemption context switch. They are never to be used for any trivial
-	 * request!
-	 */
-	GEM_BUG_ON(ctx == i915->preempt_context);
-
-	/*
-	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged.
-	 */
-	ret = i915_terminally_wedged(i915);
-	if (ret)
-		return ERR_PTR(ret);
-
-	/*
-	 * Pinning the contexts may generate requests in order to acquire
-	 * GGTT space, so do this first before we reserve a seqno for
-	 * ourselves.
-	 */
-	ce = intel_context_pin(ctx, engine);
-	if (IS_ERR(ce))
-		return ERR_CAST(ce);
-
-	mutex_lock(&ce->ring->timeline->mutex);
-	intel_context_enter(ce);
+	might_sleep_if(gfpflags_allow_blocking(gfp));
 
-	/* Move our oldest request to the slab-cache (if not in use!) */
-	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
-	if (!list_is_last(&rq->ring_link, &ce->ring->request_list) &&
-	    i915_request_completed(rq))
-		i915_request_retire(rq);
+	/* Check that the caller provided an already pinned context */
+	__intel_context_pin(ce);
 
 	/*
 	 * Beware: Dragons be flying overhead.
@@ -653,30 +615,26 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * Do not use kmem_cache_zalloc() here!
 	 */
 	rq = kmem_cache_alloc(global.slab_requests,
-			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+			      gfp | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	if (unlikely(!rq)) {
-		rq = i915_request_alloc_slow(ce);
+		rq = request_alloc_slow(ce, gfp);
 		if (!rq) {
 			ret = -ENOMEM;
 			goto err_unreserve;
 		}
 	}
 
-	INIT_LIST_HEAD(&rq->active_list);
-	INIT_LIST_HEAD(&rq->execute_cb);
-
-	tl = ce->ring->timeline;
 	ret = i915_timeline_get_seqno(tl, rq, &seqno);
 	if (ret)
 		goto err_free;
 
-	rq->i915 = i915;
-	rq->engine = engine;
-	rq->gem_context = ctx;
+	rq->i915 = ce->engine->i915;
 	rq->hw_context = ce;
+	rq->gem_context = ce->gem_context;
+	rq->engine = ce->engine;
 	rq->ring = ce->ring;
 	rq->timeline = tl;
-	GEM_BUG_ON(rq->timeline == &engine->timeline);
+	GEM_BUG_ON(rq->timeline == &ce->engine->timeline);
 	rq->hwsp_seqno = tl->hwsp_seqno;
 	rq->hwsp_cacheline = tl->hwsp_cacheline;
 	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
@@ -696,6 +654,9 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq->capture_list = NULL;
 	rq->waitboost = false;
 
+	INIT_LIST_HEAD(&rq->active_list);
+	INIT_LIST_HEAD(&rq->execute_cb);
+
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
 	 * eventually emit this request. This is to guarantee that the
@@ -708,7 +669,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * around inside i915_request_add() there is sufficient space at
 	 * the beginning of the ring as well.
 	 */
-	rq->reserved_space = 2 * engine->emit_fini_breadcrumb_dw * sizeof(u32);
+	rq->reserved_space =
+		2 * rq->engine->emit_fini_breadcrumb_dw * sizeof(u32);
 
 	/*
 	 * Record the position of the start of the request so that
@@ -718,20 +680,16 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	rq->head = rq->ring->emit;
 
-	ret = engine->request_alloc(rq);
+	ret = rq->engine->request_alloc(rq);
 	if (ret)
 		goto err_unwind;
 
-	/* Keep a second pin for the dual retirement along engine and ring */
-	__intel_context_pin(ce);
-
 	rq->infix = rq->ring->emit; /* end of header; start of user payload */
 
-	/* Check that we didn't interrupt ourselves with a new request */
-	lockdep_assert_held(&rq->timeline->mutex);
-	GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
-	rq->cookie = lockdep_pin_lock(&rq->timeline->mutex);
+	/* Keep a second pin for the dual retirement along engine and ring */
+	__intel_context_pin(ce);
 
+	intel_context_mark_active(ce);
 	return rq;
 
 err_unwind:
@@ -745,12 +703,86 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 err_free:
 	kmem_cache_free(global.slab_requests, rq);
 err_unreserve:
-	intel_context_exit(ce);
-	mutex_unlock(&ce->ring->timeline->mutex);
 	intel_context_unpin(ce);
 	return ERR_PTR(ret);
 }
 
+struct i915_request *
+i915_request_create(struct intel_context *ce)
+{
+	struct i915_request *rq;
+
+	intel_context_timeline_lock(ce);
+
+	/* Move our oldest request to the slab-cache (if not in use!) */
+	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
+	if (!list_is_last(&rq->ring_link, &ce->ring->request_list) &&
+	    i915_request_completed(rq))
+		i915_request_retire(rq);
+
+	intel_context_enter(ce);
+	rq = __i915_request_create(ce, GFP_KERNEL);
+	intel_context_exit(ce);
+	if (IS_ERR(rq))
+		goto err_unlock;
+
+	/* Check that we do not interrupt ourselves with a new request */
+	rq->cookie = lockdep_pin_lock(&ce->ring->timeline->mutex);
+
+	return rq;
+
+err_unlock:
+	intel_context_timeline_unlock(ce);
+	return rq;
+}
+
+/**
+ * i915_request_alloc - allocate a request structure
+ *
+ * @engine: engine that we wish to issue the request on.
+ * @ctx: context that the request will be associated with.
+ *
+ * Returns a pointer to the allocated request if successful,
+ * or an error code if not.
+ */
+struct i915_request *
+i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
+{
+	struct drm_i915_private *i915 = engine->i915;
+	struct intel_context *ce;
+	struct i915_request *rq;
+	int ret;
+
+	/*
+	 * Preempt contexts are reserved for exclusive use to inject a
+	 * preemption context switch. They are never to be used for any trivial
+	 * request!
+	 */
+	GEM_BUG_ON(ctx == i915->preempt_context);
+
+	/*
+	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
+	 * EIO if the GPU is already wedged.
+	 */
+	ret = i915_terminally_wedged(i915);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	ce = intel_context_pin(ctx, engine);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
+
+	rq = i915_request_create(ce);
+	intel_context_unpin(ce);
+
+	return rq;
+}
+
 static int
 emit_semaphore_wait(struct i915_request *to,
 		    struct i915_request *from,
@@ -1005,8 +1037,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 	 * precludes optimising to use semaphores serialisation of a single
 	 * timeline across engines.
 	 */
-	prev = i915_active_request_raw(&timeline->last_request,
-				       &rq->i915->drm.struct_mutex);
+	prev = rcu_dereference_protected(timeline->last_request.request, 1);
 	if (prev && !i915_request_completed(prev)) {
 		if (is_power_of_2(prev->engine->mask | rq->engine->mask))
 			i915_sw_fence_await_sw_fence(&rq->submit,
@@ -1027,6 +1058,11 @@ __i915_request_add_to_timeline(struct i915_request *rq)
 	list_add_tail(&rq->link, &timeline->requests);
 	spin_unlock_irq(&timeline->lock);
 
+	/*
+	 * Make sure that no request gazumped us - if it was allocated after
+	 * our i915_request_alloc() and called __i915_request_add() before
+	 * us, the timeline will hold its seqno which is later than ours.
+	 */
 	GEM_BUG_ON(timeline->seqno != rq->fence.seqno);
 	__i915_active_request_set(&timeline->last_request, rq);
 
@@ -1038,36 +1074,23 @@ __i915_request_add_to_timeline(struct i915_request *rq)
  * request is not being tracked for completion but the work itself is
  * going to happen on the hardware. This would be a Bad Thing(tm).
  */
-void i915_request_add(struct i915_request *request)
+struct i915_request *__i915_request_commit(struct i915_request *rq)
 {
-	struct intel_engine_cs *engine = request->engine;
-	struct i915_timeline *timeline = request->timeline;
-	struct intel_ring *ring = request->ring;
+	struct intel_engine_cs *engine = rq->engine;
+	struct intel_ring *ring = rq->ring;
 	struct i915_request *prev;
 	u32 *cs;
 
 	GEM_TRACE("%s fence %llx:%lld\n",
-		  engine->name, request->fence.context, request->fence.seqno);
-
-	lockdep_assert_held(&request->timeline->mutex);
-	lockdep_unpin_lock(&request->timeline->mutex, request->cookie);
-
-	trace_i915_request_add(request);
-
-	/*
-	 * Make sure that no request gazumped us - if it was allocated after
-	 * our i915_request_alloc() and called __i915_request_add() before
-	 * us, the timeline will hold its seqno which is later than ours.
-	 */
-	GEM_BUG_ON(timeline->seqno != request->fence.seqno);
+		  engine->name, rq->fence.context, rq->fence.seqno);
 
 	/*
 	 * To ensure that this call will not fail, space for its emissions
 	 * should already have been reserved in the ring buffer. Let the ring
 	 * know that it is time to use that space up.
 	 */
-	GEM_BUG_ON(request->reserved_space > request->ring->space);
-	request->reserved_space = 0;
+	GEM_BUG_ON(rq->reserved_space > ring->space);
+	rq->reserved_space = 0;
 
 	/*
 	 * Record the position of the start of the breadcrumb so that
@@ -1075,17 +1098,17 @@ void i915_request_add(struct i915_request *request)
 	 * GPU processing the request, we never over-estimate the
 	 * position of the ring's HEAD.
 	 */
-	cs = intel_ring_begin(request, engine->emit_fini_breadcrumb_dw);
+	cs = intel_ring_begin(rq, engine->emit_fini_breadcrumb_dw);
 	GEM_BUG_ON(IS_ERR(cs));
-	request->postfix = intel_ring_offset(request, cs);
+	rq->postfix = intel_ring_offset(rq, cs);
 
-	prev = __i915_request_add_to_timeline(request);
+	prev = __i915_request_add_to_timeline(rq);
 
-	list_add_tail(&request->ring_link, &ring->request_list);
-	if (list_is_first(&request->ring_link, &ring->request_list))
-		list_add(&ring->active_link, &request->i915->gt.active_rings);
-	request->i915->gt.active_engines |= request->engine->mask;
-	request->emitted_jiffies = jiffies;
+	list_add_tail(&rq->ring_link, &ring->request_list);
+	if (list_is_first(&rq->ring_link, &ring->request_list))
+		list_add(&ring->active_link, &rq->i915->gt.active_rings);
+	rq->i915->gt.active_engines |= rq->engine->mask;
+	rq->emitted_jiffies = jiffies;
 
 	/*
 	 * Let the backend know a new request has arrived that may need
@@ -1101,7 +1124,7 @@ void i915_request_add(struct i915_request *request)
 	local_bh_disable();
 	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
 	if (engine->schedule) {
-		struct i915_sched_attr attr = request->gem_context->sched;
+		struct i915_sched_attr attr = rq->gem_context->sched;
 
 		/*
 		 * Boost actual workloads past semaphores!
@@ -1115,7 +1138,7 @@ void i915_request_add(struct i915_request *request)
 		 * far in the distance past over useful work, we keep a history
 		 * of any semaphore use along our dependency chain.
 		 */
-		if (!(request->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
+		if (!(rq->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
 			attr.priority |= I915_PRIORITY_NOSEMAPHORE;
 
 		/*
@@ -1124,15 +1147,29 @@ void i915_request_add(struct i915_request *request)
 		 * Allow interactive/synchronous clients to jump ahead of
 		 * the bulk clients. (FQ_CODEL)
 		 */
-		if (list_empty(&request->sched.signalers_list))
+		if (list_empty(&rq->sched.signalers_list))
 			attr.priority |= I915_PRIORITY_NEWCLIENT;
 
-		engine->schedule(request, &attr);
+		engine->schedule(rq, &attr);
 	}
 	rcu_read_unlock();
-	i915_sw_fence_commit(&request->submit);
+	i915_sw_fence_commit(&rq->submit);
 	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
 
+	return prev;
+}
+
+void i915_request_add(struct i915_request *rq)
+{
+	struct i915_request *prev;
+
+	lockdep_assert_held(&rq->timeline->mutex);
+	lockdep_unpin_lock(&rq->timeline->mutex, rq->cookie);
+
+	trace_i915_request_add(rq);
+
+	prev = __i915_request_commit(rq);
+
 	/*
 	 * In typical scenarios, we do not expect the previous request on
 	 * the timeline to be still tracked by timeline->last_request if it
@@ -1153,7 +1190,7 @@ void i915_request_add(struct i915_request *request)
 	if (prev && i915_request_completed(prev))
 		i915_request_retire_upto(prev);
 
-	mutex_unlock(&request->timeline->mutex);
+	mutex_unlock(&rq->timeline->mutex);
 }
 
 static unsigned long local_clock_us(unsigned int *cpu)
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 875be6f71412..9a50072171f8 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -238,6 +238,13 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence)
 	return fence->ops == &i915_fence_ops;
 }
 
+struct i915_request * __must_check
+__i915_request_create(struct intel_context *ce, gfp_t gfp);
+struct i915_request * __must_check
+i915_request_create(struct intel_context *ce);
+
+struct i915_request *__i915_request_commit(struct i915_request *request);
+
 struct i915_request * __must_check
 i915_request_alloc(struct intel_engine_cs *engine,
 		   struct i915_gem_context *ctx);
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index a882b8d42bd9..fea8c5869da9 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -235,10 +235,9 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 
 static struct i915_request *alloc_request(struct intel_overlay *overlay)
 {
-	struct drm_i915_private *dev_priv = overlay->i915;
-	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
+	struct intel_engine_cs *engine = overlay->i915->engine[RCS0];
 
-	return i915_request_alloc(engine, dev_priv->kernel_context);
+	return i915_request_create(engine->kernel_context);
 }
 
 /* overlay needs to be disable in OCMD reg */
diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
index 27d8f853111b..eee838dc0634 100644
--- a/drivers/gpu/drm/i915/selftests/i915_active.c
+++ b/drivers/gpu/drm/i915/selftests/i915_active.c
@@ -46,7 +46,7 @@ static int __live_active_setup(struct drm_i915_private *i915,
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			break;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
index e43630b40fce..046a38743152 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
@@ -202,7 +202,7 @@ static int gpu_set(struct drm_i915_gem_object *obj,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
+	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
 	if (IS_ERR(rq)) {
 		i915_vma_unpin(vma);
 		return PTR_ERR(rq);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 12fc53c694a6..12203d665a4e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -468,7 +468,7 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
 	if (err)
 		return err;
 
-	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
+	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
 	if (IS_ERR(rq)) {
 		i915_vma_unpin(vma);
 		return PTR_ERR(rq);
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index e6ffe2240126..098d7b3aa131 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -551,8 +551,7 @@ static int live_nop_request(void *arg)
 			times[1] = ktime_get_raw();
 
 			for (n = 0; n < prime; n++) {
-				request = i915_request_alloc(engine,
-							     i915->kernel_context);
+				request = i915_request_create(engine->kernel_context);
 				if (IS_ERR(request)) {
 					err = PTR_ERR(request);
 					goto out_unlock;
@@ -649,7 +648,7 @@ empty_request(struct intel_engine_cs *engine,
 	struct i915_request *request;
 	int err;
 
-	request = i915_request_alloc(engine, engine->i915->kernel_context);
+	request = i915_request_create(engine->kernel_context);
 	if (IS_ERR(request))
 		return request;
 
@@ -853,7 +852,7 @@ static int live_all_engines(void *arg)
 	}
 
 	for_each_engine(engine, i915, id) {
-		request[id] = i915_request_alloc(engine, i915->kernel_context);
+		request[id] = i915_request_create(engine->kernel_context);
 		if (IS_ERR(request[id])) {
 			err = PTR_ERR(request[id]);
 			pr_err("%s: Request allocation failed with err=%d\n",
@@ -962,7 +961,7 @@ static int live_sequential_engines(void *arg)
 			goto out_unlock;
 		}
 
-		request[id] = i915_request_alloc(engine, i915->kernel_context);
+		request[id] = i915_request_create(engine->kernel_context);
 		if (IS_ERR(request[id])) {
 			err = PTR_ERR(request[id]);
 			pr_err("%s: Request allocation failed for %s with err=%d\n",
diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
index 8e7bcaa1eb66..227479349c61 100644
--- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
@@ -454,7 +454,7 @@ tl_write(struct i915_timeline *tl, struct intel_engine_cs *engine, u32 value)
 		goto out;
 	}
 
-	rq = i915_request_alloc(engine, engine->i915->kernel_context);
+	rq = i915_request_create(engine->kernel_context);
 	if (IS_ERR(rq))
 		goto out_unpin;
 
@@ -678,7 +678,7 @@ static int live_hwsp_wrap(void *arg)
 		if (!intel_engine_can_store_dword(engine))
 			continue;
 
-		rq = i915_request_alloc(engine, i915->kernel_context);
+		rq = i915_request_create(engine->kernel_context);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			goto out;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 12/29] drm/i915: Invert the GEM wakeref hierarchy
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (9 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 11/29] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08  9:17 ` [PATCH 13/29] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
power management control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   2 +
 drivers/gpu/drm/i915/gt/intel_context.c       |  18 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   9 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 136 +---------
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     | 162 +++++++++++
 drivers/gpu/drm/i915/gt/intel_engine_pm.h     |  20 ++
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c         | 138 ++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_pm.h         |  27 ++
 drivers/gpu/drm/i915/gt/intel_hangcheck.c     |   7 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   6 +-
 drivers/gpu/drm/i915/gt/intel_reset.c         | 101 +------
 drivers/gpu/drm/i915/gt/intel_reset.h         |   1 -
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |  16 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |   3 +
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  49 +---
 .../gpu/drm/i915/gt/selftest_workarounds.c    |   5 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  16 +-
 drivers/gpu/drm/i915/i915_drv.c               |   5 +-
 drivers/gpu/drm/i915/i915_drv.h               |   8 +-
 drivers/gpu/drm/i915/i915_gem.c               |  41 +--
 drivers/gpu/drm/i915/i915_gem.h               |   3 -
 drivers/gpu/drm/i915/i915_gem_context.c       |  85 +-----
 drivers/gpu/drm/i915/i915_gem_context.h       |   4 -
 drivers/gpu/drm/i915/i915_gem_evict.c         |  47 +---
 drivers/gpu/drm/i915/i915_gem_pm.c            | 256 ++++++------------
 drivers/gpu/drm/i915/i915_gem_pm.h            |   3 -
 drivers/gpu/drm/i915/i915_gpu_error.h         |   4 -
 drivers/gpu/drm/i915/i915_request.c           |  10 +-
 drivers/gpu/drm/i915/i915_request.h           |   2 +-
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  16 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c | 114 +-------
 .../gpu/drm/i915/selftests/i915_gem_object.c  |  29 +-
 .../gpu/drm/i915/selftests/igt_flush_test.c   |  32 ++-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  15 +-
 35 files changed, 590 insertions(+), 807 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pm.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pm.h
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm.h
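
Not part of the patch: a toy userspace model of the inverted hierarchy
described above. Each layer only wakes the layer below on its own first
user, and only sleeps again with its last, so the chipset wakeref now
follows per-engine activity instead of one global GEM wakeref. All names
here are invented for the example.

#include <stdio.h>

static unsigned int gt_wakeref, device_wakeref;

static void device_get(void) { if (!device_wakeref++) puts("device: awake"); }
static void device_put(void) { if (!--device_wakeref) puts("device: asleep"); }

static void gt_get(void) { if (!gt_wakeref++) device_get(); }
static void gt_put(void) { if (!--gt_wakeref) device_put(); }

struct engine { unsigned int wakeref; };

static void engine_get(struct engine *e)
{
	if (!e->wakeref++) /* first request on this engine */
		gt_get();
}

static void engine_put(struct engine *e)
{
	if (!--e->wakeref) /* last request retired: park the engine */
		gt_put();
}

int main(void)
{
	struct engine rcs = { 0 }, bcs = { 0 };

	engine_get(&rcs); /* wakes both the gt and the device */
	engine_get(&bcs); /* gt already awake: no device transition */
	engine_put(&rcs);
	engine_put(&bcs); /* last engine idles: "device: asleep" */
	return 0;
}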

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 858642c7bc40..dd8d923aa1c6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -71,6 +71,8 @@ gt-y += \
 	gt/intel_breadcrumbs.o \
 	gt/intel_context.o \
 	gt/intel_engine_cs.o \
+	gt/intel_engine_pm.o \
+	gt/intel_gt_pm.o \
 	gt/intel_hangcheck.o \
 	gt/intel_lrc.o \
 	gt/intel_reset.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index bf16f00a10a4..6ae6a3f58364 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -10,6 +10,7 @@
 
 #include "intel_context.h"
 #include "intel_engine.h"
+#include "intel_engine_pm.h"
 
 static struct i915_global_context {
 	struct i915_global base;
@@ -162,7 +163,11 @@ intel_context_pin(struct i915_gem_context *ctx,
 		return ERR_PTR(-EINTR);
 
 	if (likely(!atomic_read(&ce->pin_count))) {
-		err = ce->ops->pin(ce);
+		intel_wakeref_t wakeref;
+
+		err = 0;
+		with_intel_runtime_pm(ce->engine->i915, wakeref)
+			err = ce->ops->pin(ce);
 		if (err)
 			goto err;
 
@@ -269,17 +274,10 @@ int __init i915_global_context_init(void)
 
 void __intel_context_enter(struct intel_context *ce)
 {
-	struct drm_i915_private *i915 = ce->gem_context->i915;
-
-	if (!i915->gt.active_requests++)
-		i915_gem_unpark(i915);
+	intel_engine_pm_get(ce->engine);
 }
 
 void __intel_context_exit(struct intel_context *ce)
 {
-	struct drm_i915_private *i915 = ce->gem_context->i915;
-
-	GEM_BUG_ON(!i915->gt.active_requests);
-	if (!--i915->gt.active_requests)
-		i915_gem_park(i915);
+	intel_engine_pm_put(ce->engine);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 0dea6c7fd438..a17152e96bf8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -382,6 +382,8 @@ u64 intel_engine_get_last_batch_head(const struct intel_engine_cs *engine);
 void intel_engine_get_instdone(struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone);
 
+void intel_engine_init_execlists(struct intel_engine_cs *engine);
+
 void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 
@@ -458,19 +460,14 @@ static inline void intel_engine_reset(struct intel_engine_cs *engine,
 {
 	if (engine->reset.reset)
 		engine->reset.reset(engine, stalled);
+	engine->serial++; /* contexts lost */
 }
 
-void intel_engines_sanitize(struct drm_i915_private *i915, bool force);
-void intel_gt_resume(struct drm_i915_private *i915);
-
 bool intel_engine_is_idle(struct intel_engine_cs *engine);
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
 
 void intel_engine_lost_context(struct intel_engine_cs *engine);
 
-void intel_engines_park(struct drm_i915_private *i915);
-void intel_engines_unpark(struct drm_i915_private *i915);
-
 void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index fdfce11e4944..d378b276e476 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -27,6 +27,7 @@
 #include "i915_drv.h"
 
 #include "intel_engine.h"
+#include "intel_engine_pm.h"
 #include "intel_lrc.h"
 #include "intel_reset.h"
 
@@ -451,7 +452,7 @@ static void intel_engine_init_batch_pool(struct intel_engine_cs *engine)
 	i915_gem_batch_pool_init(&engine->batch_pool, engine);
 }
 
-static void intel_engine_init_execlist(struct intel_engine_cs *engine)
+void intel_engine_init_execlists(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 
@@ -584,10 +585,11 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
 	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
 
 	intel_engine_init_breadcrumbs(engine);
-	intel_engine_init_execlist(engine);
+	intel_engine_init_execlists(engine);
 	intel_engine_init_hangcheck(engine);
 	intel_engine_init_batch_pool(engine);
 	intel_engine_init_cmd_parser(engine);
+	intel_engine_init__pm(engine);
 
 	/* Use the whole device by default */
 	engine->sseu =
@@ -758,24 +760,6 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 	return ret;
 }
 
-void intel_gt_resume(struct drm_i915_private *i915)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	/*
-	 * After resume, we may need to poke into the pinned kernel
-	 * contexts to paper over any damage caused by the sudden suspend.
-	 * Only the kernel contexts should remain pinned over suspend,
-	 * allowing us to fixup the user contexts on their first pin.
-	 */
-	for_each_engine(engine, i915, id) {
-		struct intel_context *ce = engine->kernel_context;
-
-		ce->ops->reset(ce);
-	}
-}
-
 /**
  * intel_engines_cleanup_common - cleans up the engine state created by
 *                                the common initializers.
@@ -1122,117 +1106,6 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915)
 		engine->set_default_submission(engine);
 }
 
-static bool reset_engines(struct drm_i915_private *i915)
-{
-	if (INTEL_INFO(i915)->gpu_reset_clobbers_display)
-		return false;
-
-	return intel_gpu_reset(i915, ALL_ENGINES) == 0;
-}
-
-/**
- * intel_engines_sanitize: called after the GPU has lost power
- * @i915: the i915 device
- * @force: ignore a failed reset and sanitize engine state anyway
- *
- * Anytime we reset the GPU, either with an explicit GPU reset or through a
- * PCI power cycle, the GPU loses state and we must reset our state tracking
- * to match. Note that calling intel_engines_sanitize() if the GPU has not
- * been reset results in much confusion!
- */
-void intel_engines_sanitize(struct drm_i915_private *i915, bool force)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	GEM_TRACE("\n");
-
-	if (!reset_engines(i915) && !force)
-		return;
-
-	for_each_engine(engine, i915, id)
-		intel_engine_reset(engine, false);
-}
-
-/**
- * intel_engines_park: called when the GT is transitioning from busy->idle
- * @i915: the i915 device
- *
- * The GT is now idle and about to go to sleep (maybe never to wake again?).
- * Time for us to tidy and put away our toys (release resources back to the
- * system).
- */
-void intel_engines_park(struct drm_i915_private *i915)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	for_each_engine(engine, i915, id) {
-		/* Flush the residual irq tasklets first. */
-		intel_engine_disarm_breadcrumbs(engine);
-		tasklet_kill(&engine->execlists.tasklet);
-
-		/*
-		 * We are committed now to parking the engines, make sure there
-		 * will be no more interrupts arriving later and the engines
-		 * are truly idle.
-		 */
-		if (wait_for(intel_engine_is_idle(engine), 10)) {
-			struct drm_printer p = drm_debug_printer(__func__);
-
-			dev_err(i915->drm.dev,
-				"%s is not idle before parking\n",
-				engine->name);
-			intel_engine_dump(engine, &p, NULL);
-		}
-
-		/* Must be reset upon idling, or we may miss the busy wakeup. */
-		GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
-
-		if (engine->park)
-			engine->park(engine);
-
-		if (engine->pinned_default_state) {
-			i915_gem_object_unpin_map(engine->default_state);
-			engine->pinned_default_state = NULL;
-		}
-
-		i915_gem_batch_pool_fini(&engine->batch_pool);
-		engine->execlists.no_priolist = false;
-	}
-
-	i915->gt.active_engines = 0;
-}
-
-/**
- * intel_engines_unpark: called when the GT is transitioning from idle->busy
- * @i915: the i915 device
- *
- * The GT was idle and now about to fire up with some new user requests.
- */
-void intel_engines_unpark(struct drm_i915_private *i915)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	for_each_engine(engine, i915, id) {
-		void *map;
-
-		/* Pin the default state for fast resets from atomic context. */
-		map = NULL;
-		if (engine->default_state)
-			map = i915_gem_object_pin_map(engine->default_state,
-						      I915_MAP_WB);
-		if (!IS_ERR_OR_NULL(map))
-			engine->pinned_default_state = map;
-
-		if (engine->unpark)
-			engine->unpark(engine);
-
-		intel_engine_init_hangcheck(engine);
-	}
-}
-
 /**
  * intel_engine_lost_context: called when the GPU is reset into unknown state
  * @engine: the engine
@@ -1524,6 +1397,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 	if (i915_reset_failed(engine->i915))
 		drm_printf(m, "*** WEDGED ***\n");
 
+	drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
 	drm_printf(m, "\tHangcheck %x:%x [%d ms]\n",
 		   engine->hangcheck.last_seqno,
 		   engine->hangcheck.next_seqno,
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
new file mode 100644
index 000000000000..097822e4d539
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -0,0 +1,162 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+
+#include "intel_engine.h"
+#include "intel_engine_pm.h"
+#include "intel_gt_pm.h"
+
+static bool intel_engine_unpark(struct intel_wakeref_count *wc)
+{
+	struct intel_engine_cs *engine =
+		container_of(wc, typeof(*engine), wakeref);
+	void *map;
+
+	GEM_TRACE("%s\n", engine->name);
+
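+	/*
+	 * The engine wakeref pins the GT wakeref: the first engine to
+	 * wake powers up the device; the last to park releases it.
+	 */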
+	intel_gt_pm_get(engine->i915);
+
+	/* Pin the default state for fast resets from atomic context. */
+	map = NULL;
+	if (engine->default_state)
+		map = i915_gem_object_pin_map(engine->default_state,
+					      I915_MAP_WB);
+	if (!IS_ERR_OR_NULL(map))
+		engine->pinned_default_state = map;
+
+	if (engine->unpark)
+		engine->unpark(engine);
+
+	intel_engine_init_hangcheck(engine);
+	return true;
+}
+
+void intel_engine_pm_get(struct intel_engine_cs *engine)
+{
+	intel_wakeref_get_once(engine->i915,
+			       &engine->wakeref,
+			       intel_engine_unpark);
+}
+
+static bool switch_to_kernel_context(struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+
+	if (i915_reset_failed(engine->i915))
+		return false;
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * Note, we do this without taking the timeline->mutex. We cannot
+	 * as we may be called while retiring the kernel context and so
+	 * already underneath the timeline->mutex. Instead we rely on the
+	 * exclusive property of the intel_engine_park that prevents anyone
+	 * else from creating a request on this engine. This also requires
+	 * that the ring is empty and we avoid any waits while constructing
+	 * the context, as they assume protection by the timeline->mutex.
+	 * This should hold true as we can only park the engine after
+	 * retiring the last request, thus all rings should be empty and
+	 * all timelines idle.
+	 */
+	rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
+	if (IS_ERR(rq))
+		return false;
+
+	__i915_request_commit(rq);
+	return true;
+}
+
+static bool intel_engine_park(struct intel_wakeref_count *wc)
+{
+	struct intel_engine_cs *engine =
+		container_of(wc, typeof(*engine), wakeref);
+
+	GEM_TRACE("%s\n", engine->name);
+
+	/*
+	 * If one and only one request is completed between pm events,
+	 * we know that we are inside the kernel context and it is
+	 * safe to power down. (We are paranoid in case runtime suspend
+	 * corrupts the active context image, and want to avoid that
+	 * impacting userspace.)
+	 */
+	if (engine->wakeref_serial != engine->serial) {
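+		/*
+		 * The +1 anticipates the kernel context request we are
+		 * about to submit; once it is in flight the serials match
+		 * again and the next park attempt can power down for real.
+		 */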
+		engine->wakeref_serial = engine->serial + 1;
+
+		if (switch_to_kernel_context(engine))
+			return false;
+
+		/*
+		 * Request submission failed? Oh no, pretend like nothing
+		 * happened and hope for the best. Alternatively, we
+		 * could keep the pm raised until we attempt recovery
+		 * by worker or system suspend.
+		 */
+		engine->wakeref_serial--;
+	}
+
+	intel_engine_disarm_breadcrumbs(engine);
+
+	/* Must be reset upon idling, or we may miss the busy wakeup. */
+	GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
+
+	if (engine->park)
+		engine->park(engine);
+
+	if (engine->pinned_default_state) {
+		i915_gem_object_unpin_map(engine->default_state);
+		engine->pinned_default_state = NULL;
+	}
+
+	engine->execlists.no_priolist = false;
+
+	intel_gt_pm_put(engine->i915);
+	return true;
+}
+
+void intel_engine_pm_put(struct intel_engine_cs *engine)
+{
+	intel_wakeref_put_once(engine->i915,
+			       &engine->wakeref,
+			       intel_engine_park);
+}
+
+void intel_engine_init__pm(struct intel_engine_cs *engine)
+{
+	intel_wakeref_init(&engine->wakeref);
+}
+
+int intel_engines_resume(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = 0;
+
+	/*
+	 * After resume, we may need to poke into the pinned kernel
+	 * contexts to paper over any damage caused by the sudden suspend.
+	 * Only the kernel contexts should remain pinned over suspend,
+	 * allowing us to fixup the user contexts on their first pin.
+	 */
+	intel_gt_pm_get(i915);
+	for_each_engine(engine, i915, id) {
+		intel_engine_pm_get(engine);
+		engine->serial++; /* kernel context lost */
+		err = engine->resume(engine);
+		intel_engine_pm_put(engine);
+		if (err) {
+			dev_err(i915->drm.dev,
+				"Failed to restart %s (%d)\n",
+				engine->name, err);
+			break;
+		}
+	}
+	intel_gt_pm_put(i915);
+
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
new file mode 100644
index 000000000000..143ac90ba117
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
@@ -0,0 +1,20 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef INTEL_ENGINE_PM_H
+#define INTEL_ENGINE_PM_H
+
+struct drm_i915_private;
+struct intel_engine_cs;
+
+void intel_engine_pm_get(struct intel_engine_cs *engine);
+void intel_engine_pm_put(struct intel_engine_cs *engine);
+
+void intel_engine_init__pm(struct intel_engine_cs *engine);
+
+int intel_engines_resume(struct drm_i915_private *i915);
+
+#endif /* INTEL_ENGINE_PM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 51faf6d40ba5..cd13c3840438 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -20,6 +20,7 @@
 #include "i915_selftest.h"
 #include "i915_timeline_types.h"
 #include "intel_sseu.h"
+#include "intel_wakeref.h"
 #include "intel_workarounds_types.h"
 
 #define I915_MAX_SLICES	3
@@ -282,6 +283,10 @@ struct intel_engine_cs {
 	struct intel_context *kernel_context; /* pinned */
 	struct intel_context *preempt_context; /* pinned; optional */
 
+	unsigned long serial;
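+	/*
+	 * serial counts submissions to this engine; wakeref_serial is the
+	 * snapshot taken on the last parking attempt.
+	 */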
+
+	unsigned long wakeref_serial;
+	struct intel_wakeref_count wakeref;
 	struct drm_i915_gem_object *default_state;
 	void *pinned_default_state;
 
@@ -354,7 +359,7 @@ struct intel_engine_cs {
 	void		(*irq_enable)(struct intel_engine_cs *engine);
 	void		(*irq_disable)(struct intel_engine_cs *engine);
 
-	int		(*init_hw)(struct intel_engine_cs *engine);
+	int		(*resume)(struct intel_engine_cs *engine);
 
 	struct {
 		void (*prepare)(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
new file mode 100644
index 000000000000..7d967b29d6e1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -0,0 +1,138 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_gt_pm.h"
+#include "intel_pm.h"
+#include "intel_wakeref.h"
+
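+/* Broadcast park/unpark transitions to listeners such as the GEM layer. */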
+static void pm_notify(struct drm_i915_private *i915, int state)
+{
+	blocking_notifier_call_chain(&i915->gt.pm_notifications,
+				     state, i915);
+}
+
+static bool intel_gt_unpark(struct intel_wakeref_count *wc)
+{
+	struct drm_i915_private *i915 =
+		container_of(wc, typeof(*i915), gt.wakeref);
+
+	GEM_TRACE("\n");
+
+	/*
+	 * It seems that the DMC likes to transition between the DC states a lot
+	 * when there are no connected displays (no active power domains) during
+	 * command submission.
+	 *
+	 * This activity has negative impact on the performance of the chip with
+	 * huge latencies observed in the interrupt handler and elsewhere.
+	 *
+	 * Work around it by grabbing a GT IRQ power domain whilst there is any
+	 * GT activity, preventing any DC state transitions.
+	 */
+	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+	GEM_BUG_ON(!i915->gt.awake);
+
+	intel_enable_gt_powersave(i915);
+
+	i915_update_gfx_val(i915);
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_busy(i915);
+
+	i915_pmu_gt_unparked(i915);
+
+	i915_queue_hangcheck(i915);
+
+	pm_notify(i915, INTEL_GT_UNPARK);
+
+	return true;
+}
+
+void intel_gt_pm_get(struct drm_i915_private *i915)
+{
+	intel_wakeref_get_once(i915, &i915->gt.wakeref, intel_gt_unpark);
+}
+
+static bool intel_gt_park(struct intel_wakeref_count *wc)
+{
+	struct drm_i915_private *i915 =
+		container_of(wc, typeof(*i915), gt.wakeref);
+	intel_wakeref_t wakeref = fetch_and_zero(&i915->gt.awake);
+
+	GEM_TRACE("\n");
+
+	pm_notify(i915, INTEL_GT_PARK);
+
+	i915_pmu_gt_parked(i915);
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_idle(i915);
+
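+	/* Release the POWER_DOMAIN_GT_IRQ reference taken at unpark. */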
+	GEM_BUG_ON(!wakeref);
+	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
+
+	return true;
+}
+
+void intel_gt_pm_put(struct drm_i915_private *i915)
+{
+	intel_wakeref_put_once(i915, &i915->gt.wakeref, intel_gt_park);
+}
+
+void intel_gt_pm_init(struct drm_i915_private *i915)
+{
+	intel_wakeref_init(&i915->gt.wakeref);
+	BLOCKING_INIT_NOTIFIER_HEAD(&i915->gt.pm_notifications);
+}
+
+static bool reset_engines(struct drm_i915_private *i915)
+{
+	if (INTEL_INFO(i915)->gpu_reset_clobbers_display)
+		return false;
+
+	return intel_gpu_reset(i915, ALL_ENGINES) == 0;
+}
+
+/**
+ * intel_gt_sanitize: called after the GPU has lost power
+ * @i915: the i915 device
+ * @force: ignore a failed reset and sanitize engine state anyway
+ *
+ * Anytime we reset the GPU, either with an explicit GPU reset or through a
+ * PCI power cycle, the GPU loses state and we must reset our state tracking
+ * to match. Note that calling intel_gt_sanitize() if the GPU has not
+ * been reset results in much confusion!
+ */
+void intel_gt_sanitize(struct drm_i915_private *i915, bool force)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	GEM_TRACE("\n");
+
+	if (!reset_engines(i915) && !force)
+		return;
+
+	for_each_engine(engine, i915, id)
+		intel_engine_reset(engine, false);
+}
+
+void intel_gt_resume(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	/*
+	 * After resume, we may need to poke into the pinned kernel
+	 * contexts to paper over any damage caused by the sudden suspend.
+	 * Only the kernel contexts should remain pinned over suspend,
+	 * allowing us to fixup the user contexts on their first pin.
+	 */
+	for_each_engine(engine, i915, id) {
+		struct intel_context *ce = engine->kernel_context;
+
+		ce->ops->reset(ce);
+	}
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
new file mode 100644
index 000000000000..7dd1130a19a4
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -0,0 +1,27 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef INTEL_GT_PM_H
+#define INTEL_GT_PM_H
+
+#include <linux/types.h>
+
+struct drm_i915_private;
+
+enum {
+	INTEL_GT_UNPARK,
+	INTEL_GT_PARK,
+};
+
+void intel_gt_pm_get(struct drm_i915_private *i915);
+void intel_gt_pm_put(struct drm_i915_private *i915);
+
+void intel_gt_pm_init(struct drm_i915_private *i915);
+
+void intel_gt_sanitize(struct drm_i915_private *i915, bool force);
+void intel_gt_resume(struct drm_i915_private *i915);
+
+#endif /* INTEL_GT_PM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
index 3053a706a561..e5eaa06fe74d 100644
--- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
@@ -256,6 +256,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	unsigned int hung = 0, stuck = 0, wedged = 0;
+	intel_wakeref_t wakeref;
 
 	if (!i915_modparams.enable_hangcheck)
 		return;
@@ -266,6 +267,10 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	if (i915_terminally_wedged(dev_priv))
 		return;
 
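+	/* Asleep, so nothing to check; unparking queues a fresh hangcheck. */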
+	wakeref = intel_runtime_pm_get_if_in_use(dev_priv);
+	if (!wakeref)
+		return;
+
 	/* As enabling the GPU requires fairly extensive mmio access,
 	 * periodically arm the mmio checker to see if we are triggering
 	 * any invalid access.
@@ -313,6 +318,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	if (hung)
 		hangcheck_declare_hang(dev_priv, hung, stuck);
 
+	intel_runtime_pm_put(dev_priv, wakeref);
+
 	/* Reset timer in case GPU hangs without another request being added */
 	i915_queue_hangcheck(dev_priv);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 7da5e49dd385..7a5e6e962e61 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1886,7 +1886,7 @@ static bool unexpected_starting_state(struct intel_engine_cs *engine)
 	return unexpected;
 }
 
-static int gen8_init_common_ring(struct intel_engine_cs *engine)
+static int execlists_resume(struct intel_engine_cs *engine)
 {
 	intel_engine_apply_workarounds(engine);
 	intel_engine_apply_whitelist(engine);
@@ -1919,7 +1919,7 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
 	 * completed the reset in i915_gem_reset_finish(). If a request
 	 * is completed by one engine, it may then queue a request
 	 * to a second via its execlists->tasklet *just* as we are
-	 * calling engine->init_hw() and also writing the ELSP.
+	 * calling engine->resume() and also writing the ELSP.
 	 * Turning off the execlists->tasklet until the reset is over
 	 * prevents the race.
 	 */
@@ -2370,7 +2370,7 @@ static void
 logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 {
 	/* Default vfuncs which can be overridden by each engine. */
-	engine->init_hw = gen8_init_common_ring;
+	engine->resume = execlists_resume;
 
 	engine->reset.prepare = execlists_reset_prepare;
 	engine->reset.reset = execlists_reset;
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5501b136f0c..9421eaeb610e 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -9,6 +9,8 @@
 
 #include "i915_drv.h"
 #include "i915_gpu_error.h"
+#include "intel_engine_pm.h"
+#include "intel_gt_pm.h"
 #include "intel_reset.h"
 
 #include "intel_guc.h"
@@ -667,6 +669,7 @@ static void reset_prepare_engine(struct intel_engine_cs *engine)
 	 * written to the powercontext is undefined and so we may lose
 	 * GPU state upon resume, i.e. fail to restart after a reset.
 	 */
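+	/* Hold the engine (and hence the GT) awake across the reset. */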
+	intel_engine_pm_get(engine);
 	intel_uncore_forcewake_get(engine->uncore, FORCEWAKE_ALL);
 	engine->reset.prepare(engine);
 }
@@ -702,6 +705,7 @@ static void reset_prepare(struct drm_i915_private *i915)
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
+	intel_gt_pm_get(i915);
 	for_each_engine(engine, i915, id)
 		reset_prepare_engine(engine);
 
@@ -739,48 +743,10 @@ static int gt_reset(struct drm_i915_private *i915,
 static void reset_finish_engine(struct intel_engine_cs *engine)
 {
 	engine->reset.finish(engine);
+	intel_engine_pm_put(engine);
 	intel_uncore_forcewake_put(engine->uncore, FORCEWAKE_ALL);
 }
 
-struct i915_gpu_restart {
-	struct work_struct work;
-	struct drm_i915_private *i915;
-};
-
-static void restart_work(struct work_struct *work)
-{
-	struct i915_gpu_restart *arg = container_of(work, typeof(*arg), work);
-	struct drm_i915_private *i915 = arg->i915;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-
-	wakeref = intel_runtime_pm_get(i915);
-	mutex_lock(&i915->drm.struct_mutex);
-	WRITE_ONCE(i915->gpu_error.restart, NULL);
-
-	for_each_engine(engine, i915, id) {
-		struct i915_request *rq;
-
-		/*
-		 * Ostensibily, we always want a context loaded for powersaving,
-		 * so if the engine is idle after the reset, send a request
-		 * to load our scratch kernel_context.
-		 */
-		if (!intel_engine_is_idle(engine))
-			continue;
-
-		rq = i915_request_create(engine->kernel_context);
-		if (!IS_ERR(rq))
-			i915_request_add(rq);
-	}
-
-	mutex_unlock(&i915->drm.struct_mutex);
-	intel_runtime_pm_put(i915, wakeref);
-
-	kfree(arg);
-}
-
 static void reset_finish(struct drm_i915_private *i915)
 {
 	struct intel_engine_cs *engine;
@@ -790,29 +756,7 @@ static void reset_finish(struct drm_i915_private *i915)
 		reset_finish_engine(engine);
 		intel_engine_signal_breadcrumbs(engine);
 	}
-}
-
-static void reset_restart(struct drm_i915_private *i915)
-{
-	struct i915_gpu_restart *arg;
-
-	/*
-	 * Following the reset, ensure that we always reload context for
-	 * powersaving, and to correct engine->last_retired_context. Since
-	 * this requires us to submit a request, queue a worker to do that
-	 * task for us to evade any locking here.
-	 */
-	if (READ_ONCE(i915->gpu_error.restart))
-		return;
-
-	arg = kmalloc(sizeof(*arg), GFP_KERNEL);
-	if (arg) {
-		arg->i915 = i915;
-		INIT_WORK(&arg->work, restart_work);
-
-		WRITE_ONCE(i915->gpu_error.restart, arg);
-		queue_work(i915->wq, &arg->work);
-	}
+	intel_gt_pm_put(i915);
 }
 
 static void nop_submit_request(struct i915_request *request)
@@ -873,6 +817,7 @@ static void __i915_gem_set_wedged(struct drm_i915_private *i915)
 	 * in nop_submit_request.
 	 */
 	synchronize_rcu_expedited();
+	set_bit(I915_WEDGED, &error->flags);
 
 	/* Mark all executing requests as skipped */
 	for_each_engine(engine, i915, id)
@@ -880,9 +825,6 @@ static void __i915_gem_set_wedged(struct drm_i915_private *i915)
 
 	reset_finish(i915);
 
-	smp_mb__before_atomic();
-	set_bit(I915_WEDGED, &error->flags);
-
 	GEM_TRACE("end\n");
 }
 
@@ -940,7 +882,7 @@ static bool __i915_gem_unset_wedged(struct drm_i915_private *i915)
 	}
 	mutex_unlock(&i915->gt.timelines.mutex);
 
-	intel_engines_sanitize(i915, false);
+	intel_gt_sanitize(i915, false);
 
 	/*
 	 * Undo nop_submit_request. We prevent all new i915 requests from
@@ -1018,7 +960,6 @@ void i915_reset(struct drm_i915_private *i915,
 	GEM_TRACE("flags=%lx\n", error->flags);
 
 	might_sleep();
-	assert_rpm_wakelock_held(i915);
 	GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
 
 	/* Clear any previous failed attempts at recovery. Time to try again. */
@@ -1071,8 +1012,6 @@ void i915_reset(struct drm_i915_private *i915,
 
 finish:
 	reset_finish(i915);
-	if (!__i915_wedged(error))
-		reset_restart(i915);
 	return;
 
 taint:
@@ -1121,6 +1060,9 @@ int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
 	GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
 	GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
 
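+	/* If the engine is parked, nothing is in flight to be reset. */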
+	if (!intel_wakeref_active(&engine->wakeref))
+		return 0;
+
 	reset_prepare_engine(engine);
 
 	if (msg)
@@ -1152,7 +1094,7 @@ int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
 	 * have been reset to their default values. Follow the init_ring
 	 * process to program RING_MODE, HWSP and re-enable submission.
 	 */
-	ret = engine->init_hw(engine);
+	ret = engine->resume(engine);
 	if (ret)
 		goto out;
 
@@ -1408,25 +1350,6 @@ int i915_terminally_wedged(struct drm_i915_private *i915)
 	return __i915_wedged(error) ? -EIO : 0;
 }
 
-bool i915_reset_flush(struct drm_i915_private *i915)
-{
-	int err;
-
-	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
-
-	flush_workqueue(i915->wq);
-	GEM_BUG_ON(READ_ONCE(i915->gpu_error.restart));
-
-	mutex_lock(&i915->drm.struct_mutex);
-	err = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_LOCKED |
-				     I915_WAIT_FOR_IDLE_BOOST,
-				     MAX_SCHEDULE_TIMEOUT);
-	mutex_unlock(&i915->drm.struct_mutex);
-
-	return !err;
-}
-
 static void i915_wedge_me(struct work_struct *work)
 {
 	struct i915_wedge_me *w = container_of(work, typeof(*w), work.work);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.h b/drivers/gpu/drm/i915/gt/intel_reset.h
index 8e662bb43a9b..b52efaab4941 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.h
+++ b/drivers/gpu/drm/i915/gt/intel_reset.h
@@ -34,7 +34,6 @@ int i915_reset_engine(struct intel_engine_cs *engine,
 		      const char *reason);
 
 void i915_reset_request(struct i915_request *rq, bool guilty);
-bool i915_reset_flush(struct drm_i915_private *i915);
 
 int __must_check i915_reset_trylock(struct drm_i915_private *i915);
 void i915_reset_unlock(struct drm_i915_private *i915, int tag);
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index 63bf092d737b..748c133b83b4 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -637,12 +637,15 @@ static bool stop_ring(struct intel_engine_cs *engine)
 	return (ENGINE_READ(engine, RING_HEAD) & HEAD_ADDR) == 0;
 }
 
-static int init_ring_common(struct intel_engine_cs *engine)
+static int xcs_resume(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct intel_ring *ring = engine->buffer;
 	int ret = 0;
 
+	GEM_TRACE("%s: ring:{HEAD:%04x, TAIL:%04x}\n",
+		  engine->name, ring->head, ring->tail);
+
 	intel_uncore_forcewake_get(engine->uncore, FORCEWAKE_ALL);
 
 	if (!stop_ring(engine)) {
@@ -827,12 +830,9 @@ static int intel_rcs_ctx_init(struct i915_request *rq)
 	return 0;
 }
 
-static int init_render_ring(struct intel_engine_cs *engine)
+static int rcs_resume(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
-	int ret = init_ring_common(engine);
-	if (ret)
-		return ret;
 
 	/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
 	if (IS_GEN_RANGE(dev_priv, 4, 6))
@@ -875,7 +875,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
 	if (INTEL_GEN(dev_priv) >= 6)
 		ENGINE_WRITE(engine, RING_IMR, ~engine->irq_keep_mask);
 
-	return 0;
+	return xcs_resume(engine);
 }
 
 static void cancel_requests(struct intel_engine_cs *engine)
@@ -2207,7 +2207,7 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 
 	intel_ring_init_irq(dev_priv, engine);
 
-	engine->init_hw = init_ring_common;
+	engine->resume = xcs_resume;
 	engine->reset.prepare = reset_prepare;
 	engine->reset.reset = reset_ring;
 	engine->reset.finish = reset_finish;
@@ -2269,7 +2269,7 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
 	if (IS_HASWELL(dev_priv))
 		engine->emit_bb_start = hsw_emit_bb_start;
 
-	engine->init_hw = init_render_ring;
+	engine->resume = rcs_resume;
 
 	ret = intel_init_ring_buffer(engine);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 2e84ffb46489..d9f68c89dff4 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -24,6 +24,7 @@
 
 #include "i915_drv.h"
 #include "intel_context.h"
+#include "intel_engine_pm.h"
 
 #include "mock_engine.h"
 #include "selftests/mock_request.h"
@@ -268,6 +269,8 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	i915_timeline_set_subclass(&engine->base.timeline, TIMELINE_ENGINE);
 
 	intel_engine_init_breadcrumbs(&engine->base);
+	intel_engine_init_execlists(&engine->base);
+	intel_engine_init__pm(&engine->base);
 
 	/* fake hw queue */
 	spin_lock_init(&engine->hw_lock);
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 87c26920212f..6004d6907e9c 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -24,6 +24,8 @@
 
 #include <linux/kthread.h>
 
+#include "intel_engine_pm.h"
+
 #include "i915_selftest.h"
 #include "selftests/i915_random.h"
 #include "selftests/igt_flush_test.h"
@@ -479,19 +481,6 @@ static int igt_reset_nop(void *arg)
 			break;
 		}
 
-		if (!i915_reset_flush(i915)) {
-			struct drm_printer p =
-				drm_info_printer(i915->drm.dev);
-
-			pr_err("%s failed to idle after reset\n",
-			       engine->name);
-			intel_engine_dump(engine, &p,
-					  "%s\n", engine->name);
-
-			err = -EIO;
-			break;
-		}
-
 		err = igt_flush_test(i915, 0);
 		if (err)
 			break;
@@ -594,19 +583,6 @@ static int igt_reset_nop_engine(void *arg)
 				err = -EINVAL;
 				break;
 			}
-
-			if (!i915_reset_flush(i915)) {
-				struct drm_printer p =
-					drm_info_printer(i915->drm.dev);
-
-				pr_err("%s failed to idle after reset\n",
-				       engine->name);
-				intel_engine_dump(engine, &p,
-						  "%s\n", engine->name);
-
-				err = -EIO;
-				break;
-			}
 		} while (time_before(jiffies, end_time));
 		clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
 		pr_info("%s(%s): %d resets\n", __func__, engine->name, count);
@@ -669,6 +645,7 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 		reset_engine_count = i915_reset_engine_count(&i915->gpu_error,
 							     engine);
 
+		intel_engine_pm_get(engine);
 		set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
 		do {
 			if (active) {
@@ -721,21 +698,9 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 				err = -EINVAL;
 				break;
 			}
-
-			if (!i915_reset_flush(i915)) {
-				struct drm_printer p =
-					drm_info_printer(i915->drm.dev);
-
-				pr_err("%s failed to idle after reset\n",
-				       engine->name);
-				intel_engine_dump(engine, &p,
-						  "%s\n", engine->name);
-
-				err = -EIO;
-				break;
-			}
 		} while (time_before(jiffies, end_time));
 		clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
+		intel_engine_pm_put(engine);
 
 		if (err)
 			break;
@@ -942,6 +907,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 			get_task_struct(tsk);
 		}
 
+		intel_engine_pm_get(engine);
 		set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
 		do {
 			struct i915_request *rq = NULL;
@@ -1018,6 +984,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 			}
 		} while (time_before(jiffies, end_time));
 		clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
+		intel_engine_pm_put(engine);
 		pr_info("i915_reset_engine(%s:%s): %lu resets\n",
 			engine->name, test_name, count);
 
@@ -1069,7 +1036,9 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 		if (err)
 			break;
 
-		err = igt_flush_test(i915, 0);
+		mutex_lock(&i915->drm.struct_mutex);
+		err = igt_flush_test(i915, I915_WAIT_LOCKED);
+		mutex_unlock(&i915->drm.struct_mutex);
 		if (err)
 			break;
 	}
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 9b2fd24ea950..028942cac4b4 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -71,7 +71,6 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 {
 	const u32 base = engine->mmio_base;
 	struct drm_i915_gem_object *result;
-	intel_wakeref_t wakeref;
 	struct i915_request *rq;
 	struct i915_vma *vma;
 	u32 srm, *cs;
@@ -103,9 +102,7 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 	if (err)
 		goto err_obj;
 
-	rq = ERR_PTR(-ENODEV);
-	with_intel_runtime_pm(engine->i915, wakeref)
-		rq = i915_request_alloc(engine, ctx);
+	rq = i915_request_alloc(engine, ctx);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_pin;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e3eca88614b2..8ac8a67e7b81 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2041,8 +2041,7 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 	}
 
 	seq_printf(m, "RPS enabled? %d\n", rps->enabled);
-	seq_printf(m, "GPU busy? %s [%d requests]\n",
-		   yesno(dev_priv->gt.awake), dev_priv->gt.active_requests);
+	seq_printf(m, "GPU busy? %s\n", yesno(dev_priv->gt.awake));
 	seq_printf(m, "Boosts outstanding? %d\n",
 		   atomic_read(&rps->num_waiters));
 	seq_printf(m, "Interactive? %d\n", READ_ONCE(rps->power.interactive));
@@ -2061,9 +2060,7 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 
 	seq_printf(m, "Wait boosts: %d\n", atomic_read(&rps->boosts));
 
-	if (INTEL_GEN(dev_priv) >= 6 &&
-	    rps->enabled &&
-	    dev_priv->gt.active_requests) {
+	if (INTEL_GEN(dev_priv) >= 6 && rps->enabled && dev_priv->gt.awake) {
 		u32 rpup, rpupei;
 		u32 rpdown, rpdownei;
 
@@ -3092,9 +3089,9 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 
 	wakeref = intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "GT awake? %s\n", yesno(dev_priv->gt.awake));
-	seq_printf(m, "Global active requests: %d\n",
-		   dev_priv->gt.active_requests);
+	seq_printf(m, "GT awake? %s [%d]\n",
+		   yesno(dev_priv->gt.awake),
+		   atomic_read(&dev_priv->gt.wakeref.count));
 	seq_printf(m, "CS timestamp frequency: %u kHz\n",
 		   RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz);
 
@@ -3940,8 +3937,7 @@ i915_drop_caches_set(void *data, u64 val)
 
 	if (val & DROP_IDLE) {
 		do {
-			if (READ_ONCE(i915->gt.active_requests))
-				flush_delayed_work(&i915->gem.retire_work);
+			flush_delayed_work(&i915->gem.retire_work);
 			drain_delayed_work(&i915->gem.idle_work);
 		} while (READ_ONCE(i915->gt.awake));
 	}
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 98b997526daa..c8cb70d4fe91 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -47,8 +47,9 @@
 #include <drm/drm_probe_helper.h>
 #include <drm/i915_drm.h>
 
-#include "gt/intel_workarounds.h"
+#include "gt/intel_gt_pm.h"
 #include "gt/intel_reset.h"
+#include "gt/intel_workarounds.h"
 
 #include "i915_drv.h"
 #include "i915_pmu.h"
@@ -2323,7 +2324,7 @@ static int i915_drm_resume_early(struct drm_device *dev)
 
 	intel_power_domains_resume(dev_priv);
 
-	intel_engines_sanitize(dev_priv, true);
+	intel_gt_sanitize(dev_priv, true);
 
 	enable_rpm_wakeref_asserts(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cbae9be052e0..9bfe9bb9449c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2006,10 +2006,10 @@ struct drm_i915_private {
 			struct list_head hwsp_free_list;
 		} timelines;
 
-		intel_engine_mask_t active_engines;
 		struct list_head active_rings;
 		struct list_head closed_vma;
-		u32 active_requests;
+
+		struct intel_wakeref_count wakeref;
 
 		/**
 		 * Is the GPU currently considered idle, or busy executing
@@ -2020,12 +2020,16 @@ struct drm_i915_private {
 		 */
 		intel_wakeref_t awake;
 
+		struct blocking_notifier_head pm_notifications;
+
 		ktime_t last_init_time;
 
 		struct i915_vma *scratch;
 	} gt;
 
 	struct {
+		struct notifier_block pm_notifier;
+
 		/**
 		 * We leave the user IRQ off as much as possible,
 		 * but this means that requests will finish and never
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c56633e87c32..a6436c77109a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -39,6 +39,8 @@
 #include <linux/dma-buf.h>
 #include <linux/mman.h>
 
+#include "gt/intel_engine_pm.h"
+#include "gt/intel_gt_pm.h"
 #include "gt/intel_mocs.h"
 #include "gt/intel_reset.h"
 #include "gt/intel_workarounds.h"
@@ -2911,9 +2913,6 @@ wait_for_timelines(struct drm_i915_private *i915,
 	struct i915_gt_timelines *gt = &i915->gt.timelines;
 	struct i915_timeline *tl;
 
-	if (!READ_ONCE(i915->gt.active_requests))
-		return timeout;
-
 	mutex_lock(&gt->mutex);
 	list_for_each_entry(tl, &gt->active_list, link) {
 		struct i915_request *rq;
@@ -2953,9 +2952,10 @@ wait_for_timelines(struct drm_i915_private *i915,
 int i915_gem_wait_for_idle(struct drm_i915_private *i915,
 			   unsigned int flags, long timeout)
 {
-	GEM_TRACE("flags=%x (%s), timeout=%ld%s\n",
+	GEM_TRACE("flags=%x (%s), timeout=%ld%s, awake?=%s\n",
 		  flags, flags & I915_WAIT_LOCKED ? "locked" : "unlocked",
-		  timeout, timeout == MAX_SCHEDULE_TIMEOUT ? " (forever)" : "");
+		  timeout, timeout == MAX_SCHEDULE_TIMEOUT ? " (forever)" : "",
+		  yesno(i915->gt.awake));
 
 	/* If the device is asleep, we have no requests outstanding */
 	if (!READ_ONCE(i915->gt.awake))
@@ -4177,7 +4177,7 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
 	 * it may impact the display and we are uncertain about the stability
 	 * of the reset, so this could be applied to even earlier gen.
 	 */
-	intel_engines_sanitize(i915, false);
+	intel_gt_sanitize(i915, false);
 
 	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
 	intel_runtime_pm_put(i915, wakeref);
@@ -4235,27 +4235,6 @@ static void init_unused_rings(struct drm_i915_private *dev_priv)
 	}
 }
 
-static int __i915_gem_restart_engines(void *data)
-{
-	struct drm_i915_private *i915 = data;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err;
-
-	for_each_engine(engine, i915, id) {
-		err = engine->init_hw(engine);
-		if (err) {
-			DRM_ERROR("Failed to restart %s (%d)\n",
-				  engine->name, err);
-			return err;
-		}
-	}
-
-	intel_engines_set_scheduler_caps(i915);
-
-	return 0;
-}
-
 int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 {
 	int ret;
@@ -4314,12 +4293,13 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 	intel_mocs_init_l3cc_table(dev_priv);
 
 	/* Only when the HW is re-initialised, can we replay the requests */
-	ret = __i915_gem_restart_engines(dev_priv);
+	ret = intel_engines_resume(dev_priv);
 	if (ret)
 		goto cleanup_uc;
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 
+	intel_engines_set_scheduler_caps(dev_priv);
 	return 0;
 
 cleanup_uc:
@@ -4604,6 +4584,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 err_init_hw:
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
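+	/* Wedge first so that the suspend below does not stall on idling. */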
+	i915_gem_set_wedged(dev_priv);
 	i915_gem_suspend(dev_priv);
 	i915_gem_suspend_late(dev_priv);
 
@@ -4665,6 +4646,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 
 void i915_gem_fini(struct drm_i915_private *dev_priv)
 {
+	GEM_BUG_ON(dev_priv->gt.awake);
+
 	i915_gem_suspend_late(dev_priv);
 	intel_disable_gt_powersave(dev_priv);
 
@@ -4759,6 +4742,8 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 {
 	int err;
 
+	intel_gt_pm_init(dev_priv);
+
 	INIT_LIST_HEAD(&dev_priv->gt.active_rings);
 	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index 9074eb1e843f..67f8a4a807a0 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -75,9 +75,6 @@ struct drm_i915_private;
 
 #define I915_GEM_IDLE_TIMEOUT (HZ / 5)
 
-void i915_gem_park(struct drm_i915_private *i915);
-void i915_gem_unpark(struct drm_i915_private *i915);
-
 static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
 {
 	if (!atomic_fetch_inc(&t->count))
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index de3404479bc3..51c427ac0c21 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -824,26 +824,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
-static struct i915_request *
-last_request_on_engine(struct i915_timeline *timeline,
-		       struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-
-	GEM_BUG_ON(timeline == &engine->timeline);
-
-	rq = i915_active_request_raw(&timeline->last_request,
-				     &engine->i915->drm.struct_mutex);
-	if (rq && rq->engine->mask & engine->mask) {
-		GEM_TRACE("last request on engine %s: %llx:%llu\n",
-			  engine->name, rq->fence.context, rq->fence.seqno);
-		GEM_BUG_ON(rq->timeline != timeline);
-		return rq;
-	}
-
-	return NULL;
-}
-
 struct context_barrier_task {
 	struct i915_active base;
 	void (*task)(void *data);
@@ -871,7 +851,6 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	struct drm_i915_private *i915 = ctx->i915;
 	struct context_barrier_task *cb;
 	struct intel_context *ce, *next;
-	intel_wakeref_t wakeref;
 	int err = 0;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -884,7 +863,6 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	i915_active_init(i915, &cb->base, cb_retire);
 	i915_active_acquire(&cb->base);
 
-	wakeref = intel_runtime_pm_get(i915);
 	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
 		struct intel_engine_cs *engine = ce->engine;
 		struct i915_request *rq;
@@ -914,7 +892,6 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 		if (err)
 			break;
 	}
-	intel_runtime_pm_put(i915, wakeref);
 
 	cb->task = err ? NULL : task; /* caller needs to unwind instead */
 	cb->data = data;
@@ -924,54 +901,6 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	return err;
 }
 
-int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
-				      intel_engine_mask_t mask)
-{
-	struct intel_engine_cs *engine;
-
-	GEM_TRACE("awake?=%s\n", yesno(i915->gt.awake));
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(!i915->kernel_context);
-
-	/* Inoperable, so presume the GPU is safely pointing into the void! */
-	if (i915_terminally_wedged(i915))
-		return 0;
-
-	for_each_engine_masked(engine, i915, mask, mask) {
-		struct intel_ring *ring;
-		struct i915_request *rq;
-
-		rq = i915_request_create(engine->kernel_context);
-		if (IS_ERR(rq))
-			return PTR_ERR(rq);
-
-		/* Queue this switch after all other activity */
-		list_for_each_entry(ring, &i915->gt.active_rings, active_link) {
-			struct i915_request *prev;
-
-			prev = last_request_on_engine(ring->timeline, engine);
-			if (!prev)
-				continue;
-
-			if (prev->gem_context == i915->kernel_context)
-				continue;
-
-			GEM_TRACE("add barrier on %s for %llx:%lld\n",
-				  engine->name,
-				  prev->fence.context,
-				  prev->fence.seqno);
-			i915_sw_fence_await_sw_fence_gfp(&rq->submit,
-							 &prev->submit,
-							 I915_FENCE_GFP);
-		}
-
-		i915_request_add(rq);
-	}
-
-	return 0;
-}
-
 static int get_ppgtt(struct drm_i915_file_private *file_priv,
 		     struct i915_gem_context *ctx,
 		     struct drm_i915_gem_context_param *args)
@@ -1169,9 +1098,7 @@ static int gen8_emit_rpcs_config(struct i915_request *rq,
 static int
 gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 {
-	struct drm_i915_private *i915 = ce->engine->i915;
 	struct i915_request *rq;
-	intel_wakeref_t wakeref;
 	int ret;
 
 	lockdep_assert_held(&ce->pin_mutex);
@@ -1185,14 +1112,9 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 	if (!intel_context_is_pinned(ce))
 		return 0;
 
-	/* Submitting requests etc needs the hw awake. */
-	wakeref = intel_runtime_pm_get(i915);
-
 	rq = i915_request_create(ce->engine->kernel_context);
-	if (IS_ERR(rq)) {
-		ret = PTR_ERR(rq);
-		goto out_put;
-	}
+	if (IS_ERR(rq))
+		return PTR_ERR(rq);
 
 	/* Queue this switch after all other activity by this context. */
 	ret = i915_active_request_set(&ce->ring->timeline->last_request, rq);
@@ -1216,9 +1138,6 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 
 out_add:
 	i915_request_add(rq);
-out_put:
-	intel_runtime_pm_put(i915, wakeref);
-
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index cec278ab04e2..5a8e080499fb 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -141,10 +141,6 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 			  struct drm_file *file);
 void i915_gem_context_close(struct drm_file *file);
 
-int i915_switch_context(struct i915_request *rq);
-int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
-				      intel_engine_mask_t engine_mask);
-
 void i915_gem_context_release(struct kref *ctx_ref);
 struct i915_gem_context *
 i915_gem_context_create_gvt(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 060f5903544a..0bdb3e072ba5 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -36,15 +36,8 @@ I915_SELFTEST_DECLARE(static struct igt_evict_ctl {
 	bool fail_if_busy:1;
 } igt_evict_ctl;)
 
-static bool ggtt_is_idle(struct drm_i915_private *i915)
-{
-	return !i915->gt.active_requests;
-}
-
 static int ggtt_flush(struct drm_i915_private *i915)
 {
-	int err;
-
 	/*
 	 * Not everything in the GGTT is tracked via vma (otherwise we
 	 * could evict as required with minimal stalling) so we are forced
@@ -52,19 +45,10 @@ static int ggtt_flush(struct drm_i915_private *i915)
 	 * the hopes that we can then remove contexts and the like only
 	 * bound by their active reference.
 	 */
-	err = i915_gem_switch_to_kernel_context(i915, i915->gt.active_engines);
-	if (err)
-		return err;
-
-	err = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_INTERRUPTIBLE |
-				     I915_WAIT_LOCKED,
-				     MAX_SCHEDULE_TIMEOUT);
-	if (err)
-		return err;
-
-	GEM_BUG_ON(!ggtt_is_idle(i915));
-	return 0;
+	return i915_gem_wait_for_idle(i915,
+				      I915_WAIT_INTERRUPTIBLE |
+				      I915_WAIT_LOCKED,
+				      MAX_SCHEDULE_TIMEOUT);
 }
 
 static bool
@@ -222,24 +206,17 @@ i915_gem_evict_something(struct i915_address_space *vm,
 	 * us a termination condition, when the last retired context is
 	 * the kernel's, there is nothing more we can evict.
 	 */
-	if (!ggtt_is_idle(dev_priv)) {
-		if (I915_SELFTEST_ONLY(igt_evict_ctl.fail_if_busy))
-			return -EBUSY;
+	if (I915_SELFTEST_ONLY(igt_evict_ctl.fail_if_busy))
+		return -EBUSY;
 
-		ret = ggtt_flush(dev_priv);
-		if (ret)
-			return ret;
+	ret = ggtt_flush(dev_priv);
+	if (ret)
+		return ret;
 
-		cond_resched();
-		goto search_again;
-	}
+	cond_resched();
 
-	/*
-	 * If we still have pending pageflip completions, drop
-	 * back to userspace to give our workqueues time to
-	 * acquire our locks and unpin the old scanouts.
-	 */
-	return intel_has_pending_fb_unpin(dev_priv) ? -EAGAIN : -ENOSPC;
+	flags |= PIN_NONBLOCK;
+	goto search_again;
 
 found:
 	/* drm_mm doesn't allow any other operations while
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
index bc0663cac7b6..ba81358505ba 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/i915_gem_pm.c
@@ -4,136 +4,63 @@
  * Copyright © 2019 Intel Corporation
  */
 
+#include "gt/intel_gt_pm.h"
+
 #include "i915_drv.h"
 #include "i915_gem_pm.h"
 #include "i915_globals.h"
-#include "intel_pm.h"
 
-static void __i915_gem_park(struct drm_i915_private *i915)
+static void i915_gem_park(struct drm_i915_private *i915)
 {
-	intel_wakeref_t wakeref;
-
-	GEM_TRACE("\n");
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
-	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
-
-	if (!i915->gt.awake)
-		return;
-
-	/*
-	 * Be paranoid and flush a concurrent interrupt to make sure
-	 * we don't reactivate any irq tasklets after parking.
-	 *
-	 * FIXME: Note that even though we have waited for execlists to be idle,
-	 * there may still be an in-flight interrupt even though the CSB
-	 * is now empty. synchronize_irq() makes sure that a residual interrupt
-	 * is completed before we continue, but it doesn't prevent the HW from
-	 * raising a spurious interrupt later. To complete the shield we should
-	 * coordinate disabling the CS irq with flushing the interrupts.
-	 */
-	synchronize_irq(i915->drm.irq);
-
-	intel_engines_park(i915);
-	i915_timelines_park(i915);
-
-	i915_pmu_gt_parked(i915);
-	i915_vma_parked(i915);
-
-	wakeref = fetch_and_zero(&i915->gt.awake);
-	GEM_BUG_ON(!wakeref);
-
-	if (INTEL_GEN(i915) >= 6)
-		gen6_rps_idle(i915);
-
-	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
-
-	i915_globals_park();
-}
-
-static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
-					  unsigned long mask)
-{
-	bool result = true;
 
-	/*
-	 * Even if we fail to switch, give whatever is running a small chance
-	 * to save itself before we report the failure. Yes, this may be a
-	 * false positive due to e.g. ENOMEM, caveat emptor!
-	 */
-	if (i915_gem_switch_to_kernel_context(i915, mask))
-		result = false;
-
-	if (i915_gem_wait_for_idle(i915,
-				   I915_WAIT_LOCKED |
-				   I915_WAIT_FOR_IDLE_BOOST,
-				   I915_GEM_IDLE_TIMEOUT))
-		result = false;
+	for_each_engine(engine, i915, id) {
+		/*
+		 * We are now committed to parking the engines: make sure
+		 * there will be no more interrupts arriving later and the
+		 * engines are truly idle.
+		 */
+		if (wait_for(intel_engine_is_idle(engine), 10)) {
+			struct drm_printer p = drm_debug_printer(__func__);
 
-	if (!result) {
-		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
 			dev_err(i915->drm.dev,
-				"Failed to idle engines, declaring wedged!\n");
-			GEM_TRACE_DUMP();
+				"%s is not idle before parking\n",
+				engine->name);
+			intel_engine_dump(engine, &p, NULL);
 		}
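+		/* Flush the residual irq tasklets before sleeping. */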
+		tasklet_kill(&engine->execlists.tasklet);
 
-		/* Forcibly cancel outstanding work and leave the gpu quiet. */
-		i915_gem_set_wedged(i915);
+		i915_gem_batch_pool_fini(&engine->batch_pool);
 	}
 
-	i915_retire_requests(i915); /* ensure we flush after wedging */
-	return result;
+	i915_timelines_park(i915);
+	i915_vma_parked(i915);
+
+	i915_globals_park();
 }
 
 static void idle_work_handler(struct work_struct *work)
 {
 	struct drm_i915_private *i915 =
 		container_of(work, typeof(*i915), gem.idle_work.work);
-	bool rearm_hangcheck;
-
-	if (!READ_ONCE(i915->gt.awake))
-		return;
-
-	if (READ_ONCE(i915->gt.active_requests))
-		return;
-
-	rearm_hangcheck =
-		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
 
 	if (!mutex_trylock(&i915->drm.struct_mutex)) {
 		/* Currently busy, come back later */
 		mod_delayed_work(i915->wq,
 				 &i915->gem.idle_work,
 				 msecs_to_jiffies(50));
-		goto out_rearm;
+		return;
 	}
 
-	/*
-	 * Flush out the last user context, leaving only the pinned
-	 * kernel context resident. Should anything unfortunate happen
-	 * while we are idle (such as the GPU being power cycled), no users
-	 * will be harmed.
-	 */
-	if (!work_pending(&i915->gem.idle_work.work) &&
-	    !i915->gt.active_requests) {
-		++i915->gt.active_requests; /* don't requeue idle */
-
-		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
-
-		if (!--i915->gt.active_requests) {
-			__i915_gem_park(i915);
-			rearm_hangcheck = false;
-		}
-	}
+	intel_wakeref_lock(&i915->gt.wakeref);
+	if (!intel_wakeref_active(&i915->gt.wakeref))
+		i915_gem_park(i915);
+	intel_wakeref_unlock(&i915->gt.wakeref);
 
 	mutex_unlock(&i915->drm.struct_mutex);
-
-out_rearm:
-	if (rearm_hangcheck) {
-		GEM_BUG_ON(!i915->gt.awake);
-		i915_queue_hangcheck(i915);
-	}
 }
 
 static void retire_work_handler(struct work_struct *work)
@@ -147,88 +74,69 @@ static void retire_work_handler(struct work_struct *work)
 		mutex_unlock(&i915->drm.struct_mutex);
 	}
 
-	/*
-	 * Keep the retire handler running until we are finally idle.
-	 * We do not need to do this test under locking as in the worst-case
-	 * we queue the retire worker once too often.
-	 */
-	if (READ_ONCE(i915->gt.awake))
+	if (intel_wakeref_active(&i915->gt.wakeref))
 		queue_delayed_work(i915->wq,
 				   &i915->gem.retire_work,
 				   round_jiffies_up_relative(HZ));
 }
 
-void i915_gem_park(struct drm_i915_private *i915)
+static int pm_notifier(struct notifier_block *nb,
+		       unsigned long action,
+		       void *data)
 {
-	GEM_TRACE("\n");
+	struct drm_i915_private *i915 =
+		container_of(nb, typeof(*i915), gem.pm_notifier);
 
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(i915->gt.active_requests);
+	switch (action) {
+	case INTEL_GT_UNPARK:
+		i915_globals_unpark();
+		queue_delayed_work(i915->wq,
+				   &i915->gem.retire_work,
+				   round_jiffies_up_relative(HZ));
+		break;
 
-	if (!i915->gt.awake)
-		return;
+	case INTEL_GT_PARK:
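+		/* Defer the actual parking to avoid busy/idle ping-pongs. */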
+		mod_delayed_work(i915->wq,
+				 &i915->gem.idle_work,
+				 msecs_to_jiffies(100));
+		break;
+	}
 
-	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
-	mod_delayed_work(i915->wq, &i915->gem.idle_work, msecs_to_jiffies(100));
+	return NOTIFY_OK;
 }
 
-void i915_gem_unpark(struct drm_i915_private *i915)
+static bool switch_to_kernel_context_sync(struct drm_i915_private *i915)
 {
-	GEM_TRACE("\n");
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(!i915->gt.active_requests);
-	assert_rpm_wakelock_held(i915);
-
-	if (i915->gt.awake)
-		return;
-
-	/*
-	 * It seems that the DMC likes to transition between the DC states a lot
-	 * when there are no connected displays (no active power domains) during
-	 * command submission.
-	 *
-	 * This activity has negative impact on the performance of the chip with
-	 * huge latencies observed in the interrupt handler and elsewhere.
-	 *
-	 * Work around it by grabbing a GT IRQ power domain whilst there is any
-	 * GT activity, preventing any DC state transitions.
-	 */
-	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
-	GEM_BUG_ON(!i915->gt.awake);
-
-	i915_globals_unpark();
-
-	intel_enable_gt_powersave(i915);
-	i915_update_gfx_val(i915);
-	if (INTEL_GEN(i915) >= 6)
-		gen6_rps_busy(i915);
-	i915_pmu_gt_unparked(i915);
-
-	intel_engines_unpark(i915);
+	bool result = true;
 
-	i915_queue_hangcheck(i915);
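+	/* Wait for idle, then retire; repeat while any ring remains busy. */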
+	do {
+		if (i915_gem_wait_for_idle(i915,
+					   I915_WAIT_LOCKED |
+					   I915_WAIT_FOR_IDLE_BOOST,
+					   I915_GEM_IDLE_TIMEOUT) == -ETIME) {
+			/* XXX hide warning from gem_eio */
+			if (i915_modparams.reset) {
+				dev_err(i915->drm.dev,
+					"Failed to idle engines, declaring wedged!\n");
+				GEM_TRACE_DUMP();
+			}
+
+			/*
+			 * Forcibly cancel outstanding work and leave
+			 * the gpu quiet.
+			 */
+			i915_gem_set_wedged(i915);
+			result = false;
+		}
+	} while (i915_retire_requests(i915) && result);
 
-	queue_delayed_work(i915->wq,
-			   &i915->gem.retire_work,
-			   round_jiffies_up_relative(HZ));
+	GEM_BUG_ON(i915->gt.awake);
+	return result;
 }
 
 bool i915_gem_load_power_context(struct drm_i915_private *i915)
 {
-	/* Force loading the kernel context on all engines */
-	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
-		return false;
-
-	/*
-	 * Immediately park the GPU so that we enable powersaving and
-	 * treat it as idle. The next time we issue a request, we will
-	 * unpark and start using the engine->pinned_default_state, otherwise
-	 * it is in limbo and an early reset may fail.
-	 */
-	__i915_gem_park(i915);
-
-	return true;
+	return switch_to_kernel_context_sync(i915);
 }
 
 void i915_gem_suspend(struct drm_i915_private *i915)
@@ -252,10 +160,16 @@ void i915_gem_suspend(struct drm_i915_private *i915)
 	 * state. Fortunately, the kernel_context is disposable and we do
 	 * not rely on its state.
 	 */
-	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
+	switch_to_kernel_context_sync(i915);
 
 	mutex_unlock(&i915->drm.struct_mutex);
-	i915_reset_flush(i915);
+
+	/*
+	 * Assert that we successfully flushed all the work and
+	 * reset the GPU back to its idle, low power state.
+	 */
+	GEM_BUG_ON(i915->gt.awake);
+	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
 
 	drain_delayed_work(&i915->gem.retire_work);
 
@@ -265,12 +179,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
 	 */
 	drain_delayed_work(&i915->gem.idle_work);
 
-	/*
-	 * Assert that we successfully flushed all the work and
-	 * reset the GPU back to its idle, low power state.
-	 */
-	GEM_BUG_ON(i915->gt.awake);
-
 	intel_uc_suspend(i915);
 
 	intel_runtime_pm_put(i915, wakeref);
@@ -362,4 +270,8 @@ void i915_gem_init__pm(struct drm_i915_private *i915)
 {
 	INIT_DELAYED_WORK(&i915->gem.idle_work, idle_work_handler);
 	INIT_DELAYED_WORK(&i915->gem.retire_work, retire_work_handler);
+
+	i915->gem.pm_notifier.notifier_call = pm_notifier;
+	blocking_notifier_chain_register(&i915->gt.pm_notifications,
+					 &i915->gem.pm_notifier);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
index 52f65e3f06b5..6f7d5d11ac3b 100644
--- a/drivers/gpu/drm/i915/i915_gem_pm.h
+++ b/drivers/gpu/drm/i915/i915_gem_pm.h
@@ -17,9 +17,6 @@ void i915_gem_init__pm(struct drm_i915_private *i915);
 bool i915_gem_load_power_context(struct drm_i915_private *i915);
 void i915_gem_resume(struct drm_i915_private *i915);
 
-void i915_gem_unpark(struct drm_i915_private *i915);
-void i915_gem_park(struct drm_i915_private *i915);
-
 void i915_gem_idle_work_handler(struct work_struct *work);
 
 void i915_gem_suspend(struct drm_i915_private *i915);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index b419d0f59275..2ecd0c6a1c94 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -179,8 +179,6 @@ struct i915_gpu_state {
 	struct scatterlist *sgl, *fit;
 };
 
-struct i915_gpu_restart;
-
 struct i915_gpu_error {
 	/* For hangcheck timer */
 #define DRM_I915_HANGCHECK_PERIOD 1500 /* in ms */
@@ -241,8 +239,6 @@ struct i915_gpu_error {
 	wait_queue_head_t reset_queue;
 
 	struct srcu_struct reset_backoff_srcu;
-
-	struct i915_gpu_restart *restart;
 };
 
 struct drm_i915_error_state_buf {
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ca4c200ee7ec..7b973b3c6ecf 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -429,6 +429,8 @@ void __i915_request_submit(struct i915_request *request)
 	/* Transfer from per-context onto the global per-engine timeline */
 	move_to_timeline(request, &engine->timeline);
 
+	engine->serial++;
+
 	trace_i915_request_execute(request);
 }
 
@@ -1107,7 +1109,6 @@ struct i915_request *__i915_request_commit(struct i915_request *rq)
 	list_add_tail(&rq->ring_link, &ring->request_list);
 	if (list_is_first(&rq->ring_link, &ring->request_list))
 		list_add(&ring->active_link, &rq->i915->gt.active_rings);
-	rq->i915->gt.active_engines |= rq->engine->mask;
 	rq->emitted_jiffies = jiffies;
 
 	/*
@@ -1376,21 +1377,20 @@ long i915_request_wait(struct i915_request *rq,
 	return timeout;
 }
 
-void i915_retire_requests(struct drm_i915_private *i915)
+bool i915_retire_requests(struct drm_i915_private *i915)
 {
 	struct intel_ring *ring, *tmp;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	if (!i915->gt.active_requests)
-		return;
-
 	list_for_each_entry_safe(ring, tmp,
 				 &i915->gt.active_rings, active_link) {
 		intel_ring_get(ring); /* last rq holds reference! */
 		ring_retire_requests(ring);
 		intel_ring_put(ring);
 	}
+
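+	/* Report whether any ring still holds unretired requests. */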
+	return !list_empty(&i915->gt.active_rings);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 9a50072171f8..c0b7539daaf5 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -424,6 +424,6 @@ static inline void i915_request_mark_complete(struct i915_request *rq)
 	rq->hwsp_seqno = (u32 *)&rq->fence.seqno; /* decouple from HWSP */
 }
 
-void i915_retire_requests(struct drm_i915_private *i915);
+bool i915_retire_requests(struct drm_i915_private *i915);
 
 #endif /* I915_REQUEST_H */
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index 50bb7bbd26d3..7460b77b34e0 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -16,26 +16,18 @@ static int switch_to_context(struct drm_i915_private *i915,
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-	int err = 0;
-
-	wakeref = intel_runtime_pm_get(i915);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
 
 		rq = i915_request_alloc(engine, ctx);
-		if (IS_ERR(rq)) {
-			err = PTR_ERR(rq);
-			break;
-		}
+		if (IS_ERR(rq))
+			return PTR_ERR(rq);
 
 		i915_request_add(rq);
 	}
 
-	intel_runtime_pm_put(i915, wakeref);
-
-	return err;
+	return 0;
 }
 
 static void trash_stolen(struct drm_i915_private *i915)
@@ -120,7 +112,7 @@ static void pm_resume(struct drm_i915_private *i915)
 	 * that runtime-pm just works.
 	 */
 	with_intel_runtime_pm(i915, wakeref) {
-		intel_engines_sanitize(i915, false);
+		intel_gt_sanitize(i915, false);
 		i915_gem_sanitize(i915);
 		i915_gem_resume(i915);
 	}
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 9d646fa1b74e..71d896bbade2 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1608,113 +1608,6 @@ __engine_name(struct drm_i915_private *i915, intel_engine_mask_t engines)
 	return "none";
 }
 
-static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
-					  struct i915_gem_context *ctx,
-					  intel_engine_mask_t engines)
-{
-	struct intel_engine_cs *engine;
-	intel_engine_mask_t tmp;
-	int pass;
-
-	GEM_TRACE("Testing %s\n", __engine_name(i915, engines));
-	for (pass = 0; pass < 4; pass++) { /* Once busy; once idle; repeat */
-		bool from_idle = pass & 1;
-		int err;
-
-		if (!from_idle) {
-			for_each_engine_masked(engine, i915, engines, tmp) {
-				struct i915_request *rq;
-
-				rq = i915_request_alloc(engine, ctx);
-				if (IS_ERR(rq))
-					return PTR_ERR(rq);
-
-				i915_request_add(rq);
-			}
-		}
-
-		err = i915_gem_switch_to_kernel_context(i915,
-							i915->gt.active_engines);
-		if (err)
-			return err;
-
-		if (!from_idle) {
-			err = i915_gem_wait_for_idle(i915,
-						     I915_WAIT_LOCKED,
-						     MAX_SCHEDULE_TIMEOUT);
-			if (err)
-				return err;
-		}
-
-		if (i915->gt.active_requests) {
-			pr_err("%d active requests remain after switching to kernel context, pass %d (%s) on %s engine%s\n",
-			       i915->gt.active_requests,
-			       pass, from_idle ? "idle" : "busy",
-			       __engine_name(i915, engines),
-			       is_power_of_2(engines) ? "" : "s");
-			return -EINVAL;
-		}
-
-		/* XXX Bonus points for proving we are the kernel context! */
-
-		mutex_unlock(&i915->drm.struct_mutex);
-		drain_delayed_work(&i915->gem.idle_work);
-		mutex_lock(&i915->drm.struct_mutex);
-	}
-
-	if (igt_flush_test(i915, I915_WAIT_LOCKED))
-		return -EIO;
-
-	return 0;
-}
-
-static int igt_switch_to_kernel_context(void *arg)
-{
-	struct drm_i915_private *i915 = arg;
-	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
-	enum intel_engine_id id;
-	intel_wakeref_t wakeref;
-	int err;
-
-	/*
-	 * A core premise of switching to the kernel context is that
-	 * if an engine is already idling in the kernel context, we
-	 * do not emit another request and wake it up. The other being
-	 * that we do indeed end up idling in the kernel context.
-	 */
-
-	mutex_lock(&i915->drm.struct_mutex);
-	wakeref = intel_runtime_pm_get(i915);
-
-	ctx = kernel_context(i915);
-	if (IS_ERR(ctx)) {
-		mutex_unlock(&i915->drm.struct_mutex);
-		return PTR_ERR(ctx);
-	}
-
-	/* First check idling each individual engine */
-	for_each_engine(engine, i915, id) {
-		err = __igt_switch_to_kernel_context(i915, ctx, BIT(id));
-		if (err)
-			goto out_unlock;
-	}
-
-	/* Now en masse */
-	err = __igt_switch_to_kernel_context(i915, ctx, ALL_ENGINES);
-	if (err)
-		goto out_unlock;
-
-out_unlock:
-	GEM_TRACE_DUMP_ON(err);
-
-	intel_runtime_pm_put(i915, wakeref);
-	mutex_unlock(&i915->drm.struct_mutex);
-
-	kernel_context_close(ctx);
-	return err;
-}
-
 static void mock_barrier_task(void *data)
 {
 	unsigned int *counter = data;
@@ -1729,7 +1622,6 @@ static int mock_context_barrier(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
-	intel_wakeref_t wakeref;
 	unsigned int counter;
 	int err;
 
@@ -1772,9 +1664,7 @@ static int mock_context_barrier(void *arg)
 		goto out;
 	}
 
-	rq = ERR_PTR(-ENODEV);
-	with_intel_runtime_pm(i915, wakeref)
-		rq = i915_request_alloc(i915->engine[RCS0], ctx);
+	rq = i915_request_alloc(i915->engine[RCS0], ctx);
 	if (IS_ERR(rq)) {
 		pr_err("Request allocation failed!\n");
 		goto out;
@@ -1824,7 +1714,6 @@ static int mock_context_barrier(void *arg)
 int i915_gem_context_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_switch_to_kernel_context),
 		SUBTEST(mock_context_barrier),
 	};
 	struct drm_i915_private *i915;
@@ -1843,7 +1732,6 @@ int i915_gem_context_mock_selftests(void)
 int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
 {
 	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_switch_to_kernel_context),
 		SUBTEST(live_nop_switch),
 		SUBTEST(igt_ctx_exec),
 		SUBTEST(igt_ctx_readonly),
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 12203d665a4e..088b2aa05dcd 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -24,6 +24,7 @@
 
 #include "../i915_selftest.h"
 
+#include "igt_flush_test.h"
 #include "mock_gem_device.h"
 #include "huge_gem_object.h"
 
@@ -505,19 +506,23 @@ static void disable_retire_worker(struct drm_i915_private *i915)
 {
 	i915_gem_shrinker_unregister(i915);
 
-	mutex_lock(&i915->drm.struct_mutex);
-	if (!i915->gt.active_requests++) {
-		intel_wakeref_t wakeref;
-
-		with_intel_runtime_pm(i915, wakeref)
-			i915_gem_unpark(i915);
-	}
-	mutex_unlock(&i915->drm.struct_mutex);
+	intel_gt_pm_get(i915);
 
 	cancel_delayed_work_sync(&i915->gem.retire_work);
 	cancel_delayed_work_sync(&i915->gem.idle_work);
 }
 
+static void restore_retire_worker(struct drm_i915_private *i915)
+{
+	intel_gt_pm_put(i915);
+
+	mutex_lock(&i915->drm.struct_mutex);
+	igt_flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	i915_gem_shrinker_register(i915);
+}
+
 static int igt_mmap_offset_exhaustion(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -615,13 +620,7 @@ static int igt_mmap_offset_exhaustion(void *arg)
 out:
 	drm_mm_remove_node(&resv);
 out_park:
-	mutex_lock(&i915->drm.struct_mutex);
-	if (--i915->gt.active_requests)
-		queue_delayed_work(i915->wq, &i915->gem.retire_work, 0);
-	else
-		queue_delayed_work(i915->wq, &i915->gem.idle_work, 0);
-	mutex_unlock(&i915->drm.struct_mutex);
-	i915_gem_shrinker_register(i915);
+	restore_retire_worker(i915);
 	return err;
 err_obj:
 	i915_gem_object_put(obj);
diff --git a/drivers/gpu/drm/i915/selftests/igt_flush_test.c b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
index 94aee4071a66..e42f3c58536a 100644
--- a/drivers/gpu/drm/i915/selftests/igt_flush_test.c
+++ b/drivers/gpu/drm/i915/selftests/igt_flush_test.c
@@ -11,23 +11,29 @@
 
 int igt_flush_test(struct drm_i915_private *i915, unsigned int flags)
 {
+	int ret = i915_terminally_wedged(i915) ? -EIO : 0;
+	int repeat = !!(flags & I915_WAIT_LOCKED);
+
 	cond_resched();
 
-	if (flags & I915_WAIT_LOCKED &&
-	    i915_gem_switch_to_kernel_context(i915, i915->gt.active_engines)) {
-		pr_err("Failed to switch back to kernel context; declaring wedged\n");
-		i915_gem_set_wedged(i915);
-	}
+	do {
+		if (i915_gem_wait_for_idle(i915, flags, HZ / 5) == -ETIME) {
+			pr_err("%pS timed out, cancelling all further testing.\n",
+			       __builtin_return_address(0));
 
-	if (i915_gem_wait_for_idle(i915, flags, HZ / 5) == -ETIME) {
-		pr_err("%pS timed out, cancelling all further testing.\n",
-		       __builtin_return_address(0));
+			GEM_TRACE("%pS timed out.\n",
+				  __builtin_return_address(0));
+			GEM_TRACE_DUMP();
 
-		GEM_TRACE("%pS timed out.\n", __builtin_return_address(0));
-		GEM_TRACE_DUMP();
+			i915_gem_set_wedged(i915);
+			repeat = 0;
+			ret = -EIO;
+		}
 
-		i915_gem_set_wedged(i915);
-	}
+		/* Ensure we also flush after wedging. */
+		if (flags & I915_WAIT_LOCKED)
+			i915_retire_requests(i915);
+	} while (repeat--);
 
-	return i915_terminally_wedged(i915);
+	return ret;
 }
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index fb677b4019a0..c072424c6b7c 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -41,11 +41,10 @@ void mock_device_flush(struct drm_i915_private *i915)
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	for_each_engine(engine, i915, id)
-		mock_engine_flush(engine);
-
-	i915_retire_requests(i915);
-	GEM_BUG_ON(i915->gt.active_requests);
+	do {
+		for_each_engine(engine, i915, id)
+			mock_engine_flush(engine);
+	} while (i915_retire_requests(i915));
 }
 
 static void mock_device_release(struct drm_device *dev)
@@ -110,10 +109,6 @@ static void mock_retire_work_handler(struct work_struct *work)
 
 static void mock_idle_work_handler(struct work_struct *work)
 {
-	struct drm_i915_private *i915 =
-		container_of(work, typeof(*i915), gem.idle_work.work);
-
-	i915->gt.active_engines = 0;
 }
 
 static int pm_domain_resume(struct device *dev)
@@ -185,6 +180,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 	mock_uncore_init(&i915->uncore);
 	i915_gem_init__mm(i915);
+	intel_gt_pm_init(i915);
+	atomic_inc(&i915->gt.wakeref.count); /* disable; no hw support */
 
 	init_waitqueue_head(&i915->gpu_error.wait_queue);
 	init_waitqueue_head(&i915->gpu_error.reset_queue);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 13/29] drm/i915/gvt: Pin the per-engine GVT shadow contexts
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (10 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 12/29] drm/i915: Invert the GEM wakeref hierarchy Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08  9:17 ` [PATCH 14/29] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Our eventual goal is to rid request construction of struct_mutex, with
the short term step of lifting the struct_mutex requirements into the
higher levels (i.e. the caller must ensure that the context is already
pinned into the GTT). In this patch, we pin GVT's shadow contexts upon
allocation and so keep them pinned into the GGTT for as long as the
virtual machine is alive. We can then use the simpler request
construction path, safe in the knowledge that the hard work is already
done.
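
In outline, the new setup path becomes (a simplified sketch of the
intel_vgpu_setup_submission() loop below, with the error unwinding
elided):

	for_each_engine(engine, vgpu->gvt->dev_priv, i) {
		struct intel_context *ce;

		/* pinned here, held until intel_vgpu_clean_submission() */
		ce = intel_context_pin(ctx, engine);
		if (IS_ERR(ce))
			goto out_shadow_ctx;

		s->shadow[i] = ce;
	}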

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gvt/gvt.h          |   2 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c        |   2 +-
 drivers/gpu/drm/i915/gvt/mmio_context.c |   3 +-
 drivers/gpu/drm/i915/gvt/scheduler.c    | 137 ++++++++++++------------
 4 files changed, 73 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 7a4e1a6387e5..f45e5cd57ec4 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -152,9 +152,9 @@ struct intel_vgpu_submission_ops {
 struct intel_vgpu_submission {
 	struct intel_vgpu_execlist execlist[I915_NUM_ENGINES];
 	struct list_head workload_q_head[I915_NUM_ENGINES];
+	struct intel_context *shadow[I915_NUM_ENGINES];
 	struct kmem_cache *workloads;
 	atomic_t running_workload_num;
-	struct i915_gem_context *shadow_ctx;
 	union {
 		u64 i915_context_pml4;
 		u64 i915_context_pdps[GEN8_3LVL_PDPES];
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index d5fcc447d22f..6fd635e3d29e 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1576,7 +1576,7 @@ hw_id_show(struct device *dev, struct device_attribute *attr,
 		struct intel_vgpu *vgpu = (struct intel_vgpu *)
 			mdev_get_drvdata(mdev);
 		return sprintf(buf, "%u\n",
-			       vgpu->submission.shadow_ctx->hw_id);
+			       vgpu->submission.shadow[0]->gem_context->hw_id);
 	}
 	return sprintf(buf, "\n");
 }
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index 76630fbe51b6..7a64a0314ac1 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -495,8 +495,7 @@ static void switch_mmio(struct intel_vgpu *pre,
 			 * itself.
 			 */
 			if (mmio->in_context &&
-			    !is_inhibit_context(intel_context_lookup(s->shadow_ctx,
-								     dev_priv->engine[ring_id])))
+			    !is_inhibit_context(s->shadow[ring_id]))
 				continue;
 
 			if (mmio->mask)
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 8998fa5ab198..40d9f549a0cd 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -36,6 +36,7 @@
 #include <linux/kthread.h>
 
 #include "i915_drv.h"
+#include "i915_gem_pm.h"
 #include "gvt.h"
 
 #define RING_CTX_OFF(x) \
@@ -277,18 +278,23 @@ static int shadow_context_status_change(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
-static void shadow_context_descriptor_update(struct intel_context *ce)
+static void
+shadow_context_descriptor_update(struct intel_context *ce,
+				 struct intel_vgpu_workload *workload)
 {
-	u64 desc = 0;
-
-	desc = ce->lrc_desc;
+	u64 desc = ce->lrc_desc;
 
-	/* Update bits 0-11 of the context descriptor which includes flags
+	/*
+	 * Update bits 0-11 of the context descriptor which includes flags
 	 * like GEN8_CTX_* cached in desc_template
 	 */
 	desc &= U64_MAX << 12;
 	desc |= ce->gem_context->desc_template & ((1ULL << 12) - 1);
 
+	desc &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
+	desc |= workload->ctx_desc.addressing_mode <<
+		GEN8_CTX_ADDRESSING_MODE_SHIFT;
+
 	ce->lrc_desc = desc;
 }
 
@@ -365,26 +371,22 @@ intel_gvt_workload_req_alloc(struct intel_vgpu_workload *workload)
 {
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	struct i915_gem_context *shadow_ctx = s->shadow_ctx;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
-	struct intel_engine_cs *engine = dev_priv->engine[workload->ring_id];
 	struct i915_request *rq;
-	int ret = 0;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
 	if (workload->req)
-		goto out;
+		return 0;
 
-	rq = i915_request_alloc(engine, shadow_ctx);
+	rq = i915_request_create(s->shadow[workload->ring_id]);
 	if (IS_ERR(rq)) {
 		gvt_vgpu_err("fail to allocate gem request\n");
-		ret = PTR_ERR(rq);
-		goto out;
+		return PTR_ERR(rq);
 	}
+
 	workload->req = i915_request_get(rq);
-out:
-	return ret;
+	return 0;
 }
 
 /**
@@ -399,10 +401,7 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 {
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	struct i915_gem_context *shadow_ctx = s->shadow_ctx;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
-	struct intel_engine_cs *engine = dev_priv->engine[workload->ring_id];
-	struct intel_context *ce;
 	int ret;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
@@ -410,29 +409,13 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 	if (workload->shadow)
 		return 0;
 
-	/* pin shadow context by gvt even the shadow context will be pinned
-	 * when i915 alloc request. That is because gvt will update the guest
-	 * context from shadow context when workload is completed, and at that
-	 * moment, i915 may already unpined the shadow context to make the
-	 * shadow_ctx pages invalid. So gvt need to pin itself. After update
-	 * the guest context, gvt can unpin the shadow_ctx safely.
-	 */
-	ce = intel_context_pin(shadow_ctx, engine);
-	if (IS_ERR(ce)) {
-		gvt_vgpu_err("fail to pin shadow context\n");
-		return PTR_ERR(ce);
-	}
-
-	shadow_ctx->desc_template &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
-	shadow_ctx->desc_template |= workload->ctx_desc.addressing_mode <<
-				    GEN8_CTX_ADDRESSING_MODE_SHIFT;
-
 	if (!test_and_set_bit(workload->ring_id, s->shadow_ctx_desc_updated))
-		shadow_context_descriptor_update(ce);
+		shadow_context_descriptor_update(s->shadow[workload->ring_id],
+						 workload);
 
 	ret = intel_gvt_scan_and_shadow_ringbuffer(workload);
 	if (ret)
-		goto err_unpin;
+		return ret;
 
 	if (workload->ring_id == RCS0 && workload->wa_ctx.indirect_ctx.size) {
 		ret = intel_gvt_scan_and_shadow_wa_ctx(&workload->wa_ctx);
@@ -444,8 +427,6 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 	return 0;
 err_shadow:
 	release_shadow_wa_ctx(&workload->wa_ctx);
-err_unpin:
-	intel_context_unpin(ce);
 	return ret;
 }
 
@@ -672,7 +653,6 @@ static int dispatch_workload(struct intel_vgpu_workload *workload)
 	struct intel_vgpu *vgpu = workload->vgpu;
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	struct i915_gem_context *shadow_ctx = s->shadow_ctx;
 	struct i915_request *rq;
 	int ring_id = workload->ring_id;
 	int ret;
@@ -683,7 +663,8 @@ static int dispatch_workload(struct intel_vgpu_workload *workload)
 	mutex_lock(&vgpu->vgpu_lock);
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
-	ret = set_context_ppgtt_from_shadow(workload, shadow_ctx);
+	ret = set_context_ppgtt_from_shadow(workload,
+					    s->shadow[ring_id]->gem_context);
 	if (ret < 0) {
 		gvt_vgpu_err("workload shadow ppgtt isn't ready\n");
 		goto err_req;
@@ -994,8 +975,6 @@ static int workload_thread(void *priv)
 				workload->ring_id, workload,
 				workload->vgpu->id);
 
-		intel_runtime_pm_get(gvt->dev_priv);
-
 		gvt_dbg_sched("ring id %d will dispatch workload %p\n",
 				workload->ring_id, workload);
 
@@ -1025,7 +1004,6 @@ static int workload_thread(void *priv)
 			intel_uncore_forcewake_put(&gvt->dev_priv->uncore,
 					FORCEWAKE_ALL);
 
-		intel_runtime_pm_put_unchecked(gvt->dev_priv);
 		if (ret && (vgpu_is_vm_unhealthy(ret)))
 			enter_failsafe_mode(vgpu, GVT_FAILSAFE_GUEST_ERR);
 	}
@@ -1108,17 +1086,17 @@ int intel_gvt_init_workload_scheduler(struct intel_gvt *gvt)
 }
 
 static void
-i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s)
+i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s,
+				struct i915_hw_ppgtt *ppgtt)
 {
-	struct i915_hw_ppgtt *i915_ppgtt = s->shadow_ctx->ppgtt;
 	int i;
 
-	if (i915_vm_is_4lvl(&i915_ppgtt->vm)) {
-		px_dma(&i915_ppgtt->pml4) = s->i915_context_pml4;
+	if (i915_vm_is_4lvl(&ppgtt->vm)) {
+		px_dma(&ppgtt->pml4) = s->i915_context_pml4;
 	} else {
 		for (i = 0; i < GEN8_3LVL_PDPES; i++)
-			px_dma(i915_ppgtt->pdp.page_directory[i]) =
-						s->i915_context_pdps[i];
+			px_dma(ppgtt->pdp.page_directory[i]) =
+				s->i915_context_pdps[i];
 	}
 }
 
@@ -1132,10 +1110,15 @@ i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s)
 void intel_vgpu_clean_submission(struct intel_vgpu *vgpu)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 
 	intel_vgpu_select_submission_ops(vgpu, ALL_ENGINES, 0);
-	i915_context_ppgtt_root_restore(s);
-	i915_gem_context_put(s->shadow_ctx);
+
+	i915_context_ppgtt_root_restore(s, s->shadow[0]->gem_context->ppgtt);
+	for_each_engine(engine, vgpu->gvt->dev_priv, id)
+		intel_context_unpin(s->shadow[id]);
+
 	kmem_cache_destroy(s->workloads);
 }
 
@@ -1161,17 +1144,17 @@ void intel_vgpu_reset_submission(struct intel_vgpu *vgpu,
 }
 
 static void
-i915_context_ppgtt_root_save(struct intel_vgpu_submission *s)
+i915_context_ppgtt_root_save(struct intel_vgpu_submission *s,
+			     struct i915_hw_ppgtt *ppgtt)
 {
-	struct i915_hw_ppgtt *i915_ppgtt = s->shadow_ctx->ppgtt;
 	int i;
 
-	if (i915_vm_is_4lvl(&i915_ppgtt->vm))
-		s->i915_context_pml4 = px_dma(&i915_ppgtt->pml4);
-	else {
+	if (i915_vm_is_4lvl(&ppgtt->vm)) {
+		s->i915_context_pml4 = px_dma(&ppgtt->pml4);
+	} else {
 		for (i = 0; i < GEN8_3LVL_PDPES; i++)
 			s->i915_context_pdps[i] =
-				px_dma(i915_ppgtt->pdp.page_directory[i]);
+				px_dma(ppgtt->pdp.page_directory[i]);
 	}
 }
 
@@ -1188,16 +1171,31 @@ i915_context_ppgtt_root_save(struct intel_vgpu_submission *s)
 int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
-	enum intel_engine_id i;
 	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	enum intel_engine_id i;
 	int ret;
 
-	s->shadow_ctx = i915_gem_context_create_gvt(
-			&vgpu->gvt->dev_priv->drm);
-	if (IS_ERR(s->shadow_ctx))
-		return PTR_ERR(s->shadow_ctx);
+	ctx = i915_gem_context_create_gvt(&vgpu->gvt->dev_priv->drm);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	i915_context_ppgtt_root_save(s, ctx->ppgtt);
+
+	for_each_engine(engine, vgpu->gvt->dev_priv, i) {
+		struct intel_context *ce;
+
+		INIT_LIST_HEAD(&s->workload_q_head[i]);
+		s->shadow[i] = ERR_PTR(-EINVAL);
 
-	i915_context_ppgtt_root_save(s);
+		ce = intel_context_pin(ctx, engine);
+		if (IS_ERR(ce)) {
+			ret = PTR_ERR(ce);
+			goto out_shadow_ctx;
+		}
+
+		s->shadow[i] = ce;
+	}
 
 	bitmap_zero(s->shadow_ctx_desc_updated, I915_NUM_ENGINES);
 
@@ -1213,16 +1211,21 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		goto out_shadow_ctx;
 	}
 
-	for_each_engine(engine, vgpu->gvt->dev_priv, i)
-		INIT_LIST_HEAD(&s->workload_q_head[i]);
-
 	atomic_set(&s->running_workload_num, 0);
 	bitmap_zero(s->tlb_handle_pending, I915_NUM_ENGINES);
 
+	i915_gem_context_put(ctx);
 	return 0;
 
 out_shadow_ctx:
-	i915_gem_context_put(s->shadow_ctx);
+	i915_context_ppgtt_root_restore(s, ctx->ppgtt);
+	for_each_engine(engine, vgpu->gvt->dev_priv, i) {
+		if (IS_ERR(s->shadow[i]))
+			break;
+
+		intel_context_unpin(s->shadow[i]);
+	}
+	i915_gem_context_put(ctx);
 	return ret;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 14/29] drm/i915: Explicitly pin the logical context for execbuf
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (11 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 13/29] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 15:17   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 15/29] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
                   ` (18 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

In order to separate the reservation phase of building a request from
its emission phase, we need to pull some of the request allocation
activities from deep inside i915_request up to the surface, into
GEM_EXECBUFFER.
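
Schematically, the reordered flow pins the context before reserving any
other resources and only then emits the request (a sketch of
i915_gem_do_execbuffer() after this patch; locking and error paths
elided):

	intel_gt_pm_get(eb.i915);

	err = eb_select_engine(&eb, file, args); /* pins eb.context */
	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */

	/* ... reservation: relocations, batch setup ... */

	eb.request = i915_request_create(eb.context); /* emission */

	eb_unpin_context(&eb);
	intel_gt_pm_put(eb.i915);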

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 110 +++++++++++++--------
 drivers/gpu/drm/i915/i915_request.c        |   9 --
 2 files changed, 70 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3d672c9edb94..35732c2ae17f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -34,8 +34,11 @@
 #include <drm/drm_syncobj.h>
 #include <drm/i915_drm.h>
 
+#include "gt/intel_gt_pm.h"
+
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
+#include "i915_gem_pm.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
@@ -236,7 +239,8 @@ struct i915_execbuffer {
 	unsigned int *flags;
 
 	struct intel_engine_cs *engine; /** engine to queue the request to */
-	struct i915_gem_context *ctx; /** context for building the request */
+	struct intel_context *context; /* logical state for the request */
+	struct i915_gem_context *gem_context; /** caller's context */
 	struct i915_address_space *vm; /** GTT and vma for the request */
 
 	struct i915_request *request; /** our request to build */
@@ -738,7 +742,7 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	if (unlikely(!ctx))
 		return -ENOENT;
 
-	eb->ctx = ctx;
+	eb->gem_context = ctx;
 	if (ctx->ppgtt) {
 		eb->vm = &ctx->ppgtt->vm;
 		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
@@ -761,7 +765,7 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
 	 * Completely unscientific finger-in-the-air estimates for suitable
 	 * maximum user request size (to avoid blocking) and then backoff.
 	 */
-	if (intel_ring_update_space(ring) >= PAGE_SIZE)
+	if (!ring || intel_ring_update_space(ring) >= PAGE_SIZE)
 		return NULL;
 
 	/*
@@ -784,7 +788,6 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
 
 static int eb_wait_for_ring(const struct i915_execbuffer *eb)
 {
-	const struct intel_context *ce;
 	struct i915_request *rq;
 	int ret = 0;
 
@@ -794,11 +797,7 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
 	 * keeping all of their resources pinned.
 	 */
 
-	ce = intel_context_lookup(eb->ctx, eb->engine);
-	if (!ce || !ce->ring) /* first use, assume empty! */
-		return 0;
-
-	rq = __eb_wait_for_ring(ce->ring);
+	rq = __eb_wait_for_ring(eb->context->ring);
 	if (rq) {
 		mutex_unlock(&eb->i915->drm.struct_mutex);
 
@@ -817,15 +816,15 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
 
 static int eb_lookup_vmas(struct i915_execbuffer *eb)
 {
-	struct radix_tree_root *handles_vma = &eb->ctx->handles_vma;
+	struct radix_tree_root *handles_vma = &eb->gem_context->handles_vma;
 	struct drm_i915_gem_object *obj;
 	unsigned int i, batch;
 	int err;
 
-	if (unlikely(i915_gem_context_is_closed(eb->ctx)))
+	if (unlikely(i915_gem_context_is_closed(eb->gem_context)))
 		return -ENOENT;
 
-	if (unlikely(i915_gem_context_is_banned(eb->ctx)))
+	if (unlikely(i915_gem_context_is_banned(eb->gem_context)))
 		return -EIO;
 
 	INIT_LIST_HEAD(&eb->relocs);
@@ -870,8 +869,8 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
 		if (!vma->open_count++)
 			i915_vma_reopen(vma);
 		list_add(&lut->obj_link, &obj->lut_list);
-		list_add(&lut->ctx_link, &eb->ctx->handles_list);
-		lut->ctx = eb->ctx;
+		list_add(&lut->ctx_link, &eb->gem_context->handles_list);
+		lut->ctx = eb->gem_context;
 		lut->handle = handle;
 
 add_vma:
@@ -1227,7 +1226,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	if (err)
 		goto err_unmap;
 
-	rq = i915_request_alloc(eb->engine, eb->ctx);
+	rq = i915_request_create(eb->context);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -2088,8 +2087,41 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static struct intel_engine_cs *
-eb_select_engine(struct drm_i915_private *dev_priv,
+static int eb_pin_context(struct i915_execbuffer *eb,
+			  struct intel_engine_cs *engine)
+{
+	struct intel_context *ce;
+	int err;
+
+	/*
+	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
+	 * EIO if the GPU is already wedged.
+	 */
+	err = i915_terminally_wedged(eb->i915);
+	if (err)
+		return err;
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	ce = intel_context_pin(eb->gem_context, engine);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	eb->engine = engine;
+	eb->context = ce;
+	return 0;
+}
+
+static void eb_unpin_context(struct i915_execbuffer *eb)
+{
+	intel_context_unpin(eb->context);
+}
+
+static int
+eb_select_engine(struct i915_execbuffer *eb,
 		 struct drm_file *file,
 		 struct drm_i915_gem_execbuffer2 *args)
 {
@@ -2098,21 +2130,21 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 
 	if (user_ring_id > I915_USER_RINGS) {
 		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
-		return NULL;
+		return -EINVAL;
 	}
 
 	if ((user_ring_id != I915_EXEC_BSD) &&
 	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
 		DRM_DEBUG("execbuf with non bsd ring but with invalid "
 			  "bsd dispatch flags: %d\n", (int)(args->flags));
-		return NULL;
+		return -EINVAL;
 	}
 
-	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(dev_priv, VCS1)) {
+	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
 		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
 
 		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
-			bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
+			bsd_idx = gen8_dispatch_bsd_engine(eb->i915, file);
 		} else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
 			   bsd_idx <= I915_EXEC_BSD_RING2) {
 			bsd_idx >>= I915_EXEC_BSD_SHIFT;
@@ -2120,20 +2152,20 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 		} else {
 			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
 				  bsd_idx);
-			return NULL;
+			return -EINVAL;
 		}
 
-		engine = dev_priv->engine[_VCS(bsd_idx)];
+		engine = eb->i915->engine[_VCS(bsd_idx)];
 	} else {
-		engine = dev_priv->engine[user_ring_map[user_ring_id]];
+		engine = eb->i915->engine[user_ring_map[user_ring_id]];
 	}
 
 	if (!engine) {
 		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
-		return NULL;
+		return -EINVAL;
 	}
 
-	return engine;
+	return eb_pin_context(eb, engine);
 }
 
 static void
@@ -2275,7 +2307,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
 	struct sync_file *out_fence = NULL;
-	intel_wakeref_t wakeref;
 	int out_fence_fd = -1;
 	int err;
 
@@ -2335,12 +2366,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_destroy;
 
-	eb.engine = eb_select_engine(eb.i915, file, args);
-	if (!eb.engine) {
-		err = -EINVAL;
-		goto err_engine;
-	}
-
 	/*
 	 * Take a local wakeref for preparing to dispatch the execbuf as
 	 * we expect to access the hardware fairly frequently in the
@@ -2348,16 +2373,20 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 * wakeref that we hold until the GPU has been idle for at least
 	 * 100ms.
 	 */
-	wakeref = intel_runtime_pm_get(eb.i915);
+	intel_gt_pm_get(eb.i915);
 
 	err = i915_mutex_lock_interruptible(dev);
 	if (err)
 		goto err_rpm;
 
-	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
+	err = eb_select_engine(&eb, file, args);
 	if (unlikely(err))
 		goto err_unlock;
 
+	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
+	if (unlikely(err))
+		goto err_engine;
+
 	err = eb_relocate(&eb);
 	if (err) {
 		/*
@@ -2441,7 +2470,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	GEM_BUG_ON(eb.reloc_cache.rq);
 
 	/* Allocate a request for this batch buffer nice and early. */
-	eb.request = i915_request_alloc(eb.engine, eb.ctx);
+	eb.request = i915_request_create(eb.context);
 	if (IS_ERR(eb.request)) {
 		err = PTR_ERR(eb.request);
 		goto err_batch_unpin;
@@ -2479,8 +2508,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb);
 err_request:
-	i915_request_add(eb.request);
 	add_to_client(eb.request, file);
+	i915_request_add(eb.request);
 
 	if (fences)
 		signal_fence_array(&eb, fences);
@@ -2502,12 +2531,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
+err_engine:
+	eb_unpin_context(&eb);
 err_unlock:
 	mutex_unlock(&dev->struct_mutex);
 err_rpm:
-	intel_runtime_pm_put(eb.i915, wakeref);
-err_engine:
-	i915_gem_context_put(eb.ctx);
+	intel_gt_pm_put(eb.i915);
+	i915_gem_context_put(eb.gem_context);
 err_destroy:
 	eb_destroy(&eb);
 err_out_fence:
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7b973b3c6ecf..7932d1209247 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -753,7 +753,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_context *ce;
 	struct i915_request *rq;
-	int ret;
 
 	/*
 	 * Preempt contexts are reserved for exclusive use to inject a
@@ -762,14 +761,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	GEM_BUG_ON(ctx == i915->preempt_context);
 
-	/*
-	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged.
-	 */
-	ret = i915_terminally_wedged(i915);
-	if (ret)
-		return ERR_PTR(ret);
-
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 15/29] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (12 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 14/29] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 14:57   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 16/29] drm/i915: Export intel_context_instance() Chris Wilson
                   ` (17 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Circumvent the dance we currently perform to find the preempt_client
and look up its HW context for this engine, as we know we have already
pinned the preempt_context on the engine.
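
For reference, engine->preempt_context is the intel_context pinned for
the engine at init time, roughly as below (a sketch of the setup
assumed here, not shown in this patch):

	err = pin_context(i915->preempt_context, engine,
			  &engine->preempt_context);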

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 4361d052a571..7fbfcb3d63e0 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -568,7 +568,7 @@ static void inject_preempt_context(struct work_struct *work)
 					     preempt_work[engine->id]);
 	struct intel_guc_client *client = guc->preempt_client;
 	struct guc_stage_desc *stage_desc = __get_stage_desc(client);
-	struct intel_context *ce = intel_context_lookup(client->owner, engine);
+	struct intel_context *ce = engine->preempt_context;
 	u32 data[7];
 
 	if (!ce->ring->emit) { /* recreate upon load/resume */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 16/29] drm/i915: Export intel_context_instance()
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (13 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 15/29] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 12:06   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 17/29] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
                   ` (16 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

We want to pass an intel_context into intel_context_pin(), and that
requires us first to be able to look up the intel_context!
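
The resulting caller pattern is lookup, pin, then drop the lookup
reference (a representative sketch, matching the converted
pin_context() below):

	ce = intel_context_instance(ctx, engine); /* returns a new reference */
	if (IS_ERR(ce))
		return PTR_ERR(ce);

	err = intel_context_pin(ce);
	intel_context_put(ce); /* the pin keeps ce alive from here on */
	if (err)
		return err;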

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c    | 37 +++++++++++-----------
 drivers/gpu/drm/i915/gt/intel_context.h    | 19 +++++++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c  |  8 ++++-
 drivers/gpu/drm/i915/gt/mock_engine.c      |  8 ++++-
 drivers/gpu/drm/i915/gvt/scheduler.c       |  7 +++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++--
 drivers/gpu/drm/i915/i915_perf.c           | 21 ++++++++----
 drivers/gpu/drm/i915/i915_request.c        | 11 ++++++-
 8 files changed, 83 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 6ae6a3f58364..a1267739e369 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -104,7 +104,7 @@ void __intel_context_remove(struct intel_context *ce)
 	spin_unlock(&ctx->hw_contexts_lock);
 }
 
-static struct intel_context *
+struct intel_context *
 intel_context_instance(struct i915_gem_context *ctx,
 		       struct intel_engine_cs *engine)
 {
@@ -112,7 +112,7 @@ intel_context_instance(struct i915_gem_context *ctx,
 
 	ce = intel_context_lookup(ctx, engine);
 	if (likely(ce))
-		return ce;
+		return intel_context_get(ce);
 
 	ce = intel_context_alloc();
 	if (!ce)
@@ -125,7 +125,7 @@ intel_context_instance(struct i915_gem_context *ctx,
 		intel_context_free(ce);
 
 	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
-	return pos;
+	return intel_context_get(pos);
 }
 
 struct intel_context *
@@ -139,30 +139,30 @@ intel_context_pin_lock(struct i915_gem_context *ctx,
 	if (IS_ERR(ce))
 		return ce;
 
-	if (mutex_lock_interruptible(&ce->pin_mutex))
+	if (mutex_lock_interruptible(&ce->pin_mutex)) {
+		intel_context_put(ce);
 		return ERR_PTR(-EINTR);
+	}
 
 	return ce;
 }
 
-struct intel_context *
-intel_context_pin(struct i915_gem_context *ctx,
-		  struct intel_engine_cs *engine)
+void intel_context_pin_unlock(struct intel_context *ce)
+	__releases(ce->pin_mutex)
 {
-	struct intel_context *ce;
-	int err;
-
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
+	mutex_unlock(&ce->pin_mutex);
+	intel_context_put(ce);
+}
 
-	if (likely(atomic_inc_not_zero(&ce->pin_count)))
-		return ce;
+int __intel_context_do_pin(struct intel_context *ce)
+{
+	int err;
 
 	if (mutex_lock_interruptible(&ce->pin_mutex))
-		return ERR_PTR(-EINTR);
+		return -EINTR;
 
 	if (likely(!atomic_read(&ce->pin_count))) {
+		struct i915_gem_context *ctx = ce->gem_context;
 		intel_wakeref_t wakeref;
 
 		err = 0;
@@ -172,7 +172,6 @@ intel_context_pin(struct i915_gem_context *ctx,
 			goto err;
 
 		i915_gem_context_get(ctx);
-		GEM_BUG_ON(ce->gem_context != ctx);
 
 		mutex_lock(&ctx->mutex);
 		list_add(&ce->active_link, &ctx->active_engines);
@@ -186,11 +185,11 @@ intel_context_pin(struct i915_gem_context *ctx,
 	GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
 
 	mutex_unlock(&ce->pin_mutex);
-	return ce;
+	return 0;
 
 err:
 	mutex_unlock(&ce->pin_mutex);
-	return ERR_PTR(err);
+	return err;
 }
 
 void intel_context_unpin(struct intel_context *ce)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 9aeef88176b9..da342e9a8c2e 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -49,11 +49,7 @@ intel_context_is_pinned(struct intel_context *ce)
 	return atomic_read(&ce->pin_count);
 }
 
-static inline void intel_context_pin_unlock(struct intel_context *ce)
-__releases(ce->pin_mutex)
-{
-	mutex_unlock(&ce->pin_mutex);
-}
+void intel_context_pin_unlock(struct intel_context *ce);
 
 struct intel_context *
 __intel_context_insert(struct i915_gem_context *ctx,
@@ -63,7 +59,18 @@ void
 __intel_context_remove(struct intel_context *ce);
 
 struct intel_context *
-intel_context_pin(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
+intel_context_instance(struct i915_gem_context *ctx,
+		       struct intel_engine_cs *engine);
+
+int __intel_context_do_pin(struct intel_context *ce);
+
+static inline int intel_context_pin(struct intel_context *ce)
+{
+	if (likely(atomic_inc_not_zero(&ce->pin_count)))
+		return 0;
+
+	return __intel_context_do_pin(ce);
+}
 
 static inline void __intel_context_pin(struct intel_context *ce)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d378b276e476..f6828c0276eb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -697,11 +697,17 @@ static int pin_context(struct i915_gem_context *ctx,
 		       struct intel_context **out)
 {
 	struct intel_context *ce;
+	int err;
 
-	ce = intel_context_pin(ctx, engine);
+	ce = intel_context_instance(ctx, engine);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	err = intel_context_pin(ce);
+	intel_context_put(ce);
+	if (err)
+		return err;
+
 	*out = ce;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index d9f68c89dff4..a79d9909d171 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -239,6 +239,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 				    int id)
 {
 	struct mock_engine *engine;
+	int err;
 
 	GEM_BUG_ON(id >= I915_NUM_ENGINES);
 
@@ -278,10 +279,15 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&engine->hw_queue);
 
 	engine->base.kernel_context =
-		intel_context_pin(i915->kernel_context, &engine->base);
+		intel_context_instance(i915->kernel_context, &engine->base);
 	if (IS_ERR(engine->base.kernel_context))
 		goto err_breadcrumbs;
 
+	err = intel_context_pin(engine->base.kernel_context);
+	intel_context_put(engine->base.kernel_context);
+	if (err)
+		goto err_breadcrumbs;
+
 	return &engine->base;
 
 err_breadcrumbs:
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 40d9f549a0cd..606fc2713240 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -1188,12 +1188,17 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		INIT_LIST_HEAD(&s->workload_q_head[i]);
 		s->shadow[i] = ERR_PTR(-EINVAL);
 
-		ce = intel_context_pin(ctx, engine);
+		ce = intel_context_instance(ctx, engine);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_shadow_ctx;
 		}
 
+		ret = intel_context_pin(ce);
+		intel_context_put(ce);
+		if (ret)
+			goto out_shadow_ctx;
+
 		s->shadow[i] = ce;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 35732c2ae17f..c700cbc2f594 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2101,14 +2101,19 @@ static int eb_pin_context(struct i915_execbuffer *eb,
 	if (err)
 		return err;
 
+	ce = intel_context_instance(eb->gem_context, engine);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_pin(eb->gem_context, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+	err = intel_context_pin(ce);
+	intel_context_put(ce);
+	if (err)
+		return err;
 
 	eb->engine = engine;
 	eb->context = ce;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 328a740e72cb..afaeabe5e531 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1205,11 +1205,17 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 {
 	struct intel_engine_cs *engine = i915->engine[RCS0];
 	struct intel_context *ce;
-	int ret;
+	int err;
 
-	ret = i915_mutex_lock_interruptible(&i915->drm);
-	if (ret)
-		return ERR_PTR(ret);
+	ce = intel_context_instance(ctx, engine);
+	if (IS_ERR(ce))
+		return ce;
+
+	err = i915_mutex_lock_interruptible(&i915->drm);
+	if (err) {
+		intel_context_put(ce);
+		return ERR_PTR(err);
+	}
 
 	/*
 	 * As the ID is the gtt offset of the context's vma we
@@ -1217,10 +1223,11 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 	 *
 	 * NB: implied RCS engine...
 	 */
-	ce = intel_context_pin(ctx, engine);
+	err = intel_context_pin(ce);
 	mutex_unlock(&i915->drm.struct_mutex);
-	if (IS_ERR(ce))
-		return ce;
+	intel_context_put(ce);
+	if (err)
+		return ERR_PTR(err);
 
 	i915->perf.oa.pinned_ctx = ce;
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7932d1209247..8abd891d9287 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -753,6 +753,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_context *ce;
 	struct i915_request *rq;
+	int err;
 
 	/*
 	 * Preempt contexts are reserved for exclusive use to inject a
@@ -766,13 +767,21 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_pin(ctx, engine);
+	ce = intel_context_instance(ctx, engine);
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
+	err = intel_context_pin(ce);
+	if (err) {
+		rq = ERR_PTR(err);
+		goto err_put;
+	}
+
 	rq = i915_request_create(ce);
 	intel_context_unpin(ce);
 
+err_put:
+	intel_context_put(ce);
 	return rq;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 17/29] drm/i915/selftests: Use the real kernel context for sseu isolation tests
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (14 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 16/29] drm/i915: Export intel_context_instance() Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08 15:00   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 18/29] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
                   ` (15 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Simplify the setup slightly for the sseu selftests to use the actual
kernel_context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/selftests/i915_gem_context.c   | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 71d896bbade2..807644ae6877 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -957,7 +957,6 @@ __sseu_finish(struct drm_i915_private *i915,
 	      const char *name,
 	      unsigned int flags,
 	      struct i915_gem_context *ctx,
-	      struct i915_gem_context *kctx,
 	      struct intel_engine_cs *engine,
 	      struct drm_i915_gem_object *obj,
 	      unsigned int expected,
@@ -979,7 +978,8 @@ __sseu_finish(struct drm_i915_private *i915,
 	if (ret)
 		goto out;
 
-	ret = __read_slice_count(i915, kctx, engine, obj, NULL, &rpcs);
+	ret = __read_slice_count(i915, i915->kernel_context, engine, obj,
+				 NULL, &rpcs);
 	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
 
 out:
@@ -1011,22 +1011,17 @@ __sseu_test(struct drm_i915_private *i915,
 	    struct intel_sseu sseu)
 {
 	struct igt_spinner *spin = NULL;
-	struct i915_gem_context *kctx;
 	int ret;
 
-	kctx = kernel_context(i915);
-	if (IS_ERR(kctx))
-		return PTR_ERR(kctx);
-
 	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
 	if (ret)
-		goto out_context;
+		return ret;
 
 	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
 	if (ret)
 		goto out_spin;
 
-	ret = __sseu_finish(i915, name, flags, ctx, kctx, engine, obj,
+	ret = __sseu_finish(i915, name, flags, ctx, engine, obj,
 			    hweight32(sseu.slice_mask), spin);
 
 out_spin:
@@ -1035,10 +1030,6 @@ __sseu_test(struct drm_i915_private *i915,
 		igt_spinner_fini(spin);
 		kfree(spin);
 	}
-
-out_context:
-	kernel_context_close(kctx);
-
 	return ret;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 18/29] drm/i915/selftests: Pass around intel_context for sseu
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (15 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 17/29] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 12:25   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
                   ` (14 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Combine the (i915_gem_context, intel_engine) pair into a single
parameter, the intel_context, for convenience and later simplification.
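
Since the intel_context carries back-pointers to both halves of the old
pair, the converted helpers recover them on demand, e.g. in
emit_rpcs_query():

	GEM_BUG_ON(!intel_engine_can_store_dword(ce->engine));

	vma = i915_vma_instance(obj, &ce->gem_context->ppgtt->vm, NULL);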

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/selftests/i915_gem_context.c | 74 +++++++++++--------
 1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 807644ae6877..8e2a94333559 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -755,8 +755,7 @@ static struct i915_vma *rpcs_query_batch(struct i915_vma *vma)
 
 static int
 emit_rpcs_query(struct drm_i915_gem_object *obj,
-		struct i915_gem_context *ctx,
-		struct intel_engine_cs *engine,
+		struct intel_context *ce,
 		struct i915_request **rq_out)
 {
 	struct i915_request *rq;
@@ -764,9 +763,9 @@ emit_rpcs_query(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 	int err;
 
-	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
+	GEM_BUG_ON(!intel_engine_can_store_dword(ce->engine));
 
-	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
+	vma = i915_vma_instance(obj, &ce->gem_context->ppgtt->vm, NULL);
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
@@ -784,13 +783,15 @@ emit_rpcs_query(struct drm_i915_gem_object *obj,
 		goto err_vma;
 	}
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = i915_request_create(ce);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_batch;
 	}
 
-	err = engine->emit_bb_start(rq, batch->node.start, batch->node.size, 0);
+	err = rq->engine->emit_bb_start(rq,
+					batch->node.start, batch->node.size,
+					0);
 	if (err)
 		goto err_request;
 
@@ -834,8 +835,7 @@ static int
 __sseu_prepare(struct drm_i915_private *i915,
 	       const char *name,
 	       unsigned int flags,
-	       struct i915_gem_context *ctx,
-	       struct intel_engine_cs *engine,
+	       struct intel_context *ce,
 	       struct igt_spinner **spin)
 {
 	struct i915_request *rq;
@@ -853,7 +853,10 @@ __sseu_prepare(struct drm_i915_private *i915,
 	if (ret)
 		goto err_free;
 
-	rq = igt_spinner_create_request(*spin, ctx, engine, MI_NOOP);
+	rq = igt_spinner_create_request(*spin,
+					ce->gem_context,
+					ce->engine,
+					MI_NOOP);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 		goto err_fini;
@@ -880,8 +883,7 @@ __sseu_prepare(struct drm_i915_private *i915,
 
 static int
 __read_slice_count(struct drm_i915_private *i915,
-		   struct i915_gem_context *ctx,
-		   struct intel_engine_cs *engine,
+		   struct intel_context *ce,
 		   struct drm_i915_gem_object *obj,
 		   struct igt_spinner *spin,
 		   u32 *rpcs)
@@ -892,7 +894,7 @@ __read_slice_count(struct drm_i915_private *i915,
 	u32 *buf, val;
 	long ret;
 
-	ret = emit_rpcs_query(obj, ctx, engine, &rq);
+	ret = emit_rpcs_query(obj, ce, &rq);
 	if (ret)
 		return ret;
 
@@ -956,29 +958,28 @@ static int
 __sseu_finish(struct drm_i915_private *i915,
 	      const char *name,
 	      unsigned int flags,
-	      struct i915_gem_context *ctx,
-	      struct intel_engine_cs *engine,
+	      struct intel_context *ce,
 	      struct drm_i915_gem_object *obj,
 	      unsigned int expected,
 	      struct igt_spinner *spin)
 {
-	unsigned int slices = hweight32(engine->sseu.slice_mask);
+	unsigned int slices = hweight32(ce->engine->sseu.slice_mask);
 	u32 rpcs = 0;
 	int ret = 0;
 
 	if (flags & TEST_RESET) {
-		ret = i915_reset_engine(engine, "sseu");
+		ret = i915_reset_engine(ce->engine, "sseu");
 		if (ret)
 			goto out;
 	}
 
-	ret = __read_slice_count(i915, ctx, engine, obj,
+	ret = __read_slice_count(i915, ce, obj,
 				 flags & TEST_RESET ? NULL : spin, &rpcs);
 	ret = __check_rpcs(name, rpcs, ret, expected, "Context", "!");
 	if (ret)
 		goto out;
 
-	ret = __read_slice_count(i915, i915->kernel_context, engine, obj,
+	ret = __read_slice_count(i915, ce->engine->kernel_context, obj,
 				 NULL, &rpcs);
 	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
 
@@ -993,7 +994,7 @@ __sseu_finish(struct drm_i915_private *i915,
 		if (ret)
 			return ret;
 
-		ret = __read_slice_count(i915, ctx, engine, obj, NULL, &rpcs);
+		ret = __read_slice_count(i915, ce, obj, NULL, &rpcs);
 		ret = __check_rpcs(name, rpcs, ret, expected,
 				   "Context", " after idle!");
 	}
@@ -1005,23 +1006,22 @@ static int
 __sseu_test(struct drm_i915_private *i915,
 	    const char *name,
 	    unsigned int flags,
-	    struct i915_gem_context *ctx,
-	    struct intel_engine_cs *engine,
+	    struct intel_context *ce,
 	    struct drm_i915_gem_object *obj,
 	    struct intel_sseu sseu)
 {
 	struct igt_spinner *spin = NULL;
 	int ret;
 
-	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
+	ret = __sseu_prepare(i915, name, flags, ce, &spin);
 	if (ret)
 		return ret;
 
-	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	ret = __i915_gem_context_reconfigure_sseu(ce->gem_context, ce->engine, sseu);
 	if (ret)
 		goto out_spin;
 
-	ret = __sseu_finish(i915, name, flags, ctx, engine, obj,
+	ret = __sseu_finish(i915, name, flags, ce, obj,
 			    hweight32(sseu.slice_mask), spin);
 
 out_spin:
@@ -1042,6 +1042,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 	struct intel_sseu default_sseu = engine->sseu;
 	struct drm_i915_gem_object *obj;
 	struct i915_gem_context *ctx;
+	struct intel_context *ce;
 	struct intel_sseu pg_sseu;
 	intel_wakeref_t wakeref;
 	struct drm_file *file;
@@ -1093,23 +1094,33 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	wakeref = intel_runtime_pm_get(i915);
 
+	ce = intel_context_instance(ctx, i915->engine[RCS0]);
+	if (IS_ERR(ce)) {
+		ret = PTR_ERR(ce);
+		goto out_rpm;
+	}
+
+	ret = intel_context_pin(ce);
+	if (ret)
+		goto out_context;
+
 	/* First set the default mask. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, default_sseu);
 	if (ret)
 		goto out_fail;
 
 	/* Then set a power-gated configuration. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, pg_sseu);
 	if (ret)
 		goto out_fail;
 
 	/* Back to defaults. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, default_sseu);
 	if (ret)
 		goto out_fail;
 
 	/* One last power-gated configuration for the road. */
-	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
+	ret = __sseu_test(i915, name, flags, ce, obj, pg_sseu);
 	if (ret)
 		goto out_fail;
 
@@ -1117,9 +1128,12 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		ret = -EIO;
 
-	i915_gem_object_put(obj);
-
+	intel_context_unpin(ce);
+out_context:
+	intel_context_put(ce);
+out_rpm:
 	intel_runtime_pm_put(i915, wakeref);
+	i915_gem_object_put(obj);
 
 out_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock()
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (16 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 18/29] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 12:45   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 20/29] drm/i915: Split engine setup/init into two phases Chris Wilson
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Move intel_context_instance() to the caller so that we can decouple
ourselves from the assumption of one context instance per engine.
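
After this change the caller owns both the lookup and the locking (a
sketch, with error handling trimmed; the class/instance pair is decoded
from the user parameter):

	ce = lookup_user_engine(ctx, class, instance);
	if (IS_ERR(ce))
		return PTR_ERR(ce);

	ret = intel_context_pin_lock(ce); /* serialise against the pin */
	if (!ret) {
		/* ce cannot be bound/unbound while we hold the mutex */
		intel_context_pin_unlock(ce);
	}
	intel_context_put(ce);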

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c       | 26 ------
 drivers/gpu/drm/i915/gt/intel_context.h       | 14 ++-
 drivers/gpu/drm/i915/i915_gem_context.c       | 88 +++++++++++--------
 .../gpu/drm/i915/selftests/i915_gem_context.c |  2 +-
 4 files changed, 64 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index a1267739e369..46bf8d8fb7e4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -128,32 +128,6 @@ intel_context_instance(struct i915_gem_context *ctx,
 	return intel_context_get(pos);
 }
 
-struct intel_context *
-intel_context_pin_lock(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine)
-	__acquires(ce->pin_mutex)
-{
-	struct intel_context *ce;
-
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
-
-	if (mutex_lock_interruptible(&ce->pin_mutex)) {
-		intel_context_put(ce);
-		return ERR_PTR(-EINTR);
-	}
-
-	return ce;
-}
-
-void intel_context_pin_unlock(struct intel_context *ce)
-	__releases(ce->pin_mutex)
-{
-	mutex_unlock(&ce->pin_mutex);
-	intel_context_put(ce);
-}
-
 int __intel_context_do_pin(struct intel_context *ce)
 {
 	int err;
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index da342e9a8c2e..4f8ef45e6344 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -39,9 +39,17 @@ intel_context_lookup(struct i915_gem_context *ctx,
  * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
  * intel_context_is_pinned() remains stable.
  */
-struct intel_context *
-intel_context_pin_lock(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine);
+static inline int intel_context_pin_lock(struct intel_context *ce)
+	__acquires(ce->pin_mutex)
+{
+	return mutex_lock_interruptible(&ce->pin_mutex);
+}
+
+static inline void intel_context_pin_unlock(struct intel_context *ce)
+	__releases(ce->pin_mutex)
+{
+	mutex_unlock(&ce->pin_mutex);
+}
 
 static inline bool
 intel_context_is_pinned(struct intel_context *ce)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 51c427ac0c21..1e84fe0a4c55 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -141,6 +141,18 @@ static void lut_close(struct i915_gem_context *ctx)
 	rcu_read_unlock();
 }
 
+static struct intel_context *
+lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
+{
+	struct intel_engine_cs *engine;
+
+	engine = intel_engine_lookup_user(ctx->i915, class, instance);
+	if (!engine)
+		return ERR_PTR(-EINVAL);
+
+	return intel_context_instance(ctx, engine);
+}
+
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
 {
 	unsigned int max;
@@ -1142,19 +1154,17 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
 }
 
 static int
-__i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
-				    struct intel_engine_cs *engine,
-				    struct intel_sseu sseu)
+__intel_context_reconfigure_sseu(struct intel_context *ce,
+				 struct intel_sseu sseu)
 {
-	struct intel_context *ce;
-	int ret = 0;
+	int ret;
 
-	GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8);
-	GEM_BUG_ON(engine->id != RCS0);
+	GEM_BUG_ON(INTEL_GEN(ce->gem_context->i915) < 8);
+	GEM_BUG_ON(ce->engine->id != RCS0);
 
-	ce = intel_context_pin_lock(ctx, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+	ret = intel_context_pin_lock(ce);
+	if (ret)
+		return ret;
 
 	/* Nothing to do if unmodified. */
 	if (!memcmp(&ce->sseu, &sseu, sizeof(sseu)))
@@ -1170,19 +1180,18 @@ __i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
 }
 
 static int
-i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
-				  struct intel_engine_cs *engine,
-				  struct intel_sseu sseu)
+intel_context_reconfigure_sseu(struct intel_context *ce, struct intel_sseu sseu)
 {
+	struct drm_i915_private *i915 = ce->gem_context->i915;
 	int ret;
 
-	ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
+	ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
 	if (ret)
 		return ret;
 
-	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	ret = __intel_context_reconfigure_sseu(ce, sseu);
 
-	mutex_unlock(&ctx->i915->drm.struct_mutex);
+	mutex_unlock(&i915->drm.struct_mutex);
 
 	return ret;
 }
@@ -1290,7 +1299,7 @@ static int set_sseu(struct i915_gem_context *ctx,
 {
 	struct drm_i915_private *i915 = ctx->i915;
 	struct drm_i915_gem_context_param_sseu user_sseu;
-	struct intel_engine_cs *engine;
+	struct intel_context *ce;
 	struct intel_sseu sseu;
 	int ret;
 
@@ -1307,27 +1316,31 @@ static int set_sseu(struct i915_gem_context *ctx,
 	if (user_sseu.flags || user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = intel_engine_lookup_user(i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
-	if (!engine)
-		return -EINVAL;
+	ce = lookup_user_engine(ctx,
+				user_sseu.engine_class,
+				user_sseu.engine_instance);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
 
 	/* Only render engine supports RPCS configuration. */
-	if (engine->class != RENDER_CLASS)
-		return -ENODEV;
+	if (ce->engine->class != RENDER_CLASS) {
+		ret = -ENODEV;
+		goto out_ce;
+	}
 
 	ret = user_to_context_sseu(i915, &user_sseu, &sseu);
 	if (ret)
-		return ret;
+		goto out_ce;
 
-	ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	ret = intel_context_reconfigure_sseu(ce, sseu);
 	if (ret)
-		return ret;
+		goto out_ce;
 
 	args->size = sizeof(user_sseu);
 
-	return 0;
+out_ce:
+	intel_context_put(ce);
+	return ret;
 }
 
 static int ctx_setparam(struct drm_i915_file_private *fpriv,
@@ -1532,8 +1545,8 @@ static int get_sseu(struct i915_gem_context *ctx,
 		    struct drm_i915_gem_context_param *args)
 {
 	struct drm_i915_gem_context_param_sseu user_sseu;
-	struct intel_engine_cs *engine;
 	struct intel_context *ce;
+	int err;
 
 	if (args->size == 0)
 		goto out;
@@ -1547,22 +1560,25 @@ static int get_sseu(struct i915_gem_context *ctx,
 	if (user_sseu.flags || user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = intel_engine_lookup_user(ctx->i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
-	if (!engine)
-		return -EINVAL;
-
-	ce = intel_context_pin_lock(ctx, engine); /* serialises with set_sseu */
+	ce = lookup_user_engine(ctx,
+				user_sseu.engine_class,
+				user_sseu.engine_instance);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	err = intel_context_pin_lock(ce); /* serialises with set_sseu */
+	if (err) {
+		intel_context_put(ce);
+		return err;
+	}
+
 	user_sseu.slice_mask = ce->sseu.slice_mask;
 	user_sseu.subslice_mask = ce->sseu.subslice_mask;
 	user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice;
 	user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
 
 	intel_context_pin_unlock(ce);
+	intel_context_put(ce);
 
 	if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu,
 			 sizeof(user_sseu)))
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 8e2a94333559..214d1fd2f4dc 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1017,7 +1017,7 @@ __sseu_test(struct drm_i915_private *i915,
 	if (ret)
 		return ret;
 
-	ret = __i915_gem_context_reconfigure_sseu(ce->gem_context, ce->engine, sseu);
+	ret = __intel_context_reconfigure_sseu(ce, sseu);
 	if (ret)
 		goto out_spin;
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 20/29] drm/i915: Split engine setup/init into two phases
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (17 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 13:30   ` Tvrtko Ursulin
  2019-04-08  9:17 ` [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
                   ` (12 subsequent siblings)
  31 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

In the next patch, we require the engine vfuncs to be set up prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.
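
A rough sketch of the resulting order in i915_gem_init() (illustrative
only; the intel_engines_init() call happens later in the existing init
sequence and is not part of the hunk below):

	/* Phase 1: vfuncs and software state only, no hw access */
	ret = intel_engines_setup(dev_priv);
	if (ret) {
		GEM_BUG_ON(ret == -EIO);
		goto err_unlock;
	}

	/*
	 * Context creation, including the pinned kernel contexts, can
	 * now rely on engine->cops and the other vfuncs being set.
	 */
	ret = i915_gem_contexts_init(dev_priv);
	if (ret) {
		GEM_BUG_ON(ret == -EIO);
		goto err_unlock;
	}

	/* Phase 2: allocate rings/timelines and touch the hardware */
	ret = intel_engines_init(dev_priv);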

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_engine.h        |   8 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  99 ++++----
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  74 ++----
 drivers/gpu/drm/i915/gt/intel_lrc.h           |   5 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    | 232 +++++++++---------
 drivers/gpu/drm/i915/gt/intel_workarounds.c   |   3 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |  48 ++--
 drivers/gpu/drm/i915/gt/mock_engine.h         |   2 +
 drivers/gpu/drm/i915/i915_gem.c               |   6 +
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  12 +-
 10 files changed, 245 insertions(+), 244 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index a17152e96bf8..a8dc2740ba2f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -362,14 +362,12 @@ __intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
 	return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
-int intel_engine_setup_common(struct intel_engine_cs *engine);
+int intel_engines_setup(struct drm_i915_private *i915);
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 
-int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
-int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
-int intel_init_blt_ring_buffer(struct intel_engine_cs *engine);
-int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine);
+int intel_ring_submission_setup(struct intel_engine_cs *engine);
+int intel_ring_submission_init(struct intel_engine_cs *engine);
 
 int intel_engine_stop_cs(struct intel_engine_cs *engine);
 void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index f6828c0276eb..3f794bc71958 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -50,35 +50,24 @@
 
 struct engine_class_info {
 	const char *name;
-	int (*init_legacy)(struct intel_engine_cs *engine);
-	int (*init_execlists)(struct intel_engine_cs *engine);
-
 	u8 uabi_class;
 };
 
 static const struct engine_class_info intel_engine_classes[] = {
 	[RENDER_CLASS] = {
 		.name = "rcs",
-		.init_execlists = logical_render_ring_init,
-		.init_legacy = intel_init_render_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_RENDER,
 	},
 	[COPY_ENGINE_CLASS] = {
 		.name = "bcs",
-		.init_execlists = logical_xcs_ring_init,
-		.init_legacy = intel_init_blt_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_COPY,
 	},
 	[VIDEO_DECODE_CLASS] = {
 		.name = "vcs",
-		.init_execlists = logical_xcs_ring_init,
-		.init_legacy = intel_init_bsd_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_VIDEO,
 	},
 	[VIDEO_ENHANCEMENT_CLASS] = {
 		.name = "vecs",
-		.init_execlists = logical_xcs_ring_init,
-		.init_legacy = intel_init_vebox_ring_buffer,
 		.uabi_class = I915_ENGINE_CLASS_VIDEO_ENHANCE,
 	},
 };
@@ -400,48 +389,39 @@ int intel_engines_init_mmio(struct drm_i915_private *dev_priv)
 
 /**
  * intel_engines_init() - init the Engine Command Streamers
- * @dev_priv: i915 device private
+ * @i915: i915 device private
  *
  * Return: non-zero if the initialization failed.
  */
-int intel_engines_init(struct drm_i915_private *dev_priv)
+int intel_engines_init(struct drm_i915_private *i915)
 {
+	int (*init)(struct intel_engine_cs *engine);
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id, err_id;
 	int err;
 
-	for_each_engine(engine, dev_priv, id) {
-		const struct engine_class_info *class_info =
-			&intel_engine_classes[engine->class];
-		int (*init)(struct intel_engine_cs *engine);
-
-		if (HAS_EXECLISTS(dev_priv))
-			init = class_info->init_execlists;
-		else
-			init = class_info->init_legacy;
+	if (HAS_EXECLISTS(i915))
+		init = intel_execlists_submission_init;
+	else
+		init = intel_ring_submission_init;
 
-		err = -EINVAL;
+	for_each_engine(engine, i915, id) {
 		err_id = id;
 
-		if (GEM_DEBUG_WARN_ON(!init))
-			goto cleanup;
-
 		err = init(engine);
 		if (err)
 			goto cleanup;
-
-		GEM_BUG_ON(!engine->submit_request);
 	}
 
 	return 0;
 
 cleanup:
-	for_each_engine(engine, dev_priv, id) {
+	for_each_engine(engine, i915, id) {
 		if (id >= err_id) {
 			kfree(engine);
-			dev_priv->engine[id] = NULL;
+			i915->engine[id] = NULL;
 		} else {
-			dev_priv->gt.cleanup_engine(engine);
+			i915->gt.cleanup_engine(engine);
 		}
 	}
 	return err;
@@ -559,16 +539,7 @@ static int init_status_page(struct intel_engine_cs *engine)
 	return ret;
 }
 
-/**
- * intel_engines_setup_common - setup engine state not requiring hw access
- * @engine: Engine to setup.
- *
- * Initializes @engine@ structure members shared between legacy and execlists
- * submission modes which do not require hardware access.
- *
- * Typically done early in the submission mode specific engine setup stage.
- */
-int intel_engine_setup_common(struct intel_engine_cs *engine)
+static int intel_engine_setup_common(struct intel_engine_cs *engine)
 {
 	int err;
 
@@ -602,6 +573,52 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
 	return err;
 }
 
+/**
+ * intel_engines_setup - setup engine state not requiring hw access
+ * @i915: Device to setup.
+ *
+ * Initializes engine structure members shared between legacy and execlists
+ * submission modes which do not require hardware access.
+ *
+ * Typically done early in the submission mode specific engine setup stage.
+ */
+int intel_engines_setup(struct drm_i915_private *i915)
+{
+	int (*setup)(struct intel_engine_cs *engine);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err;
+
+	if (HAS_EXECLISTS(i915))
+		setup = intel_execlists_submission_setup;
+	else
+		setup = intel_ring_submission_setup;
+
+	for_each_engine(engine, i915, id) {
+		err = intel_engine_setup_common(engine);
+		if (err)
+			goto cleanup;
+
+		err = setup(engine);
+		if (err)
+			goto cleanup;
+
+		GEM_BUG_ON(!engine->cops);
+	}
+
+	return 0;
+
+cleanup:
+	for_each_engine(engine, i915, id) {
+		if (engine->cops)
+			i915->gt.cleanup_engine(engine);
+		else
+			kfree(engine);
+		i915->engine[id] = NULL;
+	}
+	return err;
+}
+
 void intel_engines_set_scheduler_caps(struct drm_i915_private *i915)
 {
 	static const struct {
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 7a5e6e962e61..d4e28fbb5dcd 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1786,8 +1786,8 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 	unsigned int i;
 	int ret;
 
-	if (GEM_DEBUG_WARN_ON(engine->id != RCS0))
-		return -EINVAL;
+	if (engine->class != RENDER_CLASS)
+		return 0;
 
 	switch (INTEL_GEN(engine->i915)) {
 	case 11:
@@ -2423,15 +2423,8 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
 	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
 }
 
-static int
-logical_ring_setup(struct intel_engine_cs *engine)
+int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 {
-	int err;
-
-	err = intel_engine_setup_common(engine);
-	if (err)
-		return err;
-
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
 
@@ -2441,10 +2434,16 @@ logical_ring_setup(struct intel_engine_cs *engine)
 	logical_ring_default_vfuncs(engine);
 	logical_ring_default_irqs(engine);
 
+	if (engine->class == RENDER_CLASS) {
+		engine->init_context = gen8_init_rcs_context;
+		engine->emit_flush = gen8_emit_flush_render;
+		engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
+	}
+
 	return 0;
 }
 
-static int logical_ring_init(struct intel_engine_cs *engine)
+int intel_execlists_submission_init(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -2456,6 +2455,15 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 		return ret;
 
 	intel_engine_init_workarounds(engine);
+	intel_engine_init_whitelist(engine);
+
+	if (intel_init_workaround_bb(engine))
+		/*
+		 * We continue even if we fail to initialize WA batch
+		 * because we only expect rare glitches but nothing
+		 * critical to prevent us from using GPU
+		 */
+		DRM_ERROR("WA batch buffer initialization failed\n");
 
 	if (HAS_LOGICAL_RING_ELSQ(i915)) {
 		execlists->submit_reg = i915->uncore.regs +
@@ -2483,50 +2491,6 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 	return 0;
 }
 
-int logical_render_ring_init(struct intel_engine_cs *engine)
-{
-	int ret;
-
-	ret = logical_ring_setup(engine);
-	if (ret)
-		return ret;
-
-	/* Override some for render ring. */
-	engine->init_context = gen8_init_rcs_context;
-	engine->emit_flush = gen8_emit_flush_render;
-	engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
-
-	ret = logical_ring_init(engine);
-	if (ret)
-		return ret;
-
-	ret = intel_init_workaround_bb(engine);
-	if (ret) {
-		/*
-		 * We continue even if we fail to initialize WA batch
-		 * because we only expect rare glitches but nothing
-		 * critical to prevent us from using GPU
-		 */
-		DRM_ERROR("WA batch buffer initialization failed: %d\n",
-			  ret);
-	}
-
-	intel_engine_init_whitelist(engine);
-
-	return 0;
-}
-
-int logical_xcs_ring_init(struct intel_engine_cs *engine)
-{
-	int err;
-
-	err = logical_ring_setup(engine);
-	if (err)
-		return err;
-
-	return logical_ring_init(engine);
-}
-
 static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
 {
 	u32 indirect_ctx_offset;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index a598f2d56de3..a56ee45b9e3c 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -67,8 +67,9 @@ enum {
 
 /* Logical Rings */
 void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
-int logical_render_ring_init(struct intel_engine_cs *engine);
-int logical_xcs_ring_init(struct intel_engine_cs *engine);
+
+int intel_execlists_submission_setup(struct intel_engine_cs *engine);
+int intel_execlists_submission_init(struct intel_engine_cs *engine);
 
 /* Logical Ring Contexts */
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index 748c133b83b4..3204dbb541f6 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -1523,54 +1523,6 @@ static const struct intel_context_ops ring_context_ops = {
 	.destroy = ring_context_destroy,
 };
 
-static int intel_init_ring_buffer(struct intel_engine_cs *engine)
-{
-	struct i915_timeline *timeline;
-	struct intel_ring *ring;
-	int err;
-
-	err = intel_engine_setup_common(engine);
-	if (err)
-		return err;
-
-	timeline = i915_timeline_create(engine->i915, engine->status_page.vma);
-	if (IS_ERR(timeline)) {
-		err = PTR_ERR(timeline);
-		goto err;
-	}
-	GEM_BUG_ON(timeline->has_initial_breadcrumb);
-
-	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
-	i915_timeline_put(timeline);
-	if (IS_ERR(ring)) {
-		err = PTR_ERR(ring);
-		goto err;
-	}
-
-	err = intel_ring_pin(ring);
-	if (err)
-		goto err_ring;
-
-	GEM_BUG_ON(engine->buffer);
-	engine->buffer = ring;
-
-	err = intel_engine_init_common(engine);
-	if (err)
-		goto err_unpin;
-
-	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
-
-	return 0;
-
-err_unpin:
-	intel_ring_unpin(ring);
-err_ring:
-	intel_ring_put(ring);
-err:
-	intel_engine_cleanup_common(engine);
-	return err;
-}
-
 void intel_engine_cleanup(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
@@ -2166,24 +2118,6 @@ static int gen6_ring_flush(struct i915_request *rq, u32 mode)
 	return gen6_flush_dw(rq, mode, MI_INVALIDATE_TLB);
 }
 
-static void intel_ring_init_irq(struct drm_i915_private *dev_priv,
-				struct intel_engine_cs *engine)
-{
-	if (INTEL_GEN(dev_priv) >= 6) {
-		engine->irq_enable = gen6_irq_enable;
-		engine->irq_disable = gen6_irq_disable;
-	} else if (INTEL_GEN(dev_priv) >= 5) {
-		engine->irq_enable = gen5_irq_enable;
-		engine->irq_disable = gen5_irq_disable;
-	} else if (INTEL_GEN(dev_priv) >= 3) {
-		engine->irq_enable = i9xx_irq_enable;
-		engine->irq_disable = i9xx_irq_disable;
-	} else {
-		engine->irq_enable = i8xx_irq_enable;
-		engine->irq_disable = i8xx_irq_disable;
-	}
-}
-
 static void i9xx_set_default_submission(struct intel_engine_cs *engine)
 {
 	engine->submit_request = i9xx_submit_request;
@@ -2199,13 +2133,33 @@ static void gen6_bsd_set_default_submission(struct intel_engine_cs *engine)
 	engine->submit_request = gen6_bsd_submit_request;
 }
 
-static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
-				      struct intel_engine_cs *engine)
+static void setup_irq(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *i915 = engine->i915;
+
+	if (INTEL_GEN(i915) >= 6) {
+		engine->irq_enable = gen6_irq_enable;
+		engine->irq_disable = gen6_irq_disable;
+	} else if (INTEL_GEN(i915) >= 5) {
+		engine->irq_enable = gen5_irq_enable;
+		engine->irq_disable = gen5_irq_disable;
+	} else if (INTEL_GEN(i915) >= 3) {
+		engine->irq_enable = i9xx_irq_enable;
+		engine->irq_disable = i9xx_irq_disable;
+	} else {
+		engine->irq_enable = i8xx_irq_enable;
+		engine->irq_disable = i8xx_irq_disable;
+	}
+}
+
+static void setup_xcs(struct intel_engine_cs *engine)
 {
+	struct drm_i915_private *i915 = engine->i915;
+
 	/* gen8+ are only supported with execlists */
-	GEM_BUG_ON(INTEL_GEN(dev_priv) >= 8);
+	GEM_BUG_ON(INTEL_GEN(i915) >= 8);
 
-	intel_ring_init_irq(dev_priv, engine);
+	setup_irq(engine);
 
 	engine->resume = xcs_resume;
 	engine->reset.prepare = reset_prepare;
@@ -2221,117 +2175,96 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 	 * engine->emit_init_breadcrumb().
 	 */
 	engine->emit_fini_breadcrumb = i9xx_emit_breadcrumb;
-	if (IS_GEN(dev_priv, 5))
+	if (IS_GEN(i915, 5))
 		engine->emit_fini_breadcrumb = gen5_emit_breadcrumb;
 
 	engine->set_default_submission = i9xx_set_default_submission;
 
-	if (INTEL_GEN(dev_priv) >= 6)
+	if (INTEL_GEN(i915) >= 6)
 		engine->emit_bb_start = gen6_emit_bb_start;
-	else if (INTEL_GEN(dev_priv) >= 4)
+	else if (INTEL_GEN(i915) >= 4)
 		engine->emit_bb_start = i965_emit_bb_start;
-	else if (IS_I830(dev_priv) || IS_I845G(dev_priv))
+	else if (IS_I830(i915) || IS_I845G(i915))
 		engine->emit_bb_start = i830_emit_bb_start;
 	else
 		engine->emit_bb_start = i915_emit_bb_start;
 }
 
-int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
+static void setup_rcs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	intel_ring_default_vfuncs(dev_priv, engine);
+	struct drm_i915_private *i915 = engine->i915;
 
-	if (HAS_L3_DPF(dev_priv))
+	if (HAS_L3_DPF(i915))
 		engine->irq_keep_mask = GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 
 	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 
-	if (INTEL_GEN(dev_priv) >= 7) {
+	if (INTEL_GEN(i915) >= 7) {
 		engine->init_context = intel_rcs_ctx_init;
 		engine->emit_flush = gen7_render_ring_flush;
 		engine->emit_fini_breadcrumb = gen7_rcs_emit_breadcrumb;
-	} else if (IS_GEN(dev_priv, 6)) {
+	} else if (IS_GEN(i915, 6)) {
 		engine->init_context = intel_rcs_ctx_init;
 		engine->emit_flush = gen6_render_ring_flush;
 		engine->emit_fini_breadcrumb = gen6_rcs_emit_breadcrumb;
-	} else if (IS_GEN(dev_priv, 5)) {
+	} else if (IS_GEN(i915, 5)) {
 		engine->emit_flush = gen4_render_ring_flush;
 	} else {
-		if (INTEL_GEN(dev_priv) < 4)
+		if (INTEL_GEN(i915) < 4)
 			engine->emit_flush = gen2_render_ring_flush;
 		else
 			engine->emit_flush = gen4_render_ring_flush;
 		engine->irq_enable_mask = I915_USER_INTERRUPT;
 	}
 
-	if (IS_HASWELL(dev_priv))
+	if (IS_HASWELL(i915))
 		engine->emit_bb_start = hsw_emit_bb_start;
 
 	engine->resume = rcs_resume;
-
-	ret = intel_init_ring_buffer(engine);
-	if (ret)
-		return ret;
-
-	return 0;
 }
 
-int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine)
+static void setup_vcs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	intel_ring_default_vfuncs(dev_priv, engine);
+	struct drm_i915_private *i915 = engine->i915;
 
-	if (INTEL_GEN(dev_priv) >= 6) {
+	if (INTEL_GEN(i915) >= 6) {
 		/* gen6 bsd needs a special wa for tail updates */
-		if (IS_GEN(dev_priv, 6))
+		if (IS_GEN(i915, 6))
 			engine->set_default_submission = gen6_bsd_set_default_submission;
 		engine->emit_flush = gen6_bsd_ring_flush;
 		engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
 
-		if (IS_GEN(dev_priv, 6))
+		if (IS_GEN(i915, 6))
 			engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
 		else
 			engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
 	} else {
 		engine->emit_flush = bsd_ring_flush;
-		if (IS_GEN(dev_priv, 5))
+		if (IS_GEN(i915, 5))
 			engine->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
 		else
 			engine->irq_enable_mask = I915_BSD_USER_INTERRUPT;
 	}
-
-	return intel_init_ring_buffer(engine);
 }
 
-int intel_init_blt_ring_buffer(struct intel_engine_cs *engine)
+static void setup_bcs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	GEM_BUG_ON(INTEL_GEN(dev_priv) < 6);
-
-	intel_ring_default_vfuncs(dev_priv, engine);
+	struct drm_i915_private *i915 = engine->i915;
 
 	engine->emit_flush = gen6_ring_flush;
 	engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 
-	if (IS_GEN(dev_priv, 6))
+	if (IS_GEN(i915, 6))
 		engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
 	else
 		engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
-
-	return intel_init_ring_buffer(engine);
 }
 
-int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
+static void setup_vecs(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	GEM_BUG_ON(INTEL_GEN(dev_priv) < 7);
+	struct drm_i915_private *i915 = engine->i915;
 
-	intel_ring_default_vfuncs(dev_priv, engine);
+	GEM_BUG_ON(INTEL_GEN(i915) < 7);
 
 	engine->emit_flush = gen6_ring_flush;
 	engine->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
@@ -2339,6 +2272,73 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 	engine->irq_disable = hsw_vebox_irq_disable;
 
 	engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
+}
+
+int intel_ring_submission_setup(struct intel_engine_cs *engine)
+{
+	setup_xcs(engine);
+
+	switch (engine->class) {
+	case RENDER_CLASS:
+		setup_rcs(engine);
+		break;
+	case VIDEO_DECODE_CLASS:
+		setup_vcs(engine);
+		break;
+	case COPY_ENGINE_CLASS:
+		setup_bcs(engine);
+		break;
+	case VIDEO_ENHANCEMENT_CLASS:
+		setup_vecs(engine);
+		break;
+	default:
+		MISSING_CASE(engine->class);
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+int intel_ring_submission_init(struct intel_engine_cs *engine)
+{
+	struct i915_timeline *timeline;
+	struct intel_ring *ring;
+	int err;
+
+	timeline = i915_timeline_create(engine->i915, engine->status_page.vma);
+	if (IS_ERR(timeline)) {
+		err = PTR_ERR(timeline);
+		goto err;
+	}
+	GEM_BUG_ON(timeline->has_initial_breadcrumb);
+
+	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
+	i915_timeline_put(timeline);
+	if (IS_ERR(ring)) {
+		err = PTR_ERR(ring);
+		goto err;
+	}
+
+	err = intel_ring_pin(ring);
+	if (err)
+		goto err_ring;
 
-	return intel_init_ring_buffer(engine);
+	GEM_BUG_ON(engine->buffer);
+	engine->buffer = ring;
+
+	err = intel_engine_init_common(engine);
+	if (err)
+		goto err_unpin;
+
+	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
+
+	return 0;
+
+err_unpin:
+	intel_ring_unpin(ring);
+err_ring:
+	intel_ring_put(ring);
+err:
+	intel_engine_cleanup_common(engine);
+	return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 2d6d17ee3601..e9d8174b24e8 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1060,7 +1060,8 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
 	struct drm_i915_private *i915 = engine->i915;
 	struct i915_wa_list *w = &engine->whitelist;
 
-	GEM_BUG_ON(engine->id != RCS0);
+	if (engine->class != RENDER_CLASS)
+		return;
 
 	wa_init_start(w, "whitelist");
 
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index a79d9909d171..3b672e011cf0 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -239,7 +239,6 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 				    int id)
 {
 	struct mock_engine *engine;
-	int err;
 
 	GEM_BUG_ON(id >= I915_NUM_ENGINES);
 
@@ -265,37 +264,44 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	engine->base.reset.finish = mock_reset_finish;
 	engine->base.cancel_requests = mock_cancel_requests;
 
-	if (i915_timeline_init(i915, &engine->base.timeline, NULL))
-		goto err_free;
-	i915_timeline_set_subclass(&engine->base.timeline, TIMELINE_ENGINE);
-
-	intel_engine_init_breadcrumbs(&engine->base);
-	intel_engine_init_execlists(&engine->base);
-	intel_engine_init__pm(&engine->base);
-
 	/* fake hw queue */
 	spin_lock_init(&engine->hw_lock);
 	timer_setup(&engine->hw_delay, hw_delay_complete, 0);
 	INIT_LIST_HEAD(&engine->hw_queue);
 
-	engine->base.kernel_context =
-		intel_context_instance(i915->kernel_context, &engine->base);
-	if (IS_ERR(engine->base.kernel_context))
+	return &engine->base;
+}
+
+int mock_engine_init(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *i915 = engine->i915;
+	int err;
+
+	intel_engine_init_breadcrumbs(engine);
+	intel_engine_init_execlists(engine);
+	intel_engine_init__pm(engine);
+
+	if (i915_timeline_init(i915, &engine->timeline, NULL))
 		goto err_breadcrumbs;
+	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
+
+	engine->kernel_context =
+		intel_context_instance(i915->kernel_context, engine);
+	if (IS_ERR(engine->kernel_context))
+		goto err_timeline;
 
-	err = intel_context_pin(engine->base.kernel_context);
-	intel_context_put(engine->base.kernel_context);
+	err = intel_context_pin(engine->kernel_context);
+	intel_context_put(engine->kernel_context);
 	if (err)
-		goto err_breadcrumbs;
+		goto err_timeline;
 
-	return &engine->base;
+	return 0;
 
+err_timeline:
+	i915_timeline_fini(&engine->timeline);
 err_breadcrumbs:
-	intel_engine_fini_breadcrumbs(&engine->base);
-	i915_timeline_fini(&engine->base.timeline);
-err_free:
-	kfree(engine);
-	return NULL;
+	intel_engine_fini_breadcrumbs(engine);
+	return -ENOMEM;
 }
 
 void mock_engine_flush(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.h b/drivers/gpu/drm/i915/gt/mock_engine.h
index 44b35a85e9d1..3f9b698c49d2 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.h
+++ b/drivers/gpu/drm/i915/gt/mock_engine.h
@@ -42,6 +42,8 @@ struct mock_engine {
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 				    const char *name,
 				    int id);
+int mock_engine_init(struct intel_engine_cs *engine);
+
 void mock_engine_flush(struct intel_engine_cs *engine);
 void mock_engine_reset(struct intel_engine_cs *engine);
 void mock_engine_free(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a6436c77109a..50266e87c225 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4523,6 +4523,12 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		goto err_ggtt;
 	}
 
+	ret = intel_engines_setup(dev_priv);
+	if (ret) {
+		GEM_BUG_ON(ret == -EIO);
+		goto err_unlock;
+	}
+
 	ret = i915_gem_contexts_init(dev_priv);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index c072424c6b7c..e4033d0576c4 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -209,12 +209,16 @@ struct drm_i915_private *mock_gem_device(void)
 	mock_init_ggtt(i915, &i915->ggtt);
 
 	mkwrite_device_info(i915)->engine_mask = BIT(0);
-	i915->kernel_context = mock_context(i915, NULL);
-	if (!i915->kernel_context)
-		goto err_unlock;
 
 	i915->engine[RCS0] = mock_engine(i915, "mock", RCS0);
 	if (!i915->engine[RCS0])
+		goto err_unlock;
+
+	i915->kernel_context = mock_context(i915, NULL);
+	if (!i915->kernel_context)
+		goto err_engine;
+
+	if (mock_engine_init(i915->engine[RCS0]))
 		goto err_context;
 
 	mutex_unlock(&i915->drm.struct_mutex);
@@ -225,6 +229,8 @@ struct drm_i915_private *mock_gem_device(void)
 
 err_context:
 	i915_gem_contexts_fini(i915);
+err_engine:
+	mock_engine_free(i915->engine[RCS0]);
 err_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
 	i915_timelines_fini(i915);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (18 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 20/29] drm/i915: Split engine setup/init into two phases Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-10 15:32   ` Tvrtko Ursulin
  2019-04-11 13:33   ` [PATCH] " Chris Wilson
  2019-04-08  9:17 ` [PATCH 22/29] drm/i915: Remove intel_context.active_link Chris Wilson
                   ` (11 subsequent siblings)
  31 siblings, 2 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

We switched to a tree of per-engine HW contexts to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key for looking up the HW context. Just
allocate a logical per-engine instance and always use an index into
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a
user-specified map.
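
A simplified sketch of the new lookup (modelled on the
i915_gem_context_get_engine() inline added below; error handling
trimmed for brevity):

	static inline struct intel_context *
	get_engine(struct i915_gem_context *ctx, unsigned long idx)
	{
		struct i915_gem_engines *e;
		struct intel_context *ce = ERR_PTR(-EINVAL);

		rcu_read_lock();
		e = rcu_dereference(ctx->engines);
		/* a flat array replaces the engine-keyed rbtree */
		if (idx < e->num_engines && e->engines[idx])
			ce = intel_context_get(e->engines[idx]);
		rcu_read_unlock();

		return ce;
	}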

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c       | 97 ++-----------------
 drivers/gpu/drm/i915/gt/intel_context.h       | 25 +----
 drivers/gpu/drm/i915/gt/intel_context_types.h |  2 -
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |  3 +-
 drivers/gpu/drm/i915/gvt/scheduler.c          |  2 +-
 drivers/gpu/drm/i915/i915_gem.c               | 29 +++---
 drivers/gpu/drm/i915/i915_gem_context.c       | 92 +++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_context.h       | 42 ++++++++
 drivers/gpu/drm/i915/i915_gem_context_types.h | 26 ++++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    | 70 ++++++-------
 drivers/gpu/drm/i915/i915_perf.c              | 81 +++++++++-------
 drivers/gpu/drm/i915/i915_request.c           |  2 +-
 drivers/gpu/drm/i915/intel_guc_submission.c   | 24 +++--
 .../gpu/drm/i915/selftests/i915_gem_context.c |  2 +-
 drivers/gpu/drm/i915/selftests/mock_context.c | 14 ++-
 16 files changed, 283 insertions(+), 230 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 46bf8d8fb7e4..803ac9a1dd15 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -17,7 +17,7 @@ static struct i915_global_context {
 	struct kmem_cache *slab_ce;
 } global;
 
-struct intel_context *intel_context_alloc(void)
+static struct intel_context *intel_context_alloc(void)
 {
 	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
 }
@@ -28,104 +28,17 @@ void intel_context_free(struct intel_context *ce)
 }
 
 struct intel_context *
-intel_context_lookup(struct i915_gem_context *ctx,
+intel_context_create(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine)
 {
-	struct intel_context *ce = NULL;
-	struct rb_node *p;
-
-	spin_lock(&ctx->hw_contexts_lock);
-	p = ctx->hw_contexts.rb_node;
-	while (p) {
-		struct intel_context *this =
-			rb_entry(p, struct intel_context, node);
-
-		if (this->engine == engine) {
-			GEM_BUG_ON(this->gem_context != ctx);
-			ce = this;
-			break;
-		}
-
-		if (this->engine < engine)
-			p = p->rb_right;
-		else
-			p = p->rb_left;
-	}
-	spin_unlock(&ctx->hw_contexts_lock);
-
-	return ce;
-}
-
-struct intel_context *
-__intel_context_insert(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       struct intel_context *ce)
-{
-	struct rb_node **p, *parent;
-	int err = 0;
-
-	spin_lock(&ctx->hw_contexts_lock);
-
-	parent = NULL;
-	p = &ctx->hw_contexts.rb_node;
-	while (*p) {
-		struct intel_context *this;
-
-		parent = *p;
-		this = rb_entry(parent, struct intel_context, node);
-
-		if (this->engine == engine) {
-			err = -EEXIST;
-			ce = this;
-			break;
-		}
-
-		if (this->engine < engine)
-			p = &parent->rb_right;
-		else
-			p = &parent->rb_left;
-	}
-	if (!err) {
-		rb_link_node(&ce->node, parent, p);
-		rb_insert_color(&ce->node, &ctx->hw_contexts);
-	}
-
-	spin_unlock(&ctx->hw_contexts_lock);
-
-	return ce;
-}
-
-void __intel_context_remove(struct intel_context *ce)
-{
-	struct i915_gem_context *ctx = ce->gem_context;
-
-	spin_lock(&ctx->hw_contexts_lock);
-	rb_erase(&ce->node, &ctx->hw_contexts);
-	spin_unlock(&ctx->hw_contexts_lock);
-}
-
-struct intel_context *
-intel_context_instance(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine)
-{
-	struct intel_context *ce, *pos;
-
-	ce = intel_context_lookup(ctx, engine);
-	if (likely(ce))
-		return intel_context_get(ce);
+	struct intel_context *ce;
 
 	ce = intel_context_alloc();
 	if (!ce)
 		return ERR_PTR(-ENOMEM);
 
 	intel_context_init(ce, ctx, engine);
-
-	pos = __intel_context_insert(ctx, engine, ce);
-	if (unlikely(pos != ce)) /* Beaten! Use their HW context instead */
-		intel_context_free(ce);
-
-	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
-	return intel_context_get(pos);
+	return ce;
 }
 
 int __intel_context_do_pin(struct intel_context *ce)
@@ -204,6 +117,8 @@ intel_context_init(struct intel_context *ce,
 		   struct i915_gem_context *ctx,
 		   struct intel_engine_cs *engine)
 {
+	GEM_BUG_ON(!engine->cops);
+
 	kref_init(&ce->ref);
 
 	ce->gem_context = ctx;
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 4f8ef45e6344..567a415be513 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -12,24 +12,16 @@
 #include "intel_context_types.h"
 #include "intel_engine_types.h"
 
-struct intel_context *intel_context_alloc(void);
-void intel_context_free(struct intel_context *ce);
-
 void intel_context_init(struct intel_context *ce,
 			struct i915_gem_context *ctx,
 			struct intel_engine_cs *engine);
 
-/**
- * intel_context_lookup - Find the matching HW context for this (ctx, engine)
- * @ctx - the parent GEM context
- * @engine - the target HW engine
- *
- * May return NULL if the HW context hasn't been instantiated (i.e. unused).
- */
 struct intel_context *
-intel_context_lookup(struct i915_gem_context *ctx,
+intel_context_create(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine);
 
+void intel_context_free(struct intel_context *ce);
+
 /**
  * intel_context_pin_lock - Stablises the 'pinned' status of the HW context
  * @ctx - the parent GEM context
@@ -59,17 +51,6 @@ intel_context_is_pinned(struct intel_context *ce)
 
 void intel_context_pin_unlock(struct intel_context *ce);
 
-struct intel_context *
-__intel_context_insert(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       struct intel_context *ce);
-void
-__intel_context_remove(struct intel_context *ce);
-
-struct intel_context *
-intel_context_instance(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine);
-
 int __intel_context_do_pin(struct intel_context *ce);
 
 static inline int intel_context_pin(struct intel_context *ce)
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index f02d27734e3b..3579c2708321 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -10,7 +10,6 @@
 #include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
-#include <linux/rbtree.h>
 #include <linux/types.h>
 
 #include "i915_active_types.h"
@@ -61,7 +60,6 @@ struct intel_context {
 	struct i915_active_request active_tracker;
 
 	const struct intel_context_ops *ops;
-	struct rb_node node;
 
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 3f794bc71958..0df3c5238c04 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -716,7 +716,7 @@ static int pin_context(struct i915_gem_context *ctx,
 	struct intel_context *ce;
 	int err;
 
-	ce = intel_context_instance(ctx, engine);
+	ce = i915_gem_context_get_engine(ctx, engine->id);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 3b672e011cf0..87e538825fee 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -23,6 +23,7 @@
  */
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
 
@@ -286,7 +287,7 @@ int mock_engine_init(struct intel_engine_cs *engine)
 	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
 
 	engine->kernel_context =
-		intel_context_instance(i915->kernel_context, engine);
+		i915_gem_context_get_engine(i915->kernel_context, engine->id);
 	if (IS_ERR(engine->kernel_context))
 		goto err_timeline;
 
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 606fc2713240..8b6574e1b495 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -1188,7 +1188,7 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		INIT_LIST_HEAD(&s->workload_q_head[i]);
 		s->shadow[i] = ERR_PTR(-EINVAL);
 
-		ce = intel_context_instance(ctx, engine);
+		ce = i915_gem_context_get_engine(ctx, i);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_shadow_ctx;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 50266e87c225..21b4a04c424b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4312,8 +4312,9 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 
 static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 {
-	struct i915_gem_context *ctx;
 	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
 	enum intel_engine_id id;
 	int err = 0;
 
@@ -4330,18 +4331,26 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	e = i915_gem_context_engine_list_lock(ctx);
+
 	for_each_engine(engine, i915, id) {
+		struct intel_context *ce = e->engines[id];
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, ctx);
+		err = intel_context_pin(ce);
+		if (err)
+			goto err_active;
+
+		rq = i915_request_create(ce);
+		intel_context_unpin(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto out_ctx;
+			goto err_active;
 		}
 
 		err = 0;
-		if (engine->init_context)
-			err = engine->init_context(rq);
+		if (rq->engine->init_context)
+			err = rq->engine->init_context(rq);
 
 		i915_request_add(rq);
 		if (err)
@@ -4355,15 +4364,10 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 	for_each_engine(engine, i915, id) {
-		struct intel_context *ce;
-		struct i915_vma *state;
+		struct intel_context *ce = e->engines[id];
+		struct i915_vma *state = ce->state;
 		void *vaddr;
 
-		ce = intel_context_lookup(ctx, engine);
-		if (!ce)
-			continue;
-
-		state = ce->state;
 		if (!state)
 			continue;
 
@@ -4419,6 +4423,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 out_ctx:
+	i915_gem_context_engine_list_unlock(ctx);
 	i915_gem_context_set_closed(ctx);
 	i915_gem_context_put(ctx);
 	return err;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1e84fe0a4c55..517816558c85 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -150,7 +150,7 @@ lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
 	if (!engine)
 		return ERR_PTR(-EINVAL);
 
-	return intel_context_instance(ctx, engine);
+	return i915_gem_context_get_engine(ctx, engine->id);
 }
 
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
@@ -242,10 +242,54 @@ static void release_hw_id(struct i915_gem_context *ctx)
 	mutex_unlock(&i915->contexts.mutex);
 }
 
-static void i915_gem_context_free(struct i915_gem_context *ctx)
+static void free_engines(struct i915_gem_engines *e)
+{
+	unsigned int n;
+
+	for (n = 0; n < e->num_engines; n++) {
+		if (!e->engines[n])
+			continue;
+
+		intel_context_put(e->engines[n]);
+	}
+	kfree(e);
+}
+
+static void free_engines_n(struct i915_gem_engines *e, unsigned long n)
 {
-	struct intel_context *it, *n;
+	e->num_engines = n;
+	free_engines(e);
+}
+
+static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
+{
+	struct intel_engine_cs *engine;
+	struct i915_gem_engines *e;
+	enum intel_engine_id id;
+
+	e = kzalloc(struct_size(e, engines, I915_NUM_ENGINES), GFP_KERNEL);
+	if (!e)
+		return ERR_PTR(-ENOMEM);
+
+	e->i915 = ctx->i915;
+	for_each_engine(engine, ctx->i915, id) {
+		struct intel_context *ce;
 
+		ce = intel_context_create(ctx, engine);
+		if (IS_ERR(ce)) {
+			free_engines_n(e, id);
+			return ERR_CAST(ce);
+		}
+
+		e->engines[id] = ce;
+	}
+	e->num_engines = id;
+
+	return e;
+}
+
+static void i915_gem_context_free(struct i915_gem_context *ctx)
+{
 	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 	GEM_BUG_ON(!list_empty(&ctx->active_engines));
@@ -253,8 +297,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	release_hw_id(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
 
-	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
-		intel_context_put(it);
+	free_engines(rcu_access_pointer(ctx->engines));
+	mutex_destroy(&ctx->engines_mutex);
 
 	if (ctx->timeline)
 		i915_timeline_put(ctx->timeline);
@@ -363,6 +407,8 @@ static struct i915_gem_context *
 __create_context(struct drm_i915_private *dev_priv)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
+	int err;
 	int i;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -376,8 +422,13 @@ __create_context(struct drm_i915_private *dev_priv)
 	INIT_LIST_HEAD(&ctx->active_engines);
 	mutex_init(&ctx->mutex);
 
-	ctx->hw_contexts = RB_ROOT;
-	spin_lock_init(&ctx->hw_contexts_lock);
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e)) {
+		err = PTR_ERR(e);
+		goto err_free;
+	}
+	RCU_INIT_POINTER(ctx->engines, e);
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -399,6 +450,10 @@ __create_context(struct drm_i915_private *dev_priv)
 		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
 
 	return ctx;
+
+err_free:
+	kfree(ctx);
+	return ERR_PTR(err);
 }
 
 static struct i915_hw_ppgtt *
@@ -862,7 +917,8 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 {
 	struct drm_i915_private *i915 = ctx->i915;
 	struct context_barrier_task *cb;
-	struct intel_context *ce, *next;
+	struct i915_gem_engines *e;
+	unsigned int i;
 	int err = 0;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -875,20 +931,29 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	i915_active_init(i915, &cb->base, cb_retire);
 	i915_active_acquire(&cb->base);
 
-	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
-		struct intel_engine_cs *engine = ce->engine;
+	e = i915_gem_context_engine_list_lock(ctx);
+	for (i = 0; i < e->num_engines; i++) {
+		struct intel_context *ce = e->engines[i];
 		struct i915_request *rq;
 
-		if (!(engine->mask & engines))
+		if (!ce || !(ce->engine->mask & engines))
+			continue;
+
+		if (!intel_context_is_pinned(ce))
 			continue;
 
 		if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-				       engine->mask)) {
+				       ce->engine->mask)) {
 			err = -ENXIO;
 			break;
 		}
 
-		rq = i915_request_alloc(engine, ctx);
+		err = intel_context_pin(ce);
+		if (err)
+			break;
+
+		rq = i915_request_create(ce);
+		intel_context_unpin(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			break;
@@ -904,6 +969,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 		if (err)
 			break;
 	}
+	i915_gem_context_engine_list_unlock(ctx);
 
 	cb->task = err ? NULL : task; /* caller needs to unwind instead */
 	cb->data = data;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 5a8e080499fb..d94fc4c1907c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -176,6 +176,48 @@ static inline void i915_gem_context_put(struct i915_gem_context *ctx)
 	kref_put(&ctx->ref, i915_gem_context_release);
 }
 
+static inline struct i915_gem_engines *
+i915_gem_context_engine_list(struct i915_gem_context *ctx)
+{
+	return rcu_dereference_protected(ctx->engines,
+					 lockdep_is_held(&ctx->engines_mutex));
+}
+
+static inline struct i915_gem_engines *
+i915_gem_context_engine_list_lock(struct i915_gem_context *ctx)
+	__acquires(&ctx->engines_mutex)
+{
+	mutex_lock(&ctx->engines_mutex);
+	return i915_gem_context_engine_list(ctx);
+}
+
+static inline void
+i915_gem_context_engine_list_unlock(struct i915_gem_context *ctx)
+	__releases(&ctx->engines_mutex)
+{
+	mutex_unlock(&ctx->engines_mutex);
+}
+
+static inline struct intel_context *
+i915_gem_context_lookup_engine(struct i915_gem_context *ctx, unsigned long idx)
+{
+	return i915_gem_context_engine_list(ctx)->engines[idx];
+}
+
+static inline struct intel_context *
+i915_gem_context_get_engine(struct i915_gem_context *ctx, unsigned long idx)
+{
+	struct intel_context *ce = ERR_PTR(-EINVAL);
+
+	rcu_read_lock(); {
+		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
+		if (likely(idx < e->num_engines && e->engines[idx]))
+			ce = intel_context_get(e->engines[idx]);
+	} rcu_read_unlock();
+
+	return ce;
+}
+
 struct i915_lut_handle *i915_lut_handle_alloc(void);
 void i915_lut_handle_free(struct i915_lut_handle *lut);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index d282a6ab3b9f..1610a74b0283 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -29,6 +29,13 @@ struct i915_hw_ppgtt;
 struct i915_timeline;
 struct intel_ring;
 
+struct i915_gem_engines {
+	struct rcu_work rcu;
+	struct drm_i915_private *i915;
+	unsigned long num_engines;
+	struct intel_context *engines[];
+};
+
 /**
  * struct i915_gem_context - client state
  *
@@ -42,6 +49,21 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	/**
+	 * @engines: User defined engines for this context
+	 *
+	 * NULL means to use legacy definitions (including random meaning of
+	 * I915_EXEC_BSD with I915_EXEC_BSD_SELECTOR overrides).
+	 *
+	 * If defined, execbuf uses the I915_EXEC_RING_MASK as an index into
+	 * this array, and various other uAPI offer the ability to look up an
+	 * index from this array to select an engine to operate on.
+	 *
+	 * User defined by I915_CONTEXT_PARAM_ENGINE.
+	 */
+	struct i915_gem_engines __rcu *engines;
+	struct mutex engines_mutex; /* guards writes to engines */
+
 	struct i915_timeline *timeline;
 
 	/**
@@ -134,10 +156,6 @@ struct i915_gem_context {
 
 	struct i915_sched_attr sched;
 
-	/** hw_contexts: per-engine logical HW state */
-	struct rb_root hw_contexts;
-	spinlock_t hw_contexts_lock;
-
 	/** ring_size: size for allocating the per-engine ring buffer */
 	u32 ring_size;
 	/** desc_template: invariant fields for the HW context descriptor */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c700cbc2f594..941192c60832 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2077,9 +2077,7 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
 	return file_priv->bsd_engine;
 }
 
-#define I915_USER_RINGS (4)
-
-static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
+static const enum intel_engine_id user_ring_map[] = {
 	[I915_EXEC_DEFAULT]	= RCS0,
 	[I915_EXEC_RENDER]	= RCS0,
 	[I915_EXEC_BLT]		= BCS0,
@@ -2087,10 +2085,8 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static int eb_pin_context(struct i915_execbuffer *eb,
-			  struct intel_engine_cs *engine)
+static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce)
 {
-	struct intel_context *ce;
 	int err;
 
 	/*
@@ -2101,21 +2097,16 @@ static int eb_pin_context(struct i915_execbuffer *eb,
 	if (err)
 		return err;
 
-	ce = intel_context_instance(eb->gem_context, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
-
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
 	err = intel_context_pin(ce);
-	intel_context_put(ce);
 	if (err)
 		return err;
 
-	eb->engine = engine;
+	eb->engine = ce->engine;
 	eb->context = ce;
 	return 0;
 }
@@ -2125,24 +2116,18 @@ static void eb_unpin_context(struct i915_execbuffer *eb)
 	intel_context_unpin(eb->context);
 }
 
-static int
-eb_select_engine(struct i915_execbuffer *eb,
-		 struct drm_file *file,
-		 struct drm_i915_gem_execbuffer2 *args)
+static unsigned int
+eb_select_legacy_ring(struct i915_execbuffer *eb,
+		      struct drm_file *file,
+		      struct drm_i915_gem_execbuffer2 *args)
 {
 	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
-	struct intel_engine_cs *engine;
-
-	if (user_ring_id > I915_USER_RINGS) {
-		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
-		return -EINVAL;
-	}
 
-	if ((user_ring_id != I915_EXEC_BSD) &&
-	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
+	if (user_ring_id != I915_EXEC_BSD &&
+	    (args->flags & I915_EXEC_BSD_MASK)) {
 		DRM_DEBUG("execbuf with non bsd ring but with invalid "
 			  "bsd dispatch flags: %d\n", (int)(args->flags));
-		return -EINVAL;
+		return -1;
 	}
 
 	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
@@ -2157,20 +2142,39 @@ eb_select_engine(struct i915_execbuffer *eb,
 		} else {
 			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
 				  bsd_idx);
-			return -EINVAL;
+			return -1;
 		}
 
-		engine = eb->i915->engine[_VCS(bsd_idx)];
-	} else {
-		engine = eb->i915->engine[user_ring_map[user_ring_id]];
+		return _VCS(bsd_idx);
 	}
 
-	if (!engine) {
-		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
-		return -EINVAL;
+	if (user_ring_id >= ARRAY_SIZE(user_ring_map)) {
+		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
+		return -1;
 	}
 
-	return eb_pin_context(eb, engine);
+	return user_ring_map[user_ring_id];
+}
+
+static int
+eb_select_engine(struct i915_execbuffer *eb,
+		 struct drm_file *file,
+		 struct drm_i915_gem_execbuffer2 *args)
+{
+	struct intel_context *ce;
+	unsigned int idx;
+	int err;
+
+	idx = eb_select_legacy_ring(eb, file, args);
+
+	ce = i915_gem_context_get_engine(eb->gem_context, idx);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	err = eb_pin_context(eb, ce);
+	intel_context_put(ce);
+
+	return err;
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index afaeabe5e531..304e597e19e6 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1203,35 +1203,37 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 					    struct i915_gem_context *ctx)
 {
-	struct intel_engine_cs *engine = i915->engine[RCS0];
-	struct intel_context *ce;
+	struct i915_gem_engines *e;
+	unsigned int i;
 	int err;
 
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
-
 	err = i915_mutex_lock_interruptible(&i915->drm);
-	if (err) {
-		intel_context_put(ce);
+	if (err)
 		return ERR_PTR(err);
-	}
 
-	/*
-	 * As the ID is the gtt offset of the context's vma we
-	 * pin the vma to ensure the ID remains fixed.
-	 *
-	 * NB: implied RCS engine...
-	 */
-	err = intel_context_pin(ce);
+	e = i915_gem_context_engine_list_lock(ctx);
+	for (i = 0; i < e->num_engines; i++) {
+		struct intel_context *ce = e->engines[i];
+
+		if (ce->engine->class != RENDER_CLASS)
+			continue;
+
+		/*
+		 * As the ID is the gtt offset of the context's vma we
+		 * pin the vma to ensure the ID remains fixed.
+		 */
+		err = intel_context_pin(ce);
+		if (err == 0) {
+			i915->perf.oa.pinned_ctx = ce;
+			break;
+		}
+	}
+	i915_gem_context_engine_list_unlock(ctx);
 	mutex_unlock(&i915->drm.struct_mutex);
-	intel_context_put(ce);
 	if (err)
 		return ERR_PTR(err);
 
-	i915->perf.oa.pinned_ctx = ce;
-
-	return ce;
+	return i915->perf.oa.pinned_ctx;
 }
 
 /**
@@ -1717,10 +1719,10 @@ gen8_update_reg_state_unlocked(struct intel_context *ce,
 static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 				       const struct i915_oa_config *oa_config)
 {
-	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
 	unsigned int map_type = i915_coherent_map_type(dev_priv);
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
+	unsigned int i;
 	int ret;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
@@ -1746,30 +1748,41 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
-		struct intel_context *ce = intel_context_lookup(ctx, engine);
-		u32 *regs;
+		struct i915_gem_engines *e =
+			i915_gem_context_engine_list_lock(ctx);
 
-		/* OA settings will be set upon first use */
-		if (!ce || !ce->state)
-			continue;
+		for (i = 0; i < e->num_engines; i++) {
+			struct intel_context *ce = e->engines[i];
+			u32 *regs;
+
+			if (!ce || ce->engine->class != RENDER_CLASS)
+				continue;
+
+			/* OA settings will be set upon first use */
+			if (!ce->state)
+				continue;
 
-		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
-		if (IS_ERR(regs))
-			return PTR_ERR(regs);
+			regs = i915_gem_object_pin_map(ce->state->obj,
+						       map_type);
+			if (IS_ERR(regs))
+				return PTR_ERR(regs);
 
-		ce->state->obj->mm.dirty = true;
-		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
+			ce->state->obj->mm.dirty = true;
+			regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
 
-		gen8_update_reg_state_unlocked(ce, regs, oa_config);
+			gen8_update_reg_state_unlocked(ce, regs, oa_config);
+
+			i915_gem_object_unpin_map(ce->state->obj);
+		}
 
-		i915_gem_object_unpin_map(ce->state->obj);
+		i915_gem_context_engine_list_unlock(ctx);
 	}
 
 	/*
 	 * Apply the configuration by doing one context restore of the edited
 	 * context image.
 	 */
-	rq = i915_request_create(engine->kernel_context);
+	rq = i915_request_create(dev_priv->engine[RCS0]->kernel_context);
 	if (IS_ERR(rq))
 		return PTR_ERR(rq);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 8abd891d9287..5e52c1623c17 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -767,7 +767,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_instance(ctx, engine);
+	ce = i915_gem_context_get_engine(ctx, engine->id);
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 7fbfcb3d63e0..ae230b38138a 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -364,11 +364,10 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
 static void guc_stage_desc_init(struct intel_guc_client *client)
 {
 	struct intel_guc *guc = client->guc;
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx = client->owner;
+	struct i915_gem_engines *e;
 	struct guc_stage_desc *desc;
-	unsigned int tmp;
+	unsigned int i;
 	u32 gfx_addr;
 
 	desc = __get_stage_desc(client);
@@ -382,10 +381,13 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
 
-	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-		struct intel_context *ce = intel_context_lookup(ctx, engine);
-		u32 guc_engine_id = engine->guc_id;
-		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
+	e = i915_gem_context_engine_list_lock(ctx);
+	for (i = 0; i < e->num_engines; i++) {
+		struct intel_context *ce = e->engines[i];
+		struct guc_execlist_context *lrc;
+
+		if (!ce || !(ce->engine->mask & client->engines))
+			continue;
 
 		/* TODO: We have a design issue to be solved here. Only when we
 		 * receive the first batch, we know which engine is used by the
@@ -394,7 +396,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * for now who owns a GuC client. But for future owner of GuC
 		 * client, need to make sure lrc is pinned prior to enter here.
 		 */
-		if (!ce || !ce->state)
+		if (!ce->state)
 			break;	/* XXX: continue? */
 
 		/*
@@ -404,6 +406,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * Instead, the GuC uses the LRCA of the user mode context (see
 		 * guc_add_request below).
 		 */
+		lrc = &desc->lrc[ce->engine->guc_id];
 		lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
 		/* The state page is after PPHWSP */
@@ -414,15 +417,16 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * here. In proxy submission, it wants the stage id
 		 */
 		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
-				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
+				(ce->engine->guc_id << GUC_ELC_ENGINE_OFFSET);
 
 		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
 		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
 		lrc->ring_next_free_location = lrc->ring_begin;
 		lrc->ring_current_tail_pointer_value = 0;
 
-		desc->engines_used |= (1 << guc_engine_id);
+		desc->engines_used |= BIT(ce->engine->guc_id);
 	}
+	i915_gem_context_engine_list_unlock(ctx);
 
 	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
 			 client->engines, desc->engines_used);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 214d1fd2f4dc..7fd224a4ca4c 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1094,7 +1094,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	wakeref = intel_runtime_pm_get(i915);
 
-	ce = intel_context_instance(ctx, i915->engine[RCS0]);
+	ce = i915_gem_context_get_engine(ctx, RCS0);
 	if (IS_ERR(ce)) {
 		ret = PTR_ERR(ce);
 		goto out_rpm;
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index 0426093bf1d9..71c750693585 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -30,6 +30,7 @@ mock_context(struct drm_i915_private *i915,
 	     const char *name)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
 	int ret;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -40,8 +41,11 @@ mock_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->link);
 	ctx->i915 = i915;
 
-	ctx->hw_contexts = RB_ROOT;
-	spin_lock_init(&ctx->hw_contexts_lock);
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e))
+		goto err_free;
+	RCU_INIT_POINTER(ctx->engines, e);
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -51,7 +55,7 @@ mock_context(struct drm_i915_private *i915,
 
 	ret = i915_gem_context_pin_hw_id(ctx);
 	if (ret < 0)
-		goto err_handles;
+		goto err_engines;
 
 	if (name) {
 		struct i915_hw_ppgtt *ppgtt;
@@ -69,7 +73,9 @@ mock_context(struct drm_i915_private *i915,
 
 	return ctx;
 
-err_handles:
+err_engines:
+	free_engines(rcu_access_pointer(ctx->engines));
+err_free:
 	kfree(ctx);
 	return NULL;
 
-- 
2.20.1

* [PATCH 22/29] drm/i915: Remove intel_context.active_link
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

We no longer need to track the active intel_contexts within each engine,
allowing us to drop a tricky mutex_lock from inside unpin (which may
occur inside fs_reclaim, where taking ctx->mutex risks an inversion
against allocations performed while that mutex is held).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c       | 11 +-------
 drivers/gpu/drm/i915/gt/intel_context_types.h |  1 -
 drivers/gpu/drm/i915/i915_debugfs.c           | 25 ++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_context.c       |  2 --
 drivers/gpu/drm/i915/i915_gem_context_types.h |  1 -
 .../gpu/drm/i915/selftests/i915_gem_context.c |  1 -
 drivers/gpu/drm/i915/selftests/mock_context.c |  1 -
 7 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 803ac9a1dd15..236db0cea59b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -49,7 +49,6 @@ int __intel_context_do_pin(struct intel_context *ce)
 		return -EINTR;
 
 	if (likely(!atomic_read(&ce->pin_count))) {
-		struct i915_gem_context *ctx = ce->gem_context;
 		intel_wakeref_t wakeref;
 
 		err = 0;
@@ -58,11 +57,7 @@ int __intel_context_do_pin(struct intel_context *ce)
 		if (err)
 			goto err;
 
-		i915_gem_context_get(ctx);
-
-		mutex_lock(&ctx->mutex);
-		list_add(&ce->active_link, &ctx->active_engines);
-		mutex_unlock(&ctx->mutex);
+		i915_gem_context_get(ce->gem_context); /* for ctx->ppgtt */
 
 		intel_context_get(ce);
 		smp_mb__before_atomic(); /* flush pin before it is visible */
@@ -91,10 +86,6 @@ void intel_context_unpin(struct intel_context *ce)
 	if (likely(atomic_dec_and_test(&ce->pin_count))) {
 		ce->ops->unpin(ce);
 
-		mutex_lock(&ce->gem_context->mutex);
-		list_del(&ce->active_link);
-		mutex_unlock(&ce->gem_context->mutex);
-
 		i915_gem_context_put(ce->gem_context);
 		intel_context_put(ce);
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 3579c2708321..d5a7dbd0daee 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -38,7 +38,6 @@ struct intel_context {
 	struct intel_engine_cs *engine;
 	struct intel_engine_cs *active;
 
-	struct list_head active_link;
 	struct list_head signal_link;
 	struct list_head signals;
 
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8ac8a67e7b81..848703b1f8ea 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -34,6 +34,7 @@
 
 #include "gt/intel_reset.h"
 
+#include "i915_gem_context.h"
 #include "intel_dp.h"
 #include "intel_drv.h"
 #include "intel_fbc.h"
@@ -396,14 +397,22 @@ static void print_context_stats(struct seq_file *m,
 	struct i915_gem_context *ctx;
 
 	list_for_each_entry(ctx, &i915->contexts.list, link) {
-		struct intel_context *ce;
+		struct i915_gem_engines *e;
+		unsigned int n;
+
+		e = i915_gem_context_engine_list_lock(ctx);
+		for (n = 0; n < e->num_engines; n++) {
+			struct intel_context *ce = e->engines[n];
+
+			if (!ce)
+				continue;
 
-		list_for_each_entry(ce, &ctx->active_engines, active_link) {
 			if (ce->state)
 				per_file_stats(0, ce->state->obj, &kstats);
 			if (ce->ring)
 				per_file_stats(0, ce->ring->vma->obj, &kstats);
 		}
+		i915_gem_context_engine_list_unlock(ctx);
 
 		if (!IS_ERR_OR_NULL(ctx->file_priv)) {
 			struct file_stats stats = { .vm = &ctx->ppgtt->vm, };
@@ -1893,7 +1902,8 @@ static int i915_context_status(struct seq_file *m, void *unused)
 		return ret;
 
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
-		struct intel_context *ce;
+		struct i915_gem_engines *e;
+		unsigned int n;
 
 		seq_puts(m, "HW context ");
 		if (!list_empty(&ctx->hw_id_link))
@@ -1917,7 +1927,13 @@ static int i915_context_status(struct seq_file *m, void *unused)
 		seq_putc(m, ctx->remap_slice ? 'R' : 'r');
 		seq_putc(m, '\n');
 
-		list_for_each_entry(ce, &ctx->active_engines, active_link) {
+		e = i915_gem_context_engine_list_lock(ctx);
+		for (n = 0; n < e->num_engines; n++) {
+			struct intel_context *ce = e->engines[n];
+
+			if (!ce)
+				continue;
+
 			seq_printf(m, "%s: ", ce->engine->name);
 			if (ce->state)
 				describe_obj(m, ce->state->obj);
@@ -1925,6 +1941,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 				describe_ctx_ring(m, ce->ring);
 			seq_putc(m, '\n');
 		}
+		i915_gem_context_engine_list_unlock(ctx);
 
 		seq_putc(m, '\n');
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 517816558c85..c05f949ed893 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -292,7 +292,6 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 {
 	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
-	GEM_BUG_ON(!list_empty(&ctx->active_engines));
 
 	release_hw_id(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
@@ -419,7 +418,6 @@ __create_context(struct drm_i915_private *dev_priv)
 	list_add_tail(&ctx->link, &dev_priv->contexts.list);
 	ctx->i915 = dev_priv;
 	ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
-	INIT_LIST_HEAD(&ctx->active_engines);
 	mutex_init(&ctx->mutex);
 
 	mutex_init(&ctx->engines_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index 1610a74b0283..ccb64f170e49 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -151,7 +151,6 @@ struct i915_gem_context {
 	atomic_t hw_id_pin_count;
 	struct list_head hw_id_link;
 
-	struct list_head active_engines;
 	struct mutex mutex;
 
 	struct i915_sched_attr sched;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 7fd224a4ca4c..deedd1898fe5 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1675,7 +1675,6 @@ static int mock_context_barrier(void *arg)
 		goto out;
 	}
 	i915_request_add(rq);
-	GEM_BUG_ON(list_empty(&ctx->active_engines));
 
 	counter = 0;
 	context_barrier_inject_fault = BIT(RCS0);
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index 71c750693585..10e67c931ed1 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -50,7 +50,6 @@ mock_context(struct drm_i915_private *i915,
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
 	INIT_LIST_HEAD(&ctx->hw_id_link);
-	INIT_LIST_HEAD(&ctx->active_engines);
 	mutex_init(&ctx->mutex);
 
 	ret = i915_gem_context_pin_hw_id(ctx);
-- 
2.20.1

* [PATCH 23/29] drm/i915: Move i915_request_alloc into selftests/
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Having transitioned GEM over to using intel_context as its primary means
of tracking the combined GEM context and engine, and to using
i915_request_create(), we can move the older i915_request_alloc()
helper function into selftests/.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile                 |  1 +
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  9 ++--
 drivers/gpu/drm/i915/gt/selftest_lrc.c        | 13 ++---
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  5 +-
 drivers/gpu/drm/i915/i915_request.c           | 47 -------------------
 drivers/gpu/drm/i915/i915_request.h           |  3 --
 drivers/gpu/drm/i915/selftests/huge_pages.c   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c | 13 ++---
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |  4 +-
 .../gpu/drm/i915/selftests/igt_gem_utils.c    | 43 +++++++++++++++++
 .../gpu/drm/i915/selftests/igt_gem_utils.h    | 17 +++++++
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  3 +-
 drivers/gpu/drm/i915/selftests/mock_request.c |  3 +-
 15 files changed, 96 insertions(+), 76 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/selftests/igt_gem_utils.c
 create mode 100644 drivers/gpu/drm/i915/selftests/igt_gem_utils.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index dd8d923aa1c6..58643373495c 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -193,6 +193,7 @@ i915-$(CONFIG_DRM_I915_SELFTEST) += \
 	selftests/i915_random.o \
 	selftests/i915_selftest.o \
 	selftests/igt_flush_test.o \
+	selftests/igt_gem_utils.o \
 	selftests/igt_live_test.o \
 	selftests/igt_reset.o \
 	selftests/igt_spinner.o
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 6004d6907e9c..c206748629dc 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -29,6 +29,7 @@
 #include "i915_selftest.h"
 #include "selftests/i915_random.h"
 #include "selftests/igt_flush_test.h"
+#include "selftests/igt_gem_utils.h"
 #include "selftests/igt_reset.h"
 #include "selftests/igt_wedge_me.h"
 
@@ -175,7 +176,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 	if (err)
 		goto unpin_vma;
 
-	rq = i915_request_alloc(engine, h->ctx);
+	rq = igt_request_alloc(h->ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto unpin_hws;
@@ -455,7 +456,7 @@ static int igt_reset_nop(void *arg)
 			for (i = 0; i < 16; i++) {
 				struct i915_request *rq;
 
-				rq = i915_request_alloc(engine, ctx);
+				rq = igt_request_alloc(ctx, engine);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					break;
@@ -554,7 +555,7 @@ static int igt_reset_nop_engine(void *arg)
 			for (i = 0; i < 16; i++) {
 				struct i915_request *rq;
 
-				rq = i915_request_alloc(engine, ctx);
+				rq = igt_request_alloc(ctx, engine);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					break;
@@ -800,7 +801,7 @@ static int active_engine(void *data)
 		struct i915_request *new;
 
 		mutex_lock(&engine->i915->drm.struct_mutex);
-		new = i915_request_alloc(engine, ctx[idx]);
+		new = igt_request_alloc(ctx[idx], engine);
 		if (IS_ERR(new)) {
 			mutex_unlock(&engine->i915->drm.struct_mutex);
 			err = PTR_ERR(new);
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index cd0551f97c2f..84538f69185b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -10,6 +10,7 @@
 #include "i915_selftest.h"
 #include "selftests/i915_random.h"
 #include "selftests/igt_flush_test.h"
+#include "selftests/igt_gem_utils.h"
 #include "selftests/igt_live_test.h"
 #include "selftests/igt_spinner.h"
 #include "selftests/mock_context.h"
@@ -148,7 +149,7 @@ static int live_busywait_preempt(void *arg)
 		 * fails, we hang instead.
 		 */
 
-		lo = i915_request_alloc(engine, ctx_lo);
+		lo = igt_request_alloc(ctx_lo, engine);
 		if (IS_ERR(lo)) {
 			err = PTR_ERR(lo);
 			goto err_vma;
@@ -192,7 +193,7 @@ static int live_busywait_preempt(void *arg)
 			goto err_vma;
 		}
 
-		hi = i915_request_alloc(engine, ctx_hi);
+		hi = igt_request_alloc(ctx_hi, engine);
 		if (IS_ERR(hi)) {
 			err = PTR_ERR(hi);
 			goto err_vma;
@@ -857,13 +858,13 @@ static int live_chain_preempt(void *arg)
 			i915_request_add(rq);
 
 			for (i = 0; i < count; i++) {
-				rq = i915_request_alloc(engine, lo.ctx);
+				rq = igt_request_alloc(lo.ctx, engine);
 				if (IS_ERR(rq))
 					goto err_wedged;
 				i915_request_add(rq);
 			}
 
-			rq = i915_request_alloc(engine, hi.ctx);
+			rq = igt_request_alloc(hi.ctx, engine);
 			if (IS_ERR(rq))
 				goto err_wedged;
 			i915_request_add(rq);
@@ -882,7 +883,7 @@ static int live_chain_preempt(void *arg)
 			}
 			igt_spinner_end(&lo.spin);
 
-			rq = i915_request_alloc(engine, lo.ctx);
+			rq = igt_request_alloc(lo.ctx, engine);
 			if (IS_ERR(rq))
 				goto err_wedged;
 			i915_request_add(rq);
@@ -1087,7 +1088,7 @@ static int smoke_submit(struct preempt_smoke *smoke,
 
 	ctx->sched.priority = prio;
 
-	rq = i915_request_alloc(smoke->engine, ctx);
+	rq = igt_request_alloc(ctx, smoke->engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto unpin;
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 028942cac4b4..02511b532b41 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -8,6 +8,7 @@
 #include "intel_reset.h"
 
 #include "selftests/igt_flush_test.h"
+#include "selftests/igt_gem_utils.h"
 #include "selftests/igt_reset.h"
 #include "selftests/igt_spinner.h"
 #include "selftests/igt_wedge_me.h"
@@ -102,7 +103,7 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 	if (err)
 		goto err_obj;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_pin;
@@ -554,7 +555,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 		i915_gem_object_unpin_map(batch->obj);
 		i915_gem_chipset_flush(ctx->i915);
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = igt_request_alloc(ctx, engine);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
 			goto out_batch;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 5e52c1623c17..67360107d61d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -738,53 +738,6 @@ i915_request_create(struct intel_context *ce)
 	return rq;
 }
 
-/**
- * i915_request_alloc - allocate a request structure
- *
- * @engine: engine that we wish to issue the request on.
- * @ctx: context that the request will be associated with.
- *
- * Returns a pointer to the allocated request if successful,
- * or an error code if not.
- */
-struct i915_request *
-i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
-{
-	struct drm_i915_private *i915 = engine->i915;
-	struct intel_context *ce;
-	struct i915_request *rq;
-	int err;
-
-	/*
-	 * Preempt contexts are reserved for exclusive use to inject a
-	 * preemption context switch. They are never to be used for any trivial
-	 * request!
-	 */
-	GEM_BUG_ON(ctx == i915->preempt_context);
-
-	/*
-	 * Pinning the contexts may generate requests in order to acquire
-	 * GGTT space, so do this first before we reserve a seqno for
-	 * ourselves.
-	 */
-	ce = i915_gem_context_get_engine(ctx, engine->id);
-	if (IS_ERR(ce))
-		return ERR_CAST(ce);
-
-	err = intel_context_pin(ce);
-	if (err) {
-		rq = ERR_PTR(err);
-		goto err_put;
-	}
-
-	rq = i915_request_create(ce);
-	intel_context_unpin(ce);
-
-err_put:
-	intel_context_put(ce);
-	return rq;
-}
-
 static int
 emit_semaphore_wait(struct i915_request *to,
 		    struct i915_request *from,
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index c0b7539daaf5..ea7a37ad63e2 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -245,9 +245,6 @@ i915_request_create(struct intel_context *ce);
 
 struct i915_request *__i915_request_commit(struct i915_request *request);
 
-struct i915_request * __must_check
-i915_request_alloc(struct intel_engine_cs *engine,
-		   struct i915_gem_context *ctx);
 void i915_request_retire_upto(struct i915_request *rq);
 
 static inline struct i915_request *
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 90721b54e7ae..1e1f83326a96 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -26,6 +26,7 @@
 
 #include <linux/prime_numbers.h>
 
+#include "igt_gem_utils.h"
 #include "mock_drm.h"
 #include "i915_random.h"
 
@@ -980,7 +981,7 @@ static int gpu_write(struct i915_vma *vma,
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_batch;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index 7460b77b34e0..3f4033c0e8e2 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -8,8 +8,9 @@
 
 #include "../i915_selftest.h"
 
-#include "mock_context.h"
+#include "igt_gem_utils.h"
 #include "igt_flush_test.h"
+#include "mock_context.h"
 
 static int switch_to_context(struct drm_i915_private *i915,
 			     struct i915_gem_context *ctx)
@@ -20,7 +21,7 @@ static int switch_to_context(struct drm_i915_private *i915,
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = igt_request_alloc(ctx, engine);
 		if (IS_ERR(rq))
 			return PTR_ERR(rq);
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index deedd1898fe5..b62f005e4d50 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -29,6 +29,7 @@
 
 #include "i915_random.h"
 #include "igt_flush_test.h"
+#include "igt_gem_utils.h"
 #include "igt_live_test.h"
 #include "igt_reset.h"
 #include "igt_spinner.h"
@@ -91,7 +92,7 @@ static int live_nop_switch(void *arg)
 
 		times[0] = ktime_get_raw();
 		for (n = 0; n < nctx; n++) {
-			rq = i915_request_alloc(engine, ctx[n]);
+			rq = igt_request_alloc(ctx[n], engine);
 			if (IS_ERR(rq)) {
 				err = PTR_ERR(rq);
 				goto out_unlock;
@@ -121,7 +122,7 @@ static int live_nop_switch(void *arg)
 			times[1] = ktime_get_raw();
 
 			for (n = 0; n < prime; n++) {
-				rq = i915_request_alloc(engine, ctx[n % nctx]);
+				rq = igt_request_alloc(ctx[n % nctx], engine);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
 					goto out_unlock;
@@ -301,7 +302,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 		goto err_vma;
 	}
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_batch;
@@ -1350,7 +1351,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
 	if (err)
 		goto err_unpin;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -1445,7 +1446,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 	if (err)
 		goto err_unpin;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_unpin;
@@ -1669,7 +1670,7 @@ static int mock_context_barrier(void *arg)
 		goto out;
 	}
 
-	rq = i915_request_alloc(i915->engine[RCS0], ctx);
+	rq = igt_request_alloc(ctx, i915->engine[RCS0]);
 	if (IS_ERR(rq)) {
 		pr_err("Request allocation failed!\n");
 		goto out;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 89766688e420..4fc6e5445dd1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -24,6 +24,7 @@
 
 #include "../i915_selftest.h"
 
+#include "igt_gem_utils.h"
 #include "lib_sw_fence.h"
 #include "mock_context.h"
 #include "mock_drm.h"
@@ -460,7 +461,7 @@ static int igt_evict_contexts(void *arg)
 
 			/* We will need some GGTT space for the rq's context */
 			igt_evict_ctl.fail_if_busy = true;
-			rq = i915_request_alloc(engine, ctx);
+			rq = igt_request_alloc(ctx, engine);
 			igt_evict_ctl.fail_if_busy = false;
 
 			if (IS_ERR(rq)) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 098d7b3aa131..b60591531e4a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -267,7 +267,7 @@ static struct i915_request *
 __live_request_alloc(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine)
 {
-	return i915_request_alloc(engine, ctx);
+	return igt_request_alloc(ctx, engine);
 }
 
 static int __igt_breadcrumbs_smoketest(void *arg)
@@ -1074,7 +1074,7 @@ max_batches(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 	if (HAS_EXECLISTS(ctx->i915))
 		return INT_MAX;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		ret = PTR_ERR(rq);
 	} else {
diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
new file mode 100644
index 000000000000..23299aa80ac4
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
@@ -0,0 +1,43 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#include "igt_gem_utils.h"
+
+#include "gt/intel_context.h"
+
+#include "../i915_gem_context.h"
+#include "../i915_gem_pm.h"
+#include "../i915_request.h"
+
+struct i915_request *
+igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
+{
+	struct intel_context *ce;
+	struct i915_request *rq;
+	int err;
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	ce = i915_gem_context_get_engine(ctx, engine->id);
+	if (IS_ERR(ce))
+		return ERR_CAST(ce);
+
+	err = intel_context_pin(ce);
+	if (err) {
+		rq = ERR_PTR(err);
+		goto err_put;
+	}
+
+	rq = i915_request_create(ce);
+	intel_context_unpin(ce);
+
+err_put:
+	intel_context_put(ce);
+	return rq;
+}
diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/selftests/igt_gem_utils.h
new file mode 100644
index 000000000000..0f17251cf75d
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.h
@@ -0,0 +1,17 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#ifndef __IGT_GEM_UTILS_H__
+#define __IGT_GEM_UTILS_H__
+
+struct i915_request;
+struct i915_gem_context;
+struct intel_engine_cs;
+
+struct i915_request *
+igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
+
+#endif /* __IGT_GEM_UTILS_H__ */
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 16890dfe74c0..ece8a8a0d3b0 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -4,6 +4,7 @@
  * Copyright © 2018 Intel Corporation
  */
 
+#include "igt_gem_utils.h"
 #include "igt_spinner.h"
 
 int igt_spinner_init(struct igt_spinner *spin, struct drm_i915_private *i915)
@@ -114,7 +115,7 @@ igt_spinner_create_request(struct igt_spinner *spin,
 	if (err)
 		goto unpin_vma;
 
-	rq = i915_request_alloc(engine, ctx);
+	rq = igt_request_alloc(ctx, engine);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto unpin_hws;
diff --git a/drivers/gpu/drm/i915/selftests/mock_request.c b/drivers/gpu/drm/i915/selftests/mock_request.c
index f739ba63057f..b99f7576153c 100644
--- a/drivers/gpu/drm/i915/selftests/mock_request.c
+++ b/drivers/gpu/drm/i915/selftests/mock_request.c
@@ -24,6 +24,7 @@
 
 #include "gt/mock_engine.h"
 
+#include "igt_gem_utils.h"
 #include "mock_request.h"
 
 struct i915_request *
@@ -34,7 +35,7 @@ mock_request(struct intel_engine_cs *engine,
 	struct i915_request *request;
 
 	/* NB the i915->requests slab cache is enlarged to fit mock_request */
-	request = i915_request_alloc(engine, context);
+	request = igt_request_alloc(context, engine);
 	if (IS_ERR(request))
 		return NULL;
 
-- 
2.20.1

* [PATCH 24/29] drm/i915: Allow multiple user handles to the same VM
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

It was noted that we made the same mistake for VM_ID as for object
handles, whereby we ensured that we only ever allocated a single handle
for each ppgtt. This has the unfortunate consequence that userspace must
reference count its handles to avoid destroying an active ID. If we
instead allow multiple handles to the same ppgtt, userspace can freely
unreference any handle it owns without fear of destroying the same
handle in use elsewhere.
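
As an illustration of the new behaviour (a sketch only: it assumes
igt-style wrappers such as gem_context_get_param() and gem_vm_destroy(),
and the VM param itself is only exposed to userspace by a later patch in
this series):

	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx,
		.param = I915_CONTEXT_PARAM_VM,
	};
	uint32_t vm_a, vm_b;

	gem_context_get_param(fd, &p);	/* allocates a fresh handle */
	vm_a = p.value;
	gem_context_get_param(fd, &p);	/* and a second, independent one */
	vm_b = p.value;

	/* vm_a != vm_b, yet both name the same ppgtt */
	gem_vm_destroy(fd, vm_a);	/* vm_b remains valid */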

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 26 ++++++++-----------------
 drivers/gpu/drm/i915/i915_gem_gtt.h     |  2 --
 2 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c05f949ed893..2a30cab7a65d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -837,8 +837,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (err < 0)
 		goto err_unlock;
 
-	GEM_BUG_ON(err == 0); /* reserved for default/unassigned ppgtt */
-	ppgtt->user_handle = err;
+	GEM_BUG_ON(err == 0); /* reserved for invalid/unassigned ppgtt */
 
 	mutex_unlock(&file_priv->vm_idr_lock);
 
@@ -876,10 +875,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
 		return err;
 
 	ppgtt = idr_remove(&file_priv->vm_idr, id);
-	if (ppgtt) {
-		GEM_BUG_ON(ppgtt->user_handle != id);
-		ppgtt->user_handle = 0;
-	}
 
 	mutex_unlock(&file_priv->vm_idr_lock);
 	if (!ppgtt)
@@ -1001,18 +996,15 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
 	if (ret)
 		goto err_put;
 
-	if (!ppgtt->user_handle) {
-		ret = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
-		GEM_BUG_ON(!ret);
-		if (ret < 0)
-			goto err_unlock;
+	ret = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
+	GEM_BUG_ON(!ret);
+	if (ret < 0)
+		goto err_unlock;
 
-		ppgtt->user_handle = ret;
-		i915_ppgtt_get(ppgtt);
-	}
+	i915_ppgtt_get(ppgtt);
 
 	args->size = 0;
-	args->value = ppgtt->user_handle;
+	args->value = ret;
 
 	ret = 0;
 err_unlock:
@@ -1103,10 +1095,8 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
 		return err;
 
 	ppgtt = idr_find(&file_priv->vm_idr, args->value);
-	if (ppgtt) {
-		GEM_BUG_ON(ppgtt->user_handle != args->value);
+	if (ppgtt)
 		i915_ppgtt_get(ppgtt);
-	}
 	mutex_unlock(&file_priv->vm_idr_lock);
 	if (!ppgtt)
 		return -ENOENT;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index c8d96e91f3dc..4832bb5c5fc0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -396,8 +396,6 @@ struct i915_hw_ppgtt {
 		struct i915_page_directory_pointer pdp;	/* GEN8+ */
 		struct i915_page_directory pd;		/* GEN6-7 */
 	};
-
-	u32 user_handle;
 };
 
 struct gen6_hw_ppgtt {
-- 
2.20.1

* [PATCH 25/29] drm/i915: Restore control over ppgtt for context creation ABI
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Having hidden the partially exposed new ABI from the PR, put it back
again to complete context recovery. A significant part of context
recovery is the ability to reuse as much of the old context as is
feasible (to avoid expensive reconstruction). The biggest chunk kept
hidden at the moment is fine-grained control over the ctx->ppgtt (the
GPU page tables and associated translation tables and kernel maps), so
make control over the ctx->ppgtt explicit.

This allows userspace to create and share virtual memory address spaces
(within the limits of a single fd) between contexts it owns, along with
the ability to query the contexts for the vm state.
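
A sketch of the intended flow, with error handling omitted and assuming
the drm_i915_gem_vm_control layout (extensions, flags, vm_id) introduced
earlier in the series:

	struct drm_i915_gem_vm_control vm = {};
	struct drm_i915_gem_context_param p = {
		.param = I915_CONTEXT_PARAM_VM,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &vm);

	/* point both contexts at the same address space */
	p.value = vm.vm_id;
	p.ctx_id = ctx_a;
	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
	p.ctx_id = ctx_b;
	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);

	/* the contexts keep their own references; drop our handle */
	ioctl(fd, DRM_IOCTL_I915_GEM_VM_DESTROY, &vm);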

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c         |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c |  5 -----
 include/uapi/drm/i915_drm.h             | 15 +++++++++++++++
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c8cb70d4fe91..ea9b4eb3bf9b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3150,6 +3150,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2a30cab7a65d..0b4d7c8056b2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -98,7 +98,6 @@
 #include "i915_user_extensions.h"
 
 #define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)
-#define I915_CONTEXT_PARAM_VM 0x9
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
@@ -979,8 +978,6 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
 	struct i915_hw_ppgtt *ppgtt;
 	int ret;
 
-	return -EINVAL; /* nothing to see here; please move along */
-
 	if (!ctx->ppgtt)
 		return -ENODEV;
 
@@ -1079,8 +1076,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
 	struct i915_hw_ppgtt *ppgtt, *old;
 	int err;
 
-	return -EINVAL; /* nothing to see here; please move along */
-
 	if (args->size)
 		return -EINVAL;
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 52051d24d89d..d45b79746fc4 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -343,6 +343,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
 #define DRM_I915_QUERY			0x39
+#define DRM_I915_GEM_VM_CREATE		0x3a
+#define DRM_I915_GEM_VM_DESTROY		0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -403,6 +405,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1495,6 +1499,17 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE	0x8
+
+	/*
+	 * The id of the associated virtual memory address space (ppGTT) of
+	 * this context. Can be retrieved and passed to another context
+	 * (on the same fd) for both to use the same ppGTT and so share
+	 * address layouts, and avoid reloading the page tables on context
+	 * switches between themselves.
+	 *
+	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+	 */
+#define I915_CONTEXT_PARAM_VM		0x9
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
-- 
2.20.1

* [PATCH 26/29] drm/i915: Allow a context to define its set of engines
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines, which may be sparse and
even heterogeneous within a class (not all video decoders created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add the challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, with the order
fully controlled by the user via (class, instance). With such precise
control in context setup, we can continue to use the existing execbuf
uABI of specifying a single index; only now it doesn't automagically map
onto the engines, it uses the user-defined engine map from the context.

The I915_EXEC_DEFAULT slot is left empty, and invalid for use by
execbuf. Its use will be revealed in the next patch.
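
For example, restricting a context to the render engine, a gap to be
filled in later, and the second video decoder might look like the
following sketch (error handling omitted):

	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 3) = {
		.class_instance = {
			{ I915_ENGINE_CLASS_RENDER, 0 },
			{ I915_ENGINE_CLASS_INVALID,
			  I915_ENGINE_CLASS_INVALID_NONE },
			{ I915_ENGINE_CLASS_VIDEO, 1 },
		},
	};
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.value = (uintptr_t)&engines,
		.size = sizeof(engines),
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);

	/*
	 * Execbuf now treats the I915_EXEC_RING selector as an index
	 * into this map; slot 1 stays invalid until the gap is filled
	 * in, e.g. by a virtual engine.
	 */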

v2: Fixup freeing of local on success of get_engines()
v3: Allow empty engines[]
v4: s/nengine/num_engines/
v5: Replace 64 limit on num_engines with a note that execbuf is
currently limited to only using the first 64 engines.

Testcase: igt/gem_ctx_engines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v4
---
 drivers/gpu/drm/i915/i915_gem_context.c       | 222 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h       |  18 ++
 drivers/gpu/drm/i915/i915_gem_context_types.h |   8 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    |   5 +-
 drivers/gpu/drm/i915/i915_utils.h             |  36 +++
 include/uapi/drm/i915_drm.h                   |  39 +++
 6 files changed, 321 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 0b4d7c8056b2..9286a31ef2ed 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,7 +90,6 @@
 #include <drm/i915_drm.h>
 
 #include "gt/intel_lrc_reg.h"
-#include "gt/intel_workarounds.h"
 
 #include "i915_drv.h"
 #include "i915_globals.h"
@@ -143,13 +142,17 @@ static void lut_close(struct i915_gem_context *ctx)
 static struct intel_context *
 lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
 {
-	struct intel_engine_cs *engine;
+	if (!i915_gem_context_user_engines(ctx)) {
+		struct intel_engine_cs *engine;
 
-	engine = intel_engine_lookup_user(ctx->i915, class, instance);
-	if (!engine)
-		return ERR_PTR(-EINVAL);
+		engine = intel_engine_lookup_user(ctx->i915, class, instance);
+		if (!engine)
+			return ERR_PTR(-EINVAL);
+
+		instance = engine->id;
+	}
 
-	return i915_gem_context_get_engine(ctx, engine->id);
+	return i915_gem_context_get_engine(ctx, instance);
 }
 
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
@@ -260,6 +263,17 @@ static void free_engines_n(struct i915_gem_engines *e, unsigned long n)
 	free_engines(e);
 }
 
+static void free_engines_rcu(struct work_struct *wrk)
+{
+	struct i915_gem_engines *e =
+		container_of(wrk, struct i915_gem_engines, rcu.work);
+	struct drm_i915_private *i915 = e->i915;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	free_engines(e);
+	mutex_unlock(&i915->drm.struct_mutex);
+}
+
 static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 {
 	struct intel_engine_cs *engine;
@@ -1392,6 +1406,194 @@ static int set_sseu(struct i915_gem_context *ctx,
 	return ret;
 }
 
+struct set_engines {
+	struct i915_gem_context *ctx;
+	struct i915_gem_engines *engines;
+};
+
+static const i915_user_extension_fn set_engines__extensions[] = {
+};
+
+static int
+set_engines(struct i915_gem_context *ctx,
+	    const struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines __user *user =
+		u64_to_user_ptr(args->value);
+	struct set_engines set = { .ctx = ctx };
+	unsigned int num_engines, n;
+	u64 extensions;
+	int err;
+
+	if (!args->size) { /* switch back to legacy user_ring_map */
+		if (!i915_gem_context_user_engines(ctx))
+			return 0;
+
+		set.engines = default_engines(ctx);
+		if (IS_ERR(set.engines))
+			return PTR_ERR(set.engines);
+
+		goto replace;
+	}
+
+	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->class_instance)));
+	if (args->size < sizeof(*user) ||
+	    !IS_ALIGNED(args->size, sizeof(*user->class_instance))) {
+		DRM_DEBUG("Invalid size for engine array: %d\n",
+			  args->size);
+		return -EINVAL;
+	}
+
+	/*
+	 * Note that I915_EXEC_RING_MASK limits execbuf to only using the
+	 * first 64 engines defined here.
+	 */
+	num_engines =
+		(args->size - sizeof(*user)) / sizeof(*user->class_instance);
+
+	set.engines = kmalloc(struct_size(set.engines, engines, num_engines),
+			      GFP_KERNEL);
+	if (!set.engines)
+		return -ENOMEM;
+
+	set.engines->i915 = ctx->i915;
+	for (n = 0; n < num_engines; n++) {
+		struct intel_engine_cs *engine;
+		u16 class, inst;
+
+		if (get_user(class, &user->class_instance[n].engine_class) ||
+		    get_user(inst, &user->class_instance[n].engine_instance)) {
+			free_engines_n(set.engines, n);
+			return -EFAULT;
+		}
+
+		if (class == (u16)I915_ENGINE_CLASS_INVALID &&
+		    inst == (u16)I915_ENGINE_CLASS_INVALID_NONE) {
+			set.engines->engines[n] = NULL;
+			continue;
+		}
+
+		engine = intel_engine_lookup_user(ctx->i915, class, inst);
+		if (!engine) {
+			DRM_DEBUG("Invalid engine[%d]: { class:%d, inst:%d }\n",
+				  n, class, inst);
+			free_engines_n(set.engines, n);
+			return -ENOENT;
+		}
+
+		set.engines->engines[n] = intel_context_create(ctx, engine);
+		if (!set.engines->engines[n]) {
+			free_engines_n(set.engines, n);
+			return -ENOMEM;
+		}
+	}
+	set.engines->num_engines = num_engines;
+
+	err = -EFAULT;
+	if (!get_user(extensions, &user->extensions))
+		err = i915_user_extensions(u64_to_user_ptr(extensions),
+					   set_engines__extensions,
+					   ARRAY_SIZE(set_engines__extensions),
+					   &set);
+	if (err) {
+		free_engines(set.engines);
+		return err;
+	}
+
+replace:
+	mutex_lock(&ctx->i915->drm.struct_mutex);
+	if (args->size)
+		i915_gem_context_set_user_engines(ctx);
+	else
+		i915_gem_context_clear_user_engines(ctx);
+	rcu_swap_protected(ctx->engines, set.engines, 1);
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+	INIT_RCU_WORK(&set.engines->rcu, free_engines_rcu);
+	queue_rcu_work(system_wq, &set.engines->rcu);
+
+	return 0;
+}
+
+static int
+get_engines(struct i915_gem_context *ctx,
+	    struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines __user *user;
+	struct i915_gem_engines *e;
+	size_t n, count, size;
+	int err = 0;
+
+	err = mutex_lock_interruptible(&ctx->engines_mutex);
+	if (err)
+		return err;
+
+	if (!i915_gem_context_user_engines(ctx)) {
+		args->size = 0;
+		goto unlock;
+	}
+
+	e = i915_gem_context_engine_list(ctx);
+	count = e->num_engines;
+
+	/* Be paranoid in case we have an impedance mismatch */
+	if (!check_struct_size(user, class_instance, count, &size)) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+	if (unlikely(overflows_type(size, args->size))) {
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	if (!args->size) {
+		args->size = size;
+		goto unlock;
+	}
+
+	if (args->size < size) {
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	user = u64_to_user_ptr(args->value);
+	if (!access_ok(user, size)) {
+		err = -EFAULT;
+		goto unlock;
+	}
+
+	if (put_user(0, &user->extensions)) {
+		err = -EFAULT;
+		goto unlock;
+	}
+
+	for (n = 0; n < count; n++) {
+		struct {
+			u16 class;
+			u16 instance;
+		} ci = {
+			.class = I915_ENGINE_CLASS_INVALID,
+			.instance = I915_ENGINE_CLASS_INVALID_NONE,
+		};
+
+		if (e->engines[n]) {
+			ci.class = e->engines[n]->engine->uabi_class;
+			ci.instance = e->engines[n]->engine->instance;
+		}
+
+		if (copy_to_user(&user->class_instance[n], &ci, sizeof(ci))) {
+			err = -EFAULT;
+			goto unlock;
+		}
+	}
+
+	args->size = size;
+
+unlock:
+	mutex_unlock(&ctx->engines_mutex);
+	return err;
+}
+
 static int ctx_setparam(struct drm_i915_file_private *fpriv,
 			struct i915_gem_context *ctx,
 			struct drm_i915_gem_context_param *args)
@@ -1465,6 +1667,10 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
 		ret = set_ppgtt(fpriv, ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = set_engines(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
@@ -1695,6 +1901,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		ret = get_ppgtt(file_priv, ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = get_engines(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index d94fc4c1907c..ef90f327836f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -112,6 +112,24 @@ static inline void i915_gem_context_set_force_single_submission(struct i915_gem_
 	__set_bit(CONTEXT_FORCE_SINGLE_SUBMISSION, &ctx->flags);
 }
 
+static inline bool
+i915_gem_context_user_engines(const struct i915_gem_context *ctx)
+{
+	return test_bit(CONTEXT_USER_ENGINES, &ctx->flags);
+}
+
+static inline void
+i915_gem_context_set_user_engines(struct i915_gem_context *ctx)
+{
+	set_bit(CONTEXT_USER_ENGINES, &ctx->flags);
+}
+
+static inline void
+i915_gem_context_clear_user_engines(struct i915_gem_context *ctx)
+{
+	clear_bit(CONTEXT_USER_ENGINES, &ctx->flags);
+}
+
 int __i915_gem_context_pin_hw_id(struct i915_gem_context *ctx);
 static inline int i915_gem_context_pin_hw_id(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index ccb64f170e49..13aed6cd6bd9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -132,6 +132,14 @@ struct i915_gem_context {
 #define CONTEXT_BANNED			0
 #define CONTEXT_CLOSED			1
 #define CONTEXT_FORCE_SINGLE_SUBMISSION	2
+#define CONTEXT_USER_ENGINES		3
+
+	/**
+	 * @num_engines: Number of user defined engines for this context
+	 *
+	 * See @engines for the elements.
+	 */
+	unsigned int num_engines;
 
 	/**
 	 * @hw_id: - unique identifier for the context
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 941192c60832..4fde34bd6136 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2165,7 +2165,10 @@ eb_select_engine(struct i915_execbuffer *eb,
 	unsigned int idx;
 	int err;
 
-	idx = eb_select_legacy_ring(eb, file, args);
+	if (i915_gem_context_user_engines(eb->gem_context))
+		idx = args->flags & I915_EXEC_RING_MASK;
+	else
+		idx = eb_select_legacy_ring(eb, file, args);
 
 	ce = i915_gem_context_get_engine(eb->gem_context, idx);
 	if (IS_ERR(ce))
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 2dbe8933b50a..1436fe2fb5f8 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -25,6 +25,9 @@
 #ifndef __I915_UTILS_H
 #define __I915_UTILS_H
 
+#include <linux/kernel.h>
+#include <linux/overflow.h>
+
 #undef WARN_ON
 /* Many gcc seem to no see through this and fall over :( */
 #if 0
@@ -73,6 +76,39 @@
 #define overflows_type(x, T) \
 	(sizeof(x) > sizeof(T) && (x) >> BITS_PER_TYPE(T))
 
+static inline bool
+__check_struct_size(size_t base, size_t arr, size_t count, size_t *size)
+{
+	size_t sz;
+
+	if (check_mul_overflow(count, arr, &sz))
+		return false;
+
+	if (check_add_overflow(sz, base, &sz))
+		return false;
+
+	*size = sz;
+	return true;
+}
+
+/**
+ * check_struct_size() - Calculate size of structure with trailing array.
+ * @p: Pointer to the structure.
+ * @member: Name of the array member.
+ * @n: Number of elements in the array.
+ * @sz: Total size of structure and array
+ *
+ * Calculates size of memory needed for structure @p followed by an
+ * array of @n @member elements, like struct_size() but reports
+ * whether it overflowed, and the resultant size in @sz
+ *
+ * Return: false if the calculation overflowed.
+ */
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))
+
 #define ptr_mask_bits(ptr, n) ({					\
 	unsigned long __v = (unsigned long)(ptr);			\
 	(typeof(ptr))(__v & -BIT(n));					\
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d45b79746fc4..a0064d7e87c2 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -126,6 +126,8 @@ enum drm_i915_gem_engine_class {
 	I915_ENGINE_CLASS_INVALID	= -1
 };
 
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+
 /**
  * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
  *
@@ -1510,6 +1512,26 @@ struct drm_i915_gem_context_param {
 	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
 	 */
 #define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a
+ * zero-sized argument, will revert to the default settings.
+ *
+ * See struct i915_context_param_engines.
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1574,6 +1596,23 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+
+	struct {
+		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
+		__u16 engine_instance;
+	} class_instance[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct { \
+		__u16 engine_class; \
+		__u16 engine_instance; \
+	} class_instance[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 27/29] drm/i915: Allow userspace to clone contexts on creation
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (24 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 26/29] drm/i915: Allow a context to define its set of engines Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08  9:17 ` [PATCH 28/29] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation Chris Wilson
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

A use case arose out of handling context recovery in mesa, whereby they
wish to recreate a context with fresh logical state while preserving all
other details of the original. Currently, they create a new context and
iterate over which bits they want to copy across, but it would be much more
convenient if they were able to just pass in a target context to clone
during creation. This essentially extends the setparam during creation
to pull the details from a target context instead of the user supplied
parameters.

The ideal here is that we don't expose control over anything more than
can be obtained via CONTEXT_PARAM. That is, userspace retains explicit
control over all features, and this API is just a convenience.

For example, you could replace

	struct context_param p = { .param = CONTEXT_PARAM_VM };

	param.ctx_id = old_id;
	gem_context_get_param(&p.param);

	new_id = gem_context_create();

	param.ctx_id = new_id;
	gem_context_set_param(&p.param);

	gem_vm_destroy(param.value); /* drop the ref to VM_ID handle */

with

	struct create_ext_param p = {
	  { .name = CONTEXT_CREATE_CLONE },
	  .clone_id = old_id,
	  .flags = CLONE_FLAGS_VM
	}
	new_id = gem_context_create_ext(&p);

and not have to worry about stray namespace pollution etc.
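
Concretely, the second form maps onto the uapi below roughly as follows
(a sketch only; drmIoctl() and fd come from libdrm, old_id is the
context being recovered, and error handling is elided):

	struct drm_i915_gem_context_create_ext_clone clone = {
		.base = { .name = I915_CONTEXT_CREATE_EXT_CLONE },
		.clone_id = old_id,
		.flags = I915_CONTEXT_CLONE_VM,
	};
	struct drm_i915_gem_context_create_ext create = {
		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
		.extensions = (uintptr_t)&clone, /* chain of one extension */
	};

	drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
	/* create.ctx_id now shares old_id's VM; no extra handles leaked */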

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 206 ++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             |  15 ++
 2 files changed, 221 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9286a31ef2ed..5c8fed7717f6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1699,8 +1699,214 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
 }
 
+static int clone_engines(struct i915_gem_context *dst,
+			 struct i915_gem_context *src)
+{
+	struct i915_gem_engines *e, *clone;
+	bool user_engines;
+	unsigned long n;
+
+	e = i915_gem_context_engine_list_lock(src);
+
+	clone = kmalloc(struct_size(e, engines, e->num_engines), GFP_KERNEL);
+	if (!clone)
+		goto err_unlock;
+
+	clone->i915 = dst->i915;
+	for (n = 0; n < e->num_engines; n++) {
+		if (!e->engines[n]) {
+			clone->engines[n] = NULL;
+			continue;
+		}
+
+		clone->engines[n] =
+			intel_context_create(dst, e->engines[n]->engine);
+		if (!clone->engines[n]) {
+			free_engines_n(clone, n);
+			goto err_unlock;
+		}
+	}
+	clone->num_engines = n;
+
+	user_engines = i915_gem_context_user_engines(src);
+	i915_gem_context_engine_list_unlock(src);
+
+	free_engines(dst->engines);
+	RCU_INIT_POINTER(dst->engines, clone);
+	if (user_engines)
+		i915_gem_context_set_user_engines(dst);
+	else
+		i915_gem_context_clear_user_engines(dst);
+	return 0;
+
+err_unlock:
+	i915_gem_context_engine_list_unlock(src);
+	return -ENOMEM;
+}
+
+static int clone_flags(struct i915_gem_context *dst,
+		       struct i915_gem_context *src)
+{
+	dst->user_flags = src->user_flags;
+	return 0;
+}
+
+static int clone_schedattr(struct i915_gem_context *dst,
+			   struct i915_gem_context *src)
+{
+	dst->sched = src->sched;
+	return 0;
+}
+
+static int clone_sseu(struct i915_gem_context *dst,
+		      struct i915_gem_context *src)
+{
+	struct i915_gem_engines *e, *clone;
+	unsigned long n;
+	int err;
+
+	clone = dst->engines; /* no locking required; sole access */
+	e = i915_gem_context_engine_list_lock(src);
+	if (e->num_engines != clone->num_engines) {
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	for (n = 0; n < e->num_engines; n++) {
+		struct intel_context *ce = e->engines[n];
+
+		if (clone->engines[n]->engine->class != ce->engine->class) {
+			/* Must have compatible engine maps! */
+			err = -EINVAL;
+			goto unlock;
+		}
+
+		err = intel_context_pin_lock(ce); /* serialises with set_sseu */
+		if (err)
+			goto unlock;
+
+		clone->engines[n]->sseu = ce->sseu;
+		intel_context_pin_unlock(ce);
+	}
+
+	err = 0;
+unlock:
+	i915_gem_context_engine_list_unlock(src);
+	return err;
+}
+
+static int clone_timeline(struct i915_gem_context *dst,
+			  struct i915_gem_context *src)
+{
+	if (src->timeline) {
+		GEM_BUG_ON(src->timeline == dst->timeline);
+
+		if (dst->timeline)
+			i915_timeline_put(dst->timeline);
+		dst->timeline = i915_timeline_get(src->timeline);
+	}
+
+	return 0;
+}
+
+static int clone_vm(struct i915_gem_context *dst,
+		    struct i915_gem_context *src)
+{
+	struct i915_hw_ppgtt *ppgtt;
+
+	rcu_read_lock();
+	do {
+		ppgtt = READ_ONCE(src->ppgtt);
+		if (!ppgtt)
+			break;
+
+		if (!kref_get_unless_zero(&ppgtt->ref))
+			continue;
+
+		/*
+		 * This ppgtt may have been reallocated between
+		 * the read and the kref, and reassigned to a third
+		 * context. In order to avoid inadvertent sharing
+		 * of this ppgtt with that third context (and not
+		 * src), we have to confirm that we have the same
+		 * ppgtt after passing through the strong memory
+		 * barrier implied by a successful
+		 * kref_get_unless_zero().
+		 *
+		 * Once we have acquired the current ppgtt of src,
+		 * we no longer care if it is released from src, as
+		 * it cannot be reallocated elsewhere.
+		 */
+
+		if (ppgtt == READ_ONCE(src->ppgtt))
+			break;
+
+		i915_ppgtt_put(ppgtt);
+	} while (1);
+	rcu_read_unlock();
+
+	if (ppgtt) {
+		__assign_ppgtt(dst, ppgtt);
+		i915_ppgtt_put(ppgtt);
+	}
+
+	return 0;
+}
+
+static int create_clone(struct i915_user_extension __user *ext, void *data)
+{
+	static int (* const fn[])(struct i915_gem_context *dst,
+				  struct i915_gem_context *src) = {
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
+		MAP(ENGINES, clone_engines),
+		MAP(FLAGS, clone_flags),
+		MAP(SCHEDATTR, clone_schedattr),
+		MAP(SSEU, clone_sseu),
+		MAP(TIMELINE, clone_timeline),
+		MAP(VM, clone_vm),
+#undef MAP
+	};
+	struct drm_i915_gem_context_create_ext_clone local;
+	const struct create_ext *arg = data;
+	struct i915_gem_context *dst = arg->ctx;
+	struct i915_gem_context *src;
+	int err, bit;
+
+	if (copy_from_user(&local, ext, sizeof(local)))
+		return -EFAULT;
+
+	BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
+		     I915_CONTEXT_CLONE_UNKNOWN);
+
+	if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
+		return -EINVAL;
+
+	if (local.rsvd)
+		return -EINVAL;
+
+	rcu_read_lock();
+	src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
+	rcu_read_unlock();
+	if (!src)
+		return -ENOENT;
+
+	GEM_BUG_ON(src == dst);
+
+	for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
+		if (!(local.flags & BIT(bit)))
+			continue;
+
+		err = fn[bit](dst, src);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
+	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
 };
 
 static bool client_is_banned(struct drm_i915_file_private *file_priv)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a0064d7e87c2..ab30bcd7c1f1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1619,6 +1619,21 @@ struct drm_i915_gem_context_create_ext_setparam {
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 0)
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 1)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 2)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 3)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 4)
+#define I915_CONTEXT_CLONE_VM		(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 28/29] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (25 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 27/29] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08  9:17 ` [PATCH 29/29] drm/i915: Load balancing across a virtual engine Chris Wilson
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

The SINGLE_TIMELINE flag can be used to create a context such that all
engine instances within that context share a common timeline. This can
be useful for mixing operations between real and virtual engines, or
when using a composite context for a single client API context.
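
As a usage sketch (userspace C; drmIoctl() and fd come from libdrm,
error handling elided):

	struct drm_i915_gem_context_create_ext create = {
		.flags = I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
	};

	drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
	/*
	 * Requests on all engines of create.ctx_id now execute in
	 * strict submission order along the one shared timeline.
	 */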

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 4 ----
 include/uapi/drm/i915_drm.h             | 3 ++-
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5c8fed7717f6..4909acbeee23 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -96,8 +96,6 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)
-
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
 static struct i915_global_gem_context {
@@ -496,8 +494,6 @@ i915_gem_create_context(struct drm_i915_private *dev_priv, unsigned int flags)
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
-	BUILD_BUG_ON(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &
-		     ~I915_CONTEXT_CREATE_FLAGS_UNKNOWN);
 	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
 	    !HAS_EXECLISTS(dev_priv))
 		return ERR_PTR(-EINVAL);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index ab30bcd7c1f1..303d3dd980e4 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1458,8 +1458,9 @@ struct drm_i915_gem_context_create_ext {
 	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
 	__u64 extensions;
 };
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH 29/29] drm/i915: Load balancing across a virtual engine
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (26 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 28/29] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation Chris Wilson
@ 2019-04-08  9:17 ` Chris Wilson
  2019-04-08  9:59 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/29] drm/i915: Mark up ips for RCU protection Patchwork
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08  9:17 UTC (permalink / raw)
  To: intel-gfx

Having allowed the user to define the set of engines they wish to use,
we go one step further and allow them to bind those engines into a
single virtual instance. Submitting a batch to the virtual engine will
then forward it to any one of the set in a manner that best distributes
the load. The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to concurrently
run batches across multiple engines by itself; that is left up to the user
to submit multiple concurrent batches to multiple queues. Multiple users
will be load balanced across the system.

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
This is mostly mitigated by applying FQ_CODEL rules for prioritising new
clients, in the hope that the virtual and real engines are not then
congested (i.e. all work is via virtual engines, or all work is to the
real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip-feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).
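
For reference, binding a 2-way balanced video engine into slot 0 of a
context's engine map might look like the following from userspace (a
sketch against the uapi below; drmIoctl() comes from libdrm, ctx_id from
an earlier context create, and error handling is elided):

	I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(balance, 2) = {
		.base = { .name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE },
		.engine_index = 0, /* fill the gap left in slot 0 below */
		.num_siblings = 2,
		.class_instance = {
			{ I915_ENGINE_CLASS_VIDEO, 0 },
			{ I915_ENGINE_CLASS_VIDEO, 1 },
		},
	};
	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) = {
		.extensions = (uintptr_t)&balance,
		.class_instance = { /* slot 0: gap for the virtual engine */
			{ I915_ENGINE_CLASS_INVALID,
			  I915_ENGINE_CLASS_INVALID_NONE },
		},
	};
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(engines),
		.value = (uintptr_t)&engines,
	};

	drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);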

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting
v5: Replace 64b sibling_mask with a list of class:instance

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   8 +
 drivers/gpu/drm/i915/gt/intel_lrc.c          | 617 ++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_lrc.h          |   9 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c       | 180 ++++++
 drivers/gpu/drm/i915/i915_gem.h              |   5 +
 drivers/gpu/drm/i915/i915_gem_context.c      | 120 +++-
 drivers/gpu/drm/i915/i915_scheduler.c        |  18 +-
 drivers/gpu/drm/i915/i915_timeline_types.h   |   1 +
 include/uapi/drm/i915_drm.h                  |  45 ++
 9 files changed, 987 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index cd13c3840438..e14e892d6eb9 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -227,6 +227,7 @@ struct intel_engine_execlists {
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root_cached queue;
+	struct rb_root_cached virtual;
 
 	/**
 	 * @csb_write: control register for Context Switch buffer
@@ -440,6 +441,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
+#define I915_ENGINE_IS_VIRTUAL       BIT(5)
 	unsigned int flags;
 
 	/*
@@ -529,6 +531,12 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 }
 
+static inline bool
+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
 #define instdone_slice_mask(dev_priv__) \
 	(IS_GEN(dev_priv__, 7) ? \
 	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index d4e28fbb5dcd..1acd53a02528 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -136,6 +136,7 @@
 #include "i915_drv.h"
 #include "i915_gem_render_state.h"
 #include "i915_vgpu.h"
+#include "intel_engine_pm.h"
 #include "intel_lrc_reg.h"
 #include "intel_mocs.h"
 #include "intel_reset.h"
@@ -165,6 +166,41 @@
 
 #define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT | I915_PRIORITY_NOSEMAPHORE)
 
+struct virtual_engine {
+	struct intel_engine_cs base;
+	struct intel_context context;
+
+	/*
+	 * We allow only a single request through the virtual engine at a time
+	 * (each request in the timeline waits for the completion fence of
+	 * the previous before being submitted). By restricting ourselves to
+	 * only submitting a single request, each request is placed on to a
+	 * physical engine to maximise load spreading (by virtue of the late
+	 * greedy scheduling -- each real engine takes the next available
+	 * request upon idling).
+	 */
+	struct i915_request *request;
+
+	/*
+	 * We keep a rbtree of available virtual engines inside each physical
+	 * engine, sorted by priority. Here we preallocate the nodes we need
+	 * for the virtual engine, indexed by physical_engine->id.
+	 */
+	struct ve_node {
+		struct rb_node rb;
+		int prio;
+	} nodes[I915_NUM_ENGINES];
+
+	/* And finally, which physical engines this virtual engine maps onto. */
+	unsigned int num_siblings;
+	struct intel_engine_cs *siblings[0];
+};
+
+static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
+{
+	return container_of(engine, struct virtual_engine, base);
+}
+
 static int execlists_context_deferred_alloc(struct intel_context *ce,
 					    struct intel_engine_cs *engine);
 static void execlists_init_reg_state(u32 *reg_state,
@@ -228,7 +264,8 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 }
 
 static inline bool need_preempt(const struct intel_engine_cs *engine,
-				const struct i915_request *rq)
+				const struct i915_request *rq,
+				struct rb_node *rb)
 {
 	int last_prio;
 
@@ -263,6 +300,22 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	    rq_prio(list_next_entry(rq, link)) > last_prio)
 		return true;
 
+	if (rb) { /* XXX virtual precedence */
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		bool preempt = false;
+
+		if (engine == ve->siblings[0]) { /* only preempt one sibling */
+			spin_lock(&ve->base.timeline.lock);
+			if (ve->request)
+				preempt = rq_prio(ve->request) > last_prio;
+			spin_unlock(&ve->base.timeline.lock);
+		}
+
+		if (preempt)
+			return preempt;
+	}
+
 	/*
 	 * If the inflight context did not trigger the preemption, then maybe
 	 * it was the set of queued requests? Pick the highest priority in
@@ -381,6 +434,8 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	list_for_each_entry_safe_reverse(rq, rn,
 					 &engine->timeline.requests,
 					 link) {
+		struct intel_engine_cs *owner;
+
 		if (i915_request_completed(rq))
 			break;
 
@@ -389,14 +444,30 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 
 		GEM_BUG_ON(rq->hw_context->active);
 
-		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-		if (rq_prio(rq) != prio) {
-			prio = rq_prio(rq);
-			pl = i915_sched_lookup_priolist(engine, prio);
-		}
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+		/*
+		 * Push the request back into the queue for later resubmission.
+		 * If this request is not native to this physical engine (i.e.
+		 * it came from a virtual source), push it back onto the virtual
+		 * engine so that it can be moved across onto another physical
+		 * engine as load dictates.
+		 */
+		owner = rq->hw_context->engine;
+		if (likely(owner == engine)) {
+			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+			if (rq_prio(rq) != prio) {
+				prio = rq_prio(rq);
+				pl = i915_sched_lookup_priolist(engine, prio);
+			}
+			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-		list_add(&rq->sched.link, pl);
+			list_add(&rq->sched.link, pl);
+		} else {
+			if (__i915_request_has_started(rq))
+				rq->sched.attr.priority |= ACTIVE_PRIORITY;
+
+			rq->engine = owner;
+			owner->submit_request(rq);
+		}
 
 		active = rq;
 	}
@@ -658,6 +729,72 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
 						  execlists));
 }
 
+static void virtual_update_register_offsets(u32 *regs,
+					    struct intel_engine_cs *engine)
+{
+	u32 base = engine->mmio_base;
+
+	regs[CTX_CONTEXT_CONTROL] =
+		i915_mmio_reg_offset(RING_CONTEXT_CONTROL(base));
+	regs[CTX_RING_HEAD] = i915_mmio_reg_offset(RING_HEAD(base));
+	regs[CTX_RING_TAIL] = i915_mmio_reg_offset(RING_TAIL(base));
+	regs[CTX_RING_BUFFER_START] = i915_mmio_reg_offset(RING_START(base));
+	regs[CTX_RING_BUFFER_CONTROL] = i915_mmio_reg_offset(RING_CTL(base));
+
+	regs[CTX_BB_HEAD_U] = i915_mmio_reg_offset(RING_BBADDR_UDW(base));
+	regs[CTX_BB_HEAD_L] = i915_mmio_reg_offset(RING_BBADDR(base));
+	regs[CTX_BB_STATE] = i915_mmio_reg_offset(RING_BBSTATE(base));
+	regs[CTX_SECOND_BB_HEAD_U] =
+		i915_mmio_reg_offset(RING_SBBADDR_UDW(base));
+	regs[CTX_SECOND_BB_HEAD_L] = i915_mmio_reg_offset(RING_SBBADDR(base));
+	regs[CTX_SECOND_BB_STATE] = i915_mmio_reg_offset(RING_SBBSTATE(base));
+
+	regs[CTX_CTX_TIMESTAMP] =
+		i915_mmio_reg_offset(RING_CTX_TIMESTAMP(base));
+	regs[CTX_PDP3_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 3));
+	regs[CTX_PDP3_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, 3));
+	regs[CTX_PDP2_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 2));
+	regs[CTX_PDP2_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, 2));
+	regs[CTX_PDP1_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 1));
+	regs[CTX_PDP1_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, 1));
+	regs[CTX_PDP0_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(base, 0));
+	regs[CTX_PDP0_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(base, 0));
+
+	if (engine->class == RENDER_CLASS) {
+		regs[CTX_RCS_INDIRECT_CTX] =
+			i915_mmio_reg_offset(RING_INDIRECT_CTX(base));
+		regs[CTX_RCS_INDIRECT_CTX_OFFSET] =
+			i915_mmio_reg_offset(RING_INDIRECT_CTX_OFFSET(base));
+		regs[CTX_BB_PER_CTX_PTR] =
+			i915_mmio_reg_offset(RING_BB_PER_CTX_PTR(base));
+
+		regs[CTX_R_PWR_CLK_STATE] =
+			i915_mmio_reg_offset(GEN8_R_PWR_CLK_STATE);
+	}
+}
+
+static bool virtual_matches(const struct virtual_engine *ve,
+			    const struct i915_request *rq,
+			    const struct intel_engine_cs *engine)
+{
+	const struct intel_engine_cs *active;
+
+	/*
+	 * We track when the HW has completed saving the context image
+	 * (i.e. when we have seen the final CS event switching out of
+	 * the context) and must not overwrite the context image before
+	 * then. This restricts us to only using the active engine
+	 * while the previous virtualized request is inflight (so
+	 * we reuse the register offsets). This is a very small
+	 * hysteresis on the greedy selection algorithm.
+	 */
+	active = READ_ONCE(ve->context.active);
+	if (active && active != engine)
+		return false;
+
+	return true;
+}
+
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -690,6 +827,26 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * and context switches) submission.
 	 */
 
+	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (!rq) { /* lazily cleanup after another engine handled rq */
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		if (!virtual_matches(ve, rq, engine)) {
+			rb = rb_next(rb);
+			continue;
+		}
+
+		break;
+	}
+
 	if (last) {
 		/*
 		 * Don't resubmit or switch until all outstanding
@@ -711,7 +868,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
 			return;
 
-		if (need_preempt(engine, last)) {
+		if (need_preempt(engine, last, rb)) {
 			inject_preempt_context(engine);
 			return;
 		}
@@ -751,6 +908,89 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		last->tail = last->wa_tail;
 	}
 
+	while (rb) { /* XXX virtual is always taking precedence */
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq;
+
+		spin_lock(&ve->base.timeline.lock);
+
+		rq = ve->request;
+		if (unlikely(!rq)) { /* lost the race to a sibling */
+			spin_unlock(&ve->base.timeline.lock);
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		GEM_BUG_ON(rq != ve->request);
+		GEM_BUG_ON(rq->engine != &ve->base);
+		GEM_BUG_ON(rq->hw_context != &ve->context);
+
+		if (rq_prio(rq) >= queue_prio(execlists)) {
+			if (!virtual_matches(ve, rq, engine)) {
+				spin_unlock(&ve->base.timeline.lock);
+				rb = rb_next(rb);
+				continue;
+			}
+
+			if (last && !can_merge_rq(last, rq)) {
+				spin_unlock(&ve->base.timeline.lock);
+				return; /* leave this rq for another engine */
+			}
+
+			GEM_TRACE("%s: virtual rq=%llx:%lld%s, new engine? %s\n",
+				  engine->name,
+				  rq->fence.context,
+				  rq->fence.seqno,
+				  i915_request_completed(rq) ? "!" :
+				  i915_request_started(rq) ? "*" :
+				  "",
+				  yesno(engine != ve->siblings[0]));
+
+			ve->request = NULL;
+			ve->base.execlists.queue_priority_hint = INT_MIN;
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+
+			rq->engine = engine;
+
+			if (engine != ve->siblings[0]) {
+				u32 *regs = ve->context.lrc_reg_state;
+				unsigned int n;
+
+				GEM_BUG_ON(READ_ONCE(ve->context.active));
+				virtual_update_register_offsets(regs, engine);
+
+				/*
+				 * Move the bound engine to the top of the list
+				 * for future execution. We then kick this
+				 * tasklet first before checking others, so that
+				 * we preferentially reuse this set of bound
+				 * registers.
+				 */
+				for (n = 1; n < ve->num_siblings; n++) {
+					if (ve->siblings[n] == engine) {
+						swap(ve->siblings[n],
+						     ve->siblings[0]);
+						break;
+					}
+				}
+
+				GEM_BUG_ON(ve->siblings[0] != engine);
+			}
+
+			__i915_request_submit(rq);
+			trace_i915_request_in(rq, port_index(port, execlists));
+			submit = true;
+			last = rq;
+		}
+
+		spin_unlock(&ve->base.timeline.lock);
+		break;
+	}
+
 	while ((rb = rb_first_cached(&execlists->queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 		struct i915_request *rq, *rn;
@@ -970,6 +1210,26 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 		i915_priolist_free(p);
 	}
 
+	/* Cancel all attached virtual engines */
+	while ((rb = rb_first_cached(&execlists->virtual))) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+
+		rb_erase_cached(rb, &execlists->virtual);
+		RB_CLEAR_NODE(rb);
+
+		spin_lock(&ve->base.timeline.lock);
+		if (ve->request) {
+			ve->request->engine = engine;
+			__i915_request_submit(ve->request);
+			dma_fence_set_error(&ve->request->fence, -EIO);
+			i915_request_mark_complete(ve->request);
+			ve->base.execlists.queue_priority_hint = INT_MIN;
+			ve->request = NULL;
+		}
+		spin_unlock(&ve->base.timeline.lock);
+	}
+
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
@@ -2753,6 +3013,322 @@ static int execlists_context_deferred_alloc(struct intel_context *ce,
 	return ret;
 }
 
+static void virtual_context_destroy(struct kref *kref)
+{
+	struct virtual_engine *ve =
+		container_of(kref, typeof(*ve), context.ref);
+	unsigned int n;
+
+	GEM_BUG_ON(ve->request);
+	GEM_BUG_ON(ve->context.active);
+
+	for (n = 0; n < ve->num_siblings; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct rb_node *node = &ve->nodes[sibling->id].rb;
+
+		if (RB_EMPTY_NODE(node))
+			continue;
+
+		spin_lock_irq(&sibling->timeline.lock);
+
+		if (!RB_EMPTY_NODE(node))
+			rb_erase_cached(node, &sibling->execlists.virtual);
+
+		spin_unlock_irq(&sibling->timeline.lock);
+	}
+	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
+
+	if (ve->context.state)
+		__execlists_context_fini(&ve->context);
+
+	i915_timeline_fini(&ve->base.timeline);
+	kfree(ve);
+}
+
+static void virtual_engine_initial_hint(struct virtual_engine *ve)
+{
+	int swp;
+
+	/*
+	 * Pick a random sibling on starting to help spread the load around.
+	 *
+	 * New contexts are typically created with exactly the same order
+	 * of siblings, and often started in batches. Due to the way we iterate
+	 * the array of sibling when submitting requests, sibling[0] is
+	 * prioritised for dequeuing. If we make sure that sibling[0] is fairly
+	 * randomised across the system, we also help spread the load by the
+	 * first engine we inspect being different each time.
+	 *
+	 * NB This does not force us to execute on this engine, it will just
+	 * typically be the first we inspect for submission.
+	 */
+	swp = prandom_u32_max(ve->num_siblings);
+	if (!swp)
+		return;
+
+	swap(ve->siblings[swp], ve->siblings[0]);
+	virtual_update_register_offsets(ve->context.lrc_reg_state,
+					ve->siblings[0]);
+}
+
+static int virtual_context_pin(struct intel_context *ce)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	int err;
+
+	/* Note: we must use a real engine class for setting up reg state */
+	err = __execlists_context_pin(ce, ve->siblings[0]);
+	if (err)
+		return err;
+
+	virtual_engine_initial_hint(ve);
+	return 0;
+}
+
+static void virtual_context_enter(struct intel_context *ce)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	unsigned int n;
+
+	for (n = 0; n < ve->num_siblings; n++)
+		intel_engine_pm_get(ve->siblings[n]);
+}
+
+static void virtual_context_exit(struct intel_context *ce)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	unsigned int n;
+
+	for (n = 0; n < ve->num_siblings; n++)
+		intel_engine_pm_put(ve->siblings[n]);
+}
+
+static const struct intel_context_ops virtual_context_ops = {
+	.pin = virtual_context_pin,
+	.unpin = execlists_context_unpin,
+
+	.enter = virtual_context_enter,
+	.exit = virtual_context_exit,
+
+	.destroy = virtual_context_destroy,
+};
+
+static void virtual_submission_tasklet(unsigned long data)
+{
+	struct virtual_engine * const ve = (struct virtual_engine *)data;
+	const int prio = ve->base.execlists.queue_priority_hint;
+	unsigned int n;
+
+	GEM_TRACE("%s: rq=%llx:%lld, prio=%d\n",
+		  ve->base.name,
+		  ve->request->fence.context,
+		  ve->request->fence.seqno,
+		  prio);
+
+	local_irq_disable();
+	for (n = 0; READ_ONCE(ve->request) && n < ve->num_siblings; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct ve_node * const node = &ve->nodes[sibling->id];
+		struct rb_node **parent, *rb;
+		bool first;
+
+		spin_lock(&sibling->timeline.lock);
+
+		if (!RB_EMPTY_NODE(&node->rb)) {
+			/*
+			 * Cheat and avoid rebalancing the tree if we can
+			 * reuse this node in situ.
+			 */
+			first = rb_first_cached(&sibling->execlists.virtual) ==
+				&node->rb;
+			if (prio == node->prio || (prio > node->prio && first))
+				goto submit_engine;
+
+			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
+		}
+
+		rb = NULL;
+		first = true;
+		parent = &sibling->execlists.virtual.rb_root.rb_node;
+		while (*parent) {
+			struct ve_node *other;
+
+			rb = *parent;
+			other = rb_entry(rb, typeof(*other), rb);
+			if (prio > other->prio) {
+				parent = &rb->rb_left;
+			} else {
+				parent = &rb->rb_right;
+				first = false;
+			}
+		}
+
+		rb_link_node(&node->rb, rb, parent);
+		rb_insert_color_cached(&node->rb,
+				       &sibling->execlists.virtual,
+				       first);
+
+submit_engine:
+		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
+		node->prio = prio;
+		if (first && prio > sibling->execlists.queue_priority_hint) {
+			sibling->execlists.queue_priority_hint = prio;
+			tasklet_hi_schedule(&sibling->execlists.tasklet);
+		}
+
+		spin_unlock(&sibling->timeline.lock);
+	}
+	local_irq_enable();
+}
+
+static void virtual_submit_request(struct i915_request *rq)
+{
+	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+
+	GEM_TRACE("%s: rq=%llx:%lld\n",
+		  ve->base.name,
+		  rq->fence.context,
+		  rq->fence.seqno);
+
+	GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
+
+	GEM_BUG_ON(ve->request);
+	ve->base.execlists.queue_priority_hint = rq_prio(rq);
+	WRITE_ONCE(ve->request, rq);
+
+	tasklet_schedule(&ve->base.execlists.tasklet);
+}
+
+struct intel_context *
+intel_execlists_create_virtual(struct i915_gem_context *ctx,
+			       struct intel_engine_cs **siblings,
+			       unsigned int count)
+{
+	struct virtual_engine *ve;
+	unsigned int n;
+	int err;
+
+	if (count == 0)
+		return ERR_PTR(-EINVAL);
+
+	if (count == 1)
+		return intel_context_create(ctx, siblings[0]);
+
+	ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
+	if (!ve)
+		return ERR_PTR(-ENOMEM);
+
+	ve->base.i915 = ctx->i915;
+	ve->base.id = -1;
+	ve->base.class = OTHER_CLASS;
+	ve->base.uabi_class = I915_ENGINE_CLASS_INVALID;
+	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
+	ve->base.flags = I915_ENGINE_IS_VIRTUAL;
+
+	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
+
+	err = i915_timeline_init(ctx->i915, &ve->base.timeline, NULL);
+	if (err)
+		goto err_put;
+	i915_timeline_set_subclass(&ve->base.timeline, TIMELINE_VIRTUAL);
+
+	ve->base.cops = &virtual_context_ops;
+	ve->base.request_alloc = execlists_request_alloc;
+
+	ve->base.schedule = i915_schedule;
+	ve->base.submit_request = virtual_submit_request;
+
+	ve->base.execlists.queue_priority_hint = INT_MIN;
+	tasklet_init(&ve->base.execlists.tasklet,
+		     virtual_submission_tasklet,
+		     (unsigned long)ve);
+
+	intel_context_init(&ve->context, ctx, &ve->base);
+
+	for (n = 0; n < count; n++) {
+		struct intel_engine_cs *sibling = siblings[n];
+
+		GEM_BUG_ON(!is_power_of_2(sibling->mask));
+		if (sibling->mask & ve->base.mask) {
+			DRM_DEBUG("duplicate %s entry in load balancer\n",
+				  sibling->name);
+			err = -EINVAL;
+			goto err_put;
+		}
+
+		/*
+		 * The virtual engine implementation is tightly coupled to
+		 * the execlists backend -- we push requests directly
+		 * into a tree inside each physical engine. We could support
+		 * layering if we handle cloning of the requests and
+		 * submitting a copy into each backend.
+		 */
+		if (sibling->execlists.tasklet.func !=
+		    execlists_submission_tasklet) {
+			err = -ENODEV;
+			goto err_put;
+		}
+
+		GEM_BUG_ON(RB_EMPTY_NODE(&ve->nodes[sibling->id].rb));
+		RB_CLEAR_NODE(&ve->nodes[sibling->id].rb);
+
+		ve->siblings[ve->num_siblings++] = sibling;
+		ve->base.mask |= sibling->mask;
+
+		/*
+		 * All physical engines must be compatible for their emission
+		 * functions (as we build the instructions during request
+		 * construction and do not alter them before submission
+		 * on the physical engine). We use the engine class as a guide
+		 * here, although that could be refined.
+		 */
+		if (ve->base.class != OTHER_CLASS) {
+			if (ve->base.class != sibling->class) {
+				DRM_DEBUG("invalid mixing of engine class, sibling %d, already %d\n",
+					  sibling->class, ve->base.class);
+				err = -EINVAL;
+				goto err_put;
+			}
+			continue;
+		}
+
+		ve->base.class = sibling->class;
+		ve->base.uabi_class = sibling->uabi_class;
+		snprintf(ve->base.name, sizeof(ve->base.name),
+			 "v%dx%d", ve->base.class, count);
+		ve->base.context_size = sibling->context_size;
+
+		ve->base.emit_bb_start = sibling->emit_bb_start;
+		ve->base.emit_flush = sibling->emit_flush;
+		ve->base.emit_init_breadcrumb = sibling->emit_init_breadcrumb;
+		ve->base.emit_fini_breadcrumb = sibling->emit_fini_breadcrumb;
+		ve->base.emit_fini_breadcrumb_dw =
+			sibling->emit_fini_breadcrumb_dw;
+	}
+
+	return &ve->context;
+
+err_put:
+	intel_context_put(&ve->context);
+	return ERR_PTR(err);
+}
+
+struct intel_context *
+intel_execlists_clone_virtual(struct i915_gem_context *ctx,
+			      struct intel_engine_cs *src)
+{
+	struct virtual_engine *se = to_virtual_engine(src);
+	struct intel_context *dst;
+
+	dst = intel_execlists_create_virtual(ctx,
+					     se->siblings,
+					     se->num_siblings);
+	if (IS_ERR(dst))
+		return dst;
+
+	return dst;
+}
+
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
@@ -2810,6 +3386,29 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 		show_request(m, last, "\t\tQ ");
 	}
 
+	last = NULL;
+	count = 0;
+	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (rq) {
+			if (count++ < max - 1)
+				show_request(m, rq, "\t\tV ");
+			else
+				last = rq;
+		}
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d virtual requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tV ");
+	}
+
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index a56ee45b9e3c..68212491b3f0 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -110,4 +110,13 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 							const char *prefix),
 				   unsigned int max);
 
+struct intel_context *
+intel_execlists_create_virtual(struct i915_gem_context *ctx,
+			       struct intel_engine_cs **siblings,
+			       unsigned int count);
+
+struct intel_context *
+intel_execlists_clone_virtual(struct i915_gem_context *ctx,
+			      struct intel_engine_cs *src);
+
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 84538f69185b..f34aa9e042a3 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -1301,6 +1301,185 @@ static int live_preempt_smoke(void *arg)
 	return err;
 }
 
+static int nop_virtual_engine(struct drm_i915_private *i915,
+			      struct intel_engine_cs **siblings,
+			      unsigned int nsibling,
+			      unsigned int nctx,
+			      unsigned int flags)
+#define CHAIN BIT(0)
+{
+	IGT_TIMEOUT(end_time);
+	struct i915_request *request[16];
+	struct i915_gem_context *ctx[16];
+	struct intel_context *ve[16];
+	unsigned long n, prime, nc;
+	struct igt_live_test t;
+	ktime_t times[2] = {};
+	int err;
+
+	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx));
+
+	for (n = 0; n < nctx; n++) {
+		ctx[n] = kernel_context(i915);
+		if (!ctx[n]) {
+			err = -ENOMEM;
+			nctx = n;
+			goto out;
+		}
+
+		ve[n] = intel_execlists_create_virtual(ctx[n],
+						       siblings, nsibling);
+		if (IS_ERR(ve[n])) {
+			kernel_context_close(ctx[n]);
+			err = PTR_ERR(ve[n]);
+			nctx = n;
+			goto out;
+		}
+
+		err = intel_context_pin(ve[n]);
+		if (err) {
+			intel_context_put(ve[n]);
+			kernel_context_close(ctx[n]);
+			nctx = n;
+			goto out;
+		}
+	}
+
+	err = igt_live_test_begin(&t, i915, __func__, ve[0]->engine->name);
+	if (err)
+		goto out;
+
+	for_each_prime_number_from(prime, 1, 8192) {
+		times[1] = ktime_get_raw();
+
+		if (flags & CHAIN) {
+			for (nc = 0; nc < nctx; nc++) {
+				for (n = 0; n < prime; n++) {
+					request[nc] =
+						i915_request_create(ve[nc]);
+					if (IS_ERR(request[nc])) {
+						err = PTR_ERR(request[nc]);
+						goto out;
+					}
+
+					i915_request_add(request[nc]);
+				}
+			}
+		} else {
+			for (n = 0; n < prime; n++) {
+				for (nc = 0; nc < nctx; nc++) {
+					request[nc] =
+						i915_request_create(ve[nc]);
+					if (IS_ERR(request[nc])) {
+						err = PTR_ERR(request[nc]);
+						goto out;
+					}
+
+					i915_request_add(request[nc]);
+				}
+			}
+		}
+
+		for (nc = 0; nc < nctx; nc++) {
+			if (i915_request_wait(request[nc],
+					      I915_WAIT_LOCKED,
+					      HZ / 10) < 0) {
+				pr_err("%s(%s): wait for %llx:%lld timed out\n",
+				       __func__, ve[0]->engine->name,
+				       request[nc]->fence.context,
+				       request[nc]->fence.seqno);
+
+				GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
+					  __func__, ve[0]->engine->name,
+					  request[nc]->fence.context,
+					  request[nc]->fence.seqno);
+				GEM_TRACE_DUMP();
+				i915_gem_set_wedged(i915);
+				break;
+			}
+		}
+
+		times[1] = ktime_sub(ktime_get_raw(), times[1]);
+		if (prime == 1)
+			times[0] = times[1];
+
+		if (__igt_timeout(end_time, NULL))
+			break;
+	}
+
+	err = igt_live_test_end(&t);
+	if (err)
+		goto out;
+
+	pr_info("Requestx%d latencies on %s: 1 = %lluns, %lu = %lluns\n",
+		nctx, ve[0]->engine->name, ktime_to_ns(times[0]),
+		prime, div64_u64(ktime_to_ns(times[1]), prime));
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	for (nc = 0; nc < nctx; nc++) {
+		intel_context_unpin(ve[nc]);
+		intel_context_put(ve[nc]);
+		kernel_context_close(ctx[nc]);
+	}
+	return err;
+}
+
+static int live_virtual_engine(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned int class, inst;
+	int err = -ENODEV;
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	for_each_engine(engine, i915, id) {
+		err = nop_virtual_engine(i915, &engine, 1, 1, 0);
+		if (err) {
+			pr_err("Failed to wrap engine %s: err=%d\n",
+			       engine->name, err);
+			goto out_unlock;
+		}
+	}
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		int nsibling, n;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!i915->engine_class[class][inst])
+				break;
+
+			siblings[nsibling++] = i915->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (n = 1; n <= nsibling + 1; n++) {
+			err = nop_virtual_engine(i915, siblings, nsibling,
+						 n, 0);
+			if (err)
+				goto out_unlock;
+		}
+
+		err = nop_virtual_engine(i915, siblings, nsibling, n, CHAIN);
+		if (err)
+			goto out_unlock;
+	}
+
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -1313,6 +1492,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_chain_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
+		SUBTEST(live_virtual_engine),
 	};
 
 	if (!HAS_EXECLISTS(i915))
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index 67f8a4a807a0..fe82d3571072 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -91,4 +91,9 @@ static inline bool __tasklet_enable(struct tasklet_struct *t)
 	return atomic_dec_and_test(&t->count);
 }
 
+static inline bool __tasklet_is_scheduled(struct tasklet_struct *t)
+{
+	return test_bit(TASKLET_STATE_SCHED, &t->state);
+}
+
 #endif /* __I915_GEM_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4909acbeee23..cf2d77ad1ae1 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,6 +86,7 @@
  */
 
 #include <linux/log2.h>
+#include <linux/nospec.h>
 
 #include <drm/i915_drm.h>
 
@@ -1219,7 +1220,6 @@ __intel_context_reconfigure_sseu(struct intel_context *ce,
 	int ret;
 
 	GEM_BUG_ON(INTEL_GEN(ce->gem_context->i915) < 8);
-	GEM_BUG_ON(ce->engine->id != RCS0);
 
 	ret = intel_context_pin_lock(ce);
 	if (ret)
@@ -1407,7 +1407,104 @@ struct set_engines {
 	struct i915_gem_engines *engines;
 };
 
+static int
+set_engines__load_balance(struct i915_user_extension __user *base, void *data)
+{
+	struct i915_context_engines_load_balance __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_engines *set = data;
+	struct intel_engine_cs *stack[16];
+	struct intel_engine_cs **siblings;
+	struct intel_context *ce;
+	u16 num_siblings, idx;
+	unsigned int n;
+	int err;
+
+	if (!HAS_EXECLISTS(set->ctx->i915))
+		return -ENODEV;
+
+	if (USES_GUC_SUBMISSION(set->ctx->i915))
+		return -ENODEV; /* not implemented yet */
+
+	if (get_user(idx, &ext->engine_index))
+		return -EFAULT;
+
+	if (idx >= set->engines->num_engines) {
+		DRM_DEBUG("Invalid placement value, %d >= %ld\n",
+			  idx, set->engines->num_engines);
+		return -EINVAL;
+	}
+
+	idx = array_index_nospec(idx, set->engines->num_engines);
+	if (set->engines->engines[idx]) {
+		DRM_DEBUG("Invalid placement[%d], already occupied\n", idx);
+		return -EEXIST;
+	}
+
+	if (get_user(num_siblings, &ext->num_siblings))
+		return -EFAULT;
+
+	err = check_user_mbz(&ext->flags);
+	if (err)
+		return err;
+
+	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+		err = check_user_mbz(&ext->mbz64[n]);
+		if (err)
+			return err;
+	}
+
+	siblings = stack;
+	if (num_siblings > ARRAY_SIZE(stack)) {
+		siblings = kmalloc_array(num_siblings,
+					 sizeof(*siblings),
+					 GFP_KERNEL);
+		if (!siblings)
+			return -ENOMEM;
+	}
+
+	for (n = 0; n < num_siblings; n++) {
+		struct {
+			u16 class;
+			u16 instance;
+		} ci;
+
+		if (copy_from_user(&ci, &ext->class_instance[n], sizeof(ci))) {
+			err = -EFAULT;
+			goto out_siblings;
+		}
+
+		siblings[n] = intel_engine_lookup_user(set->ctx->i915,
+						       ci.class, ci.instance);
+		if (!siblings[n]) {
+			DRM_DEBUG("Invalid sibling[%d]: { class:%d, inst:%d }\n",
+				  n, ci.class, ci.instance);
+			err = -EINVAL;
+			goto out_siblings;
+		}
+	}
+
+	ce = intel_execlists_create_virtual(set->ctx, siblings, n);
+	if (IS_ERR(ce)) {
+		err = PTR_ERR(ce);
+		goto out_siblings;
+	}
+
+	if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
+		intel_context_put(ce);
+		err = -EEXIST;
+		goto out_siblings;
+	}
+
+out_siblings:
+	if (siblings != stack)
+		kfree(siblings);
+
+	return err;
+}
+
 static const i915_user_extension_fn set_engines__extensions[] = {
+	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
 };
 
 static int
@@ -1710,14 +1807,29 @@ static int clone_engines(struct i915_gem_context *dst,
 
 	clone->i915 = dst->i915;
 	for (n = 0; n < e->num_engines; n++) {
+		struct intel_engine_cs *engine;
+
 		if (!e->engines[n]) {
 			clone->engines[n] = NULL;
 			continue;
 		}
+		engine = e->engines[n]->engine;
 
-		clone->engines[n] =
-			intel_context_create(dst, e->engines[n]->engine);
-		if (!clone->engines[n]) {
+		/*
+		 * Virtual engines are singletons; they can only exist
+		 * inside a single context, because they embed their
+		 * HW context... As each virtual context implies a single
+		 * timeline (each engine can only dequeue a single request
+		 * at any time), it would be surprising for two contexts
+		 * to use the same engine. So let's create a copy of
+		 * the virtual engine instead.
+		 */
+		if (intel_engine_is_virtual(engine))
+			clone->engines[n] =
+				intel_execlists_clone_virtual(dst, engine);
+		else
+			clone->engines[n] = intel_context_create(dst, engine);
+		if (IS_ERR_OR_NULL(clone->engines[n])) {
 			free_engines_n(clone, n);
 			goto err_unlock;
 		}
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 77dbf7d74e12..71ee5ddef412 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -248,17 +248,26 @@ sched_lock_engine(const struct i915_sched_node *node,
 		  struct intel_engine_cs *locked,
 		  struct sched_cache *cache)
 {
-	struct intel_engine_cs *engine = node_to_request(node)->engine;
+	const struct i915_request *rq = node_to_request(node);
+	struct intel_engine_cs *engine;
 
 	GEM_BUG_ON(!locked);
 
-	if (engine != locked) {
+	/*
+	 * Virtual engines complicate acquiring the engine timeline lock,
+	 * as their rq->engine pointer is not stable until under that
+	 * engine lock. The simple ploy we use is to take the lock then
+	 * check that the rq still belongs to the newly locked engine.
+	 */
+	while (locked != (engine = READ_ONCE(rq->engine))) {
 		spin_unlock(&locked->timeline.lock);
 		memset(cache, 0, sizeof(*cache));
 		spin_lock(&engine->timeline.lock);
+		locked = engine;
 	}
 
-	return engine;
+	GEM_BUG_ON(locked != engine);
+	return locked;
 }
 
 static bool inflight(const struct i915_request *rq,
@@ -371,8 +380,11 @@ static void __i915_schedule(struct i915_request *rq,
 		if (prio <= node->attr.priority || node_signaled(node))
 			continue;
 
+		GEM_BUG_ON(node_to_request(node)->engine != engine);
+
 		node->attr.priority = prio;
 		if (!list_empty(&node->link)) {
+			GEM_BUG_ON(intel_engine_is_virtual(engine));
 			if (!cache.priolist)
 				cache.priolist =
 					i915_sched_lookup_priolist(engine,
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index 5256a0b5c5f7..1688705f4a2b 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -26,6 +26,7 @@ struct i915_timeline {
 	spinlock_t lock;
 #define TIMELINE_CLIENT 0 /* default subclass */
 #define TIMELINE_ENGINE 1
+#define TIMELINE_VIRTUAL 2
 	struct mutex mutex; /* protects the flow of requests */
 
 	unsigned int pin_count;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 303d3dd980e4..9747fcb9b86d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -127,6 +127,7 @@ enum drm_i915_gem_engine_class {
 };
 
 #define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL 0
 
 /**
  * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
@@ -1597,8 +1598,52 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that,
+ * when used, will proxy the execbuffer request onto one of the set of engines
+ * in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 num_siblings;
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 mbz64[1]; /* reserved for future use; must be zero */
+
+	struct {
+		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
+		__u16 engine_instance;
+	} class_instance[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name__, N__) struct { \
+	struct i915_user_extension base; \
+	__u16 engine_index; \
+	__u16 num_siblings; \
+	__u32 flags; \
+	__u64 mbz64[1]; \
+	struct { \
+		__u16 engine_class; \
+		__u16 engine_instance; \
+	} class_instance[N__]; \
+} __attribute__((packed)) name__
+
 struct i915_context_param_engines {
 	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see i915_context_engines_load_balance */
 
 	struct {
 		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/29] drm/i915: Mark up ips for RCU protection
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (27 preceding siblings ...)
  2019-04-08  9:17 ` [PATCH 29/29] drm/i915: Load balancing across a virtual engine Chris Wilson
@ 2019-04-08  9:59 ` Patchwork
  2019-04-08 10:13 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2019-04-08  9:59 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/29] drm/i915: Mark up ips for RCU protection
URL   : https://patchwork.freedesktop.org/series/59155/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
17875d38d33a drm/i915: Mark up ips for RCU protection
bb6133046abe drm/i915/guc: Replace WARN with a DRM_ERROR
c9580bf3733f drm/i915: Use static allocation for i915_globals_park()
-:15: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#15: 
<4> [39.086121] 00000000004b5f91 (wakeref#3){+.+.}, at: intel_engine_pm_put+0x1b/0x40 [i915]

total: 0 errors, 1 warnings, 0 checks, 103 lines checked
c38bedbb5aef drm/i915: Consolidate the timeline->barrier
7af25481d1f9 drm/i915: Store the default sseu setup on the engine
-:383: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#383: 
new file mode 100644

-:388: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#388: FILE: drivers/gpu/drm/i915/intel_sseu.c:1:
+/*

-:389: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#389: FILE: drivers/gpu/drm/i915/intel_sseu.c:2:
+ * SPDX-License-Identifier: MIT

-:536: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#536: FILE: drivers/gpu/drm/i915/intel_sseu.h:1:
+/*

-:537: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#537: FILE: drivers/gpu/drm/i915/intel_sseu.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 5 warnings, 0 checks, 533 lines checked
5b9f9fa1d320 drm/i915: Move GraphicsTechnology files under gt/
-:128: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#128: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 704 lines checked
bc642d18eac6 drm/i915: Only reset the pinned kernel contexts on resume
99753b992fca drm/i915: Introduce struct intel_wakeref
-:61: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#61: 
new file mode 100644

-:66: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#66: FILE: drivers/gpu/drm/i915/intel_wakeref.c:1:
+/*

-:67: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#67: FILE: drivers/gpu/drm/i915/intel_wakeref.c:2:
+ * SPDX-License-Identifier: MIT

-:123: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#123: FILE: drivers/gpu/drm/i915/intel_wakeref.h:1:
+/*

-:124: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#124: FILE: drivers/gpu/drm/i915/intel_wakeref.h:2:
+ * SPDX-License-Identifier: MIT

-:138: WARNING:NEW_TYPEDEFS: do not add new typedefs
#138: FILE: drivers/gpu/drm/i915/intel_wakeref.h:16:
+typedef depot_stack_handle_t intel_wakeref_t;

-:142: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#142: FILE: drivers/gpu/drm/i915/intel_wakeref.h:20:
+	struct mutex mutex;

total: 0 errors, 6 warnings, 1 checks, 159 lines checked
4370dbe405cc drm/i915: Pull the GEM powermangement coupling into its own file
-:499: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#499: 
new file mode 100644

-:504: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#504: FILE: drivers/gpu/drm/i915/i915_gem_pm.c:1:
+/*

-:505: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#505: FILE: drivers/gpu/drm/i915/i915_gem_pm.c:2:
+ * SPDX-License-Identifier: MIT

-:875: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#875: FILE: drivers/gpu/drm/i915/i915_gem_pm.h:1:
+/*

-:876: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#876: FILE: drivers/gpu/drm/i915/i915_gem_pm.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 5 warnings, 0 checks, 904 lines checked
d5b6883c1668 drm/i915: Introduce context->enter() and context->exit()
ba648fd79d72 drm/i915: Pass intel_context to i915_request_create()
8460d7b08fed drm/i915: Invert the GEM wakeref hierarchy
-:316: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#316: 
new file mode 100644

-:321: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#321: FILE: drivers/gpu/drm/i915/gt/intel_engine_pm.c:1:
+/*

-:322: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#322: FILE: drivers/gpu/drm/i915/gt/intel_engine_pm.c:2:
+ * SPDX-License-Identifier: MIT

-:489: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#489: FILE: drivers/gpu/drm/i915/gt/intel_engine_pm.h:1:
+/*

-:490: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#490: FILE: drivers/gpu/drm/i915/gt/intel_engine_pm.h:2:
+ * SPDX-License-Identifier: MIT

-:547: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#547: FILE: drivers/gpu/drm/i915/gt/intel_gt_pm.c:1:
+/*

-:548: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#548: FILE: drivers/gpu/drm/i915/gt/intel_gt_pm.c:2:
+ * SPDX-License-Identifier: MIT

-:691: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#691: FILE: drivers/gpu/drm/i915/gt/intel_gt_pm.h:1:
+/*

-:692: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#692: FILE: drivers/gpu/drm/i915/gt/intel_gt_pm.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 9 warnings, 0 checks, 2167 lines checked
603ca7f02cae drm/i915/gvt: Pin the per-engine GVT shadow contexts
54c85fab6c7c drm/i915: Explicitly pin the logical context for execbuf
f6ff7ee156ff drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
b283b276df90 drm/i915: Export intel_context_instance()
b556c5b31e3a drm/i915/selftests: Use the real kernel context for sseu isolation tests
df005e48a361 drm/i915/selftests: Pass around intel_context for sseu
af37325f8ba2 drm/i915: Pass intel_context to intel_context_pin_lock()
c72ee1eb49eb drm/i915: Split engine setup/init into two phases
4f8c507eab86 drm/i915: Switch back to an array of logical per-engine HW contexts
-:549: WARNING:LINE_SPACING: Missing a blank line after declarations
#549: FILE: drivers/gpu/drm/i915/i915_gem_context.h:214:
+		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
+		if (likely(idx < e->num_engines && e->engines[idx]))

total: 0 errors, 1 warnings, 0 checks, 885 lines checked
98c2daab49d5 drm/i915: Remove intel_context.active_link
dfffb979bb26 drm/i915: Move i915_request_alloc into selftests/
-:394: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#394: 
new file mode 100644

-:399: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#399: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.c:1:
+/*

-:400: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#400: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.c:2:
+ * SPDX-License-Identifier: MIT

-:448: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#448: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.h:1:
+/*

-:449: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#449: FILE: drivers/gpu/drm/i915/selftests/igt_gem_utils.h:2:
+ * SPDX-License-Identifier: MIT

total: 0 errors, 5 warnings, 0 checks, 394 lines checked
83efb37a5231 drm/i915: Allow multiple user handles to the same VM
fca3de8b6aaa drm/i915: Restore control over ppgtt for context creation ABI
-:80: WARNING:LONG_LINE: line over 100 characters
#80: FILE: include/uapi/drm/i915_drm.h:408:
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)

-:81: WARNING:LONG_LINE: line over 100 characters
#81: FILE: include/uapi/drm/i915_drm.h:409:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:81: WARNING:SPACING: space prohibited between function name and open parenthesis '('
#81: FILE: include/uapi/drm/i915_drm.h:409:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:81: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#81: FILE: include/uapi/drm/i915_drm.h:409:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

total: 1 errors, 3 warnings, 0 checks, 64 lines checked
0f931477a6b2 drm/i915: Allow a context to define its set of engines
-:415: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#415: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

-:415: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible side-effects?
#415: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

-:415: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'member' may be better as '(member)' to avoid precedence issues
#415: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

total: 0 errors, 0 warnings, 3 checks, 414 lines checked
45a6d1e2aba0 drm/i915: Allow userspace to clone contexts on creation
-:212: ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#212: FILE: drivers/gpu/drm/i915/i915_gem_context.c:1860:
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y

total: 1 errors, 0 warnings, 0 checks, 235 lines checked
9e4467a5ff80 drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
ec9245b02fa1 drm/i915: Load balancing across a virtual engine

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [01/29] drm/i915: Mark up ips for RCU protection
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (28 preceding siblings ...)
  2019-04-08  9:59 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/29] drm/i915: Mark up ips for RCU protection Patchwork
@ 2019-04-08 10:13 ` Patchwork
  2019-04-08 10:28 ` ✗ Fi.CI.BAT: failure " Patchwork
  2019-04-11 22:20 ` ✗ Fi.CI.BAT: failure for series starting with [01/29] drm/i915: Mark up ips for RCU protection (rev2) Patchwork
  31 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2019-04-08 10:13 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/29] drm/i915: Mark up ips for RCU protection
URL   : https://patchwork.freedesktop.org/series/59155/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Mark up ips for RCU protection
-drivers/gpu/drm/i915/intel_pm.c:8355:9: error: incompatible types in comparison expression (different address spaces)
-drivers/gpu/drm/i915/intel_pm.c:8362:9: error: incompatible types in comparison expression (different address spaces)

Commit: drm/i915/guc: Replace WARN with a DRM_ERROR
Okay!

Commit: drm/i915: Use static allocation for i915_globals_park()
Okay!

Commit: drm/i915: Consolidate the timeline->barrier
Okay!

Commit: drm/i915: Store the default sseu setup on the engine
-O:drivers/gpu/drm/i915/intel_lrc.c:2574:25: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/intel_sseu.c:82:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3617:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3603:16: warning: expression using sizeof(void)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Move GraphicsTechnology files under gt/
+drivers/gpu/drm/i915/gt/intel_context.c:130:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
+drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/gt/intel_reset.c:1353:5: warning: context imbalance in 'i915_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_sseu.c:82:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/gt/intel_reset.c:1353:5: warning: context imbalance in 'i915_reset_trylock' - different lock contexts for basic block
-drivers/gpu/drm/i915/gt/intel_context.c:130:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
-drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/gt/intel_engine.h:124:23: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/gt/intel_sseu.c:82:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3603:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3604:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
+./include/linux/slab.h:666:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/slab.h:666:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/slab.h:666:13: warning: call with no type!
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Only reset the pinned kernel contexts on resume
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3604:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3603:16: warning: expression using sizeof(void)

Commit: drm/i915: Introduce struct intel_wakeref
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3603:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3602:16: warning: expression using sizeof(void)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Pull the GEM powermangement coupling into its own file
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3602:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3604:16: warning: expression using sizeof(void)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Introduce context->enter() and context->exit()
Okay!

Commit: drm/i915: Pass intel_context to i915_request_create()
Okay!

Commit: drm/i915: Invert the GEM wakeref hierarchy
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3604:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3608:16: warning: expression using sizeof(void)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915/gvt: Pin the per-engine GVT shadow contexts
Okay!

Commit: drm/i915: Explicitly pin the logical context for execbuf
Okay!

Commit: drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
Okay!

Commit: drm/i915: Export intel_context_instance()
-O:drivers/gpu/drm/i915/gt/intel_context.c:131:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
+drivers/gpu/drm/i915/gt/intel_context.c:131:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
+drivers/gpu/drm/i915/gt/intel_context.c:150:6: warning: context imbalance in 'intel_context_pin_unlock' - wrong count at exit

Commit: drm/i915/selftests: Use the real kernel context for sseu isolation tests
Okay!

Commit: drm/i915/selftests: Pass around intel_context for sseu
Okay!

Commit: drm/i915: Pass intel_context to intel_context_pin_lock()
-O:drivers/gpu/drm/i915/gt/intel_context.c:131:22: warning: context imbalance in 'intel_context_pin_lock' - wrong count at exit
-O:drivers/gpu/drm/i915/gt/intel_context.c:150:6: warning: context imbalance in 'intel_context_pin_unlock' - wrong count at exit

Commit: drm/i915: Split engine setup/init into two phases
Okay!

Commit: drm/i915: Switch back to an array of logical per-engine HW contexts
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier '__builtin_add_overflow'
+./include/linux/overflow.h:287:13:    got void
+./include/linux/overflow.h:287:13: warning: call with no type!

Commit: drm/i915: Remove intel_context.active_link
Okay!

Commit: drm/i915: Move i915_request_alloc into selftests/
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)

Commit: drm/i915: Allow multiple user handles to the same VM
Okay!

Commit: drm/i915: Restore control over ppgtt for context creation ABI
Okay!

Commit: drm/i915: Allow a context to define its set of engines
+drivers/gpu/drm/i915/i915_utils.h:84:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:84:13: error: undefined identifier '__builtin_mul_overflow'
+drivers/gpu/drm/i915/i915_utils.h:84:13:    got void
+drivers/gpu/drm/i915/i915_utils.h:84:13: warning: call with no type!
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: undefined identifier '__builtin_add_overflow'
+drivers/gpu/drm/i915/i915_utils.h:87:13:    got void
+drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function <noident>
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function <noident>
+./include/linux/overflow.h:287:13:    got void

Commit: drm/i915: Allow userspace to clone contexts on creation
+drivers/gpu/drm/i915/i915_gem_context.c:1861:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1862:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1863:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1864:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1865:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1866:17: error: bad integer constant expression
-drivers/gpu/drm/i915/i915_utils.h:84:13: warning: call with no type!
-drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1266:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1266:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:454:16: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:571:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:571:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:33: warning: expression using sizeof(void)
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function <noident>
-./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function <noident>
-./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/overflow.h:287:13:    got void
-./include/linux/slab.h:666:13: warning: call with no type!

Commit: drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
Okay!

Commit: drm/i915: Load balancing across a virtual engine
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier '__builtin_add_overflow'
+./include/linux/overflow.h:287:13:    got void
+./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/slab.h:666:13: error: not a function <noident>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/29] drm/i915: Mark up ips for RCU protection
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (29 preceding siblings ...)
  2019-04-08 10:13 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-04-08 10:28 ` Patchwork
  2019-04-08 10:37   ` Chris Wilson
  2019-04-11 22:20 ` ✗ Fi.CI.BAT: failure for series starting with [01/29] drm/i915: Mark up ips for RCU protection (rev2) Patchwork
  31 siblings, 1 reply; 71+ messages in thread
From: Patchwork @ 2019-04-08 10:28 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/29] drm/i915: Mark up ips for RCU protection
URL   : https://patchwork.freedesktop.org/series/59155/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_5887 -> Patchwork_12723
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_12723 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_12723, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/59155/revisions/1/mbox/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_12723:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-apl-guc:         PASS -> DMESG-WARN

  * igt@i915_selftest@live_execlists:
    - fi-bdw-5557u:       PASS -> INCOMPLETE

  * igt@i915_selftest@live_uncore:
    - fi-bxt-j4205:       PASS -> DMESG-FAIL

  * igt@runner@aborted:
    - fi-cfl-8109u:       NOTRUN -> FAIL
    - fi-kbl-x1275:       NOTRUN -> FAIL

  
Known issues
------------

  Here are the changes found in Patchwork_12723 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-blb-e6850:       PASS -> INCOMPLETE [fdo#107718]

  * igt@gem_tiled_fence_blits@basic:
    - fi-apl-guc:         PASS -> SKIP [fdo#109271] +39

  * igt@gem_workarounds@basic-read:
    - fi-snb-2600:        NOTRUN -> SKIP [fdo#109271] +57

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      PASS -> FAIL [fdo#108511]

  * igt@i915_selftest@live_hangcheck:
    - fi-cfl-8109u:       PASS -> INCOMPLETE [fdo#106070]
    - fi-icl-u2:          PASS -> INCOMPLETE [fdo#108569]
    - fi-kbl-x1275:       PASS -> INCOMPLETE [fdo#108744]
    - fi-bxt-dsi:         PASS -> INCOMPLETE [fdo#103927]

  * igt@kms_busy@basic-flip-c:
    - fi-snb-2600:        NOTRUN -> SKIP [fdo#109271] / [fdo#109278]

  * igt@runner@aborted:
    - fi-icl-u2:          NOTRUN -> FAIL [fdo#108569]

  
#### Possible fixes ####

  * igt@i915_selftest@live_execlists:
    - fi-apl-guc:         INCOMPLETE [fdo#103927] / [fdo#109720] -> PASS

  * igt@i915_selftest@live_hangcheck:
    - fi-ilk-650:         INCOMPLETE [fdo#109723] -> PASS

  * igt@i915_selftest@live_uncore:
    - fi-ivb-3770:        DMESG-FAIL [fdo#110210] -> PASS

  
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#106070]: https://bugs.freedesktop.org/show_bug.cgi?id=106070
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#108744]: https://bugs.freedesktop.org/show_bug.cgi?id=108744
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109720]: https://bugs.freedesktop.org/show_bug.cgi?id=109720
  [fdo#109723]: https://bugs.freedesktop.org/show_bug.cgi?id=109723
  [fdo#110210]: https://bugs.freedesktop.org/show_bug.cgi?id=110210


Participating hosts (50 -> 44)
------------------------------

  Additional (1): fi-snb-2600 
  Missing    (7): fi-kbl-soraka fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-icl-u3 fi-bdw-samus 


Build changes
-------------

    * Linux: CI_DRM_5887 -> Patchwork_12723

  CI_DRM_5887: 360bc4546928ec1c9324ddd3991a01e89476a16c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4932: 08cf63a8fac11e3594b57580331fb319241a0d69 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12723: ec9245b02fa1d9c76177e5eafdeb2651d5c7e9a7 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ec9245b02fa1 drm/i915: Load balancing across a virtual engine
9e4467a5ff80 drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
45a6d1e2aba0 drm/i915: Allow userspace to clone contexts on creation
0f931477a6b2 drm/i915: Allow a context to define its set of engines
fca3de8b6aaa drm/i915: Restore control over ppgtt for context creation ABI
83efb37a5231 drm/i915: Allow multiple user handles to the same VM
dfffb979bb26 drm/i915: Move i915_request_alloc into selftests/
98c2daab49d5 drm/i915: Remove intel_context.active_link
4f8c507eab86 drm/i915: Switch back to an array of logical per-engine HW contexts
c72ee1eb49eb drm/i915: Split engine setup/init into two phases
af37325f8ba2 drm/i915: Pass intel_context to intel_context_pin_lock()
df005e48a361 drm/i915/selftests: Pass around intel_context for sseu
b556c5b31e3a drm/i915/selftests: Use the real kernel context for sseu isolation tests
b283b276df90 drm/i915: Export intel_context_instance()
f6ff7ee156ff drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
54c85fab6c7c drm/i915: Explicitly pin the logical context for execbuf
603ca7f02cae drm/i915/gvt: Pin the per-engine GVT shadow contexts
8460d7b08fed drm/i915: Invert the GEM wakeref hierarchy
ba648fd79d72 drm/i915: Pass intel_context to i915_request_create()
d5b6883c1668 drm/i915: Introduce context->enter() and context->exit()
4370dbe405cc drm/i915: Pull the GEM powermangement coupling into its own file
99753b992fca drm/i915: Introduce struct intel_wakeref
bc642d18eac6 drm/i915: Only reset the pinned kernel contexts on resume
5b9f9fa1d320 drm/i915: Move GraphicsTechnology files under gt/
7af25481d1f9 drm/i915: Store the default sseu setup on the engine
c38bedbb5aef drm/i915: Consolidate the timeline->barrier
c9580bf3733f drm/i915: Use static allocation for i915_globals_park()
bb6133046abe drm/i915/guc: Replace WARN with a DRM_ERROR
17875d38d33a drm/i915: Mark up ips for RCU protection

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12723/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for series starting with [01/29] drm/i915: Mark up ips for RCU protection
  2019-04-08 10:28 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2019-04-08 10:37   ` Chris Wilson
  0 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08 10:37 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

Quoting Patchwork (2019-04-08 11:28:00)
> == Series Details ==
> 
> Series: series starting with [01/29] drm/i915: Mark up ips for RCU protection
> URL   : https://patchwork.freedesktop.org/series/59155/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_5887 -> Patchwork_12723
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_12723 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_12723, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://patchwork.freedesktop.org/api/1.0/series/59155/revisions/1/mbox/
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_12723:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@gem_exec_suspend@basic-s3:
>     - fi-apl-guc:         PASS -> DMESG-WARN

GuC resume is snafu; apparently we don't try to re-enable GuC during
resume.

>   * igt@i915_selftest@live_execlists:
>     - fi-bdw-5557u:       PASS -> INCOMPLETE

Uh. Not sure, first time seeing that error.

>   * igt@i915_selftest@live_hangcheck:
>     - fi-cfl-8109u:       PASS -> INCOMPLETE [fdo#106070]
>     - fi-icl-u2:          PASS -> INCOMPLETE [fdo#108569]
>     - fi-kbl-x1275:       PASS -> INCOMPLETE [fdo#108744]
>     - fi-bxt-dsi:         PASS -> INCOMPLETE [fdo#103927]

I have a working theory for these after staring at the dumps over the
weekend. When we reset the engine, we clobber the context, but since
the request had already completed, we don't restore the RING registers.
So the next request begins from the start of the ring -- causing the
seqno to be reset, and that just happens to coincide with us retiring.
As I understand it, this is not a new bug, just easier to hit now.
Explains a lot though. We need to track the active context
independently of the request during retire, and I feel more confident
in doing that after preempt-to-busy.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread
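
As a toy model of that theory -- plain userspace C, no i915 code; the
mechanism follows the paragraph above and everything else is invented
for illustration -- skipping the register restore for an already
completed request makes the next emission restart from ring offset 0:

#include <stdbool.h>
#include <stdio.h>

struct ring_regs { unsigned int head, tail; };

/* A reset clobbers the ring registers; they are reloaded from the
 * saved context image only if a request is still in flight. */
static void reset_engine(struct ring_regs *hw,
			 const struct ring_regs *ctx_image,
			 bool request_in_flight)
{
	hw->head = hw->tail = 0;
	if (request_in_flight)
		*hw = *ctx_image;
}

static unsigned int emit_request(struct ring_regs *hw, unsigned int sz)
{
	unsigned int offset = hw->tail;

	hw->tail += sz;
	return offset;
}

int main(void)
{
	struct ring_regs hw = { 0, 0 }, ctx_image;

	printf("rq1 emitted at %u\n", emit_request(&hw, 64));
	printf("rq2 emitted at %u\n", emit_request(&hw, 64));
	ctx_image = hw;	/* registers saved on context switch */

	/* rq2 completed just before the hang was declared... */
	reset_engine(&hw, &ctx_image, false);

	/* ...so rq3 restarts from offset 0, aliasing rq1's breadcrumbs
	 * and rewinding the seqno the retirer is watching. */
	printf("rq3 emitted at %u (expected %u)\n",
	       emit_request(&hw, 64), ctx_image.tail);
	return 0;
}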

* Re: [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR
  2019-04-08  9:17 ` [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR Chris Wilson
@ 2019-04-08 14:26   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 14:26 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Replace the WARN with a simple if() + error message to squelch the sparse
> warning that the entire wait_for() macro was being stringified:
> 
> drivers/gpu/drm/i915/intel_guc_submission.c:658:9: error: too long token expansion
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_guc_submission.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index c58922226d47..42fcd622d7a3 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -650,9 +650,10 @@ static void wait_for_guc_preempt_report(struct intel_engine_cs *engine)
>   	struct guc_ctx_report *report =
>   		&data->preempt_ctx_report[engine->guc_id];
>   
> -	WARN_ON(wait_for_atomic(report->report_return_status ==
> -				INTEL_GUC_REPORT_STATUS_COMPLETE,
> -				GUC_PREEMPT_POSTPROCESS_DELAY_MS));
> +	if (wait_for_atomic(report->report_return_status ==
> +			    INTEL_GUC_REPORT_STATUS_COMPLETE,
> +			    GUC_PREEMPT_POSTPROCESS_DELAY_MS))
> +		DRM_ERROR("Timed out waiting for GuC preemption report\n");
>   	/*
>   	 * GuC is expecting that we're also going to clear the affected context
>   	 * counter, let's also reset the return status to not depend on GuC
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 03/29] drm/i915: Use static allocation for i915_globals_park()
  2019-04-08  9:17 ` [PATCH 03/29] drm/i915: Use static allocation for i915_globals_park() Chris Wilson
@ 2019-04-08 14:31   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 14:31 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> In order to avoid the malloc inside i915_globals_park() occurring
> underneath a lock connected to the shrinker (thus causing circular
> lockdep warnings), move the rcu_work to a static global.
> 
> <4> [39.085073] ======================================================
> <4> [39.085273] WARNING: possible circular locking dependency detected
> <4> [39.085552] 5.1.0-rc3-CI-Trybot_4088+ #1 Tainted: G     U
> <4> [39.085752] ------------------------------------------------------
> <4> [39.085949] kswapd0/32 is trying to acquire lock:
> <4> [39.086121] 00000000004b5f91 (wakeref#3){+.+.}, at: intel_engine_pm_put+0x1b/0x40 [i915]
> <4> [39.086493]
> but task is already holding lock:
> <4> [39.086682] 00000000dd009a9a (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
> <4> [39.086910]
> which lock already depends on the new lock.
> 
> <4> [39.087139]
> the existing dependency chain (in reverse order) is:
> <4> [39.087356]
> -> #2 (fs_reclaim){+.+.}:
> <4> [39.087604]        fs_reclaim_acquire.part.24+0x24/0x30
> <4> [39.087785]        kmem_cache_alloc_trace+0x2a/0x290
> <4> [39.087998]        i915_globals_park+0x22/0xa0 [i915]
> <4> [39.088478]        idle_work_handler+0x1df/0x220 [i915]
> <4> [39.089016]        process_one_work+0x245/0x610
> <4> [39.089447]        worker_thread+0x37/0x380
> <4> [39.089956]        kthread+0x119/0x130
> <4> [39.090374]        ret_from_fork+0x3a/0x50
> <4> [39.090868]
> -> #1 (wakeref#4){+.+.}:
> <4> [39.091569]        __mutex_lock+0x8c/0x960
> <4> [39.092054]        atomic_dec_and_mutex_lock+0x33/0x50
> <4> [39.092521]        intel_gt_pm_put+0x1b/0x40 [i915]
> <4> [39.093047]        intel_engine_park+0xeb/0x1d0 [i915]
> <4> [39.093514]        __intel_wakeref_put_once+0x10/0x30 [i915]
> <4> [39.094062]        i915_request_retire+0x477/0xaf0 [i915]
> <4> [39.094547]        ring_retire_requests+0x86/0x160 [i915]
> <4> [39.095110]        i915_retire_requests+0x58/0xc0 [i915]
> <4> [39.095587]        i915_gem_wait_for_idle.part.22+0xb2/0xf0 [i915]
> <4> [39.096142]        switch_to_kernel_context_sync+0x2a/0x70 [i915]
> <4> [39.096633]        i915_gem_init+0x59c/0x9c0 [i915]
> <4> [39.097174]        i915_driver_load+0xd96/0x1880 [i915]
> <4> [39.097640]        i915_pci_probe+0x29/0xa0 [i915]
> <4> [39.098145]        pci_device_probe+0xa1/0x120
> <4> [39.098607]        really_probe+0xf3/0x3e0
> <4> [39.099031]        driver_probe_device+0x10a/0x120
> <4> [39.099599]        device_driver_attach+0x4b/0x50
> <4> [39.100033]        __driver_attach+0x97/0x130
> <4> [39.100525]        bus_for_each_dev+0x74/0xc0
> <4> [39.100954]        bus_add_driver+0x13f/0x210
> <4> [39.101441]        driver_register+0x56/0xe0
> <4> [39.101891]        do_one_initcall+0x58/0x2e0
> <4> [39.102319]        do_init_module+0x56/0x1ea
> <4> [39.102805]        load_module+0x2701/0x29e0
> <4> [39.103231]        __se_sys_finit_module+0xd3/0xf0
> <4> [39.103727]        do_syscall_64+0x55/0x190
> <4> [39.104153]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> <4> [39.104736]
> -> #0 (wakeref#3){+.+.}:
> <4> [39.105437]        lock_acquire+0xa6/0x1c0
> <4> [39.105923]        __mutex_lock+0x8c/0x960
> <4> [39.106345]        atomic_dec_and_mutex_lock+0x33/0x50
> <4> [39.106897]        intel_engine_pm_put+0x1b/0x40 [i915]
> <4> [39.107375]        i915_request_retire+0x477/0xaf0 [i915]
> <4> [39.107930]        ring_retire_requests+0x86/0x160 [i915]
> <4> [39.108412]        i915_retire_requests+0x58/0xc0 [i915]
> <4> [39.108934]        i915_gem_shrink+0xd8/0x5b0 [i915]
> <4> [39.109431]        i915_gem_shrinker_scan+0x59/0x130 [i915]
> <4> [39.109884]        do_shrink_slab+0x131/0x3e0
> <4> [39.110380]        shrink_slab+0x228/0x2c0
> <4> [39.110810]        shrink_node+0x177/0x460
> <4> [39.111317]        balance_pgdat+0x239/0x580
> <4> [39.111743]        kswapd+0x186/0x570
> <4> [39.112221]        kthread+0x119/0x130
> <4> [39.112641]        ret_from_fork+0x3a/0x50
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_globals.c | 74 +++++++++++++----------------
>   1 file changed, 32 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_globals.c b/drivers/gpu/drm/i915/i915_globals.c
> index 2f5c72e2a9d1..81e5c2ce336b 100644
> --- a/drivers/gpu/drm/i915/i915_globals.c
> +++ b/drivers/gpu/drm/i915/i915_globals.c
> @@ -17,6 +17,33 @@
>   
>   static LIST_HEAD(globals);
>   
> +static atomic_t active;
> +static atomic_t epoch;
> +static struct park_work {
> +	struct rcu_work work;
> +	int epoch;
> +} park;
> +
> +static void i915_globals_shrink(void)
> +{
> +	struct i915_global *global;
> +
> +	/*
> +	 * kmem_cache_shrink() discards empty slabs and reorders partially
> +	 * filled slabs to prioritise allocating from the mostly full slabs,
> +	 * with the aim of reducing fragmentation.
> +	 */
> +	list_for_each_entry(global, &globals, link)
> +		global->shrink();
> +}
> +
> +static void __i915_globals_park(struct work_struct *work)
> +{
> +	/* Confirm nothing woke up in the last grace period */
> +	if (park.epoch == atomic_read(&epoch))
> +		i915_globals_shrink();
> +}
> +
>   void __init i915_global_register(struct i915_global *global)
>   {
>   	GEM_BUG_ON(!global->shrink);
> @@ -57,44 +84,12 @@ int __init i915_globals_init(void)
>   		}
>   	}
>   
> +	INIT_RCU_WORK(&park.work, __i915_globals_park);
>   	return 0;
>   }
>   
> -static void i915_globals_shrink(void)
> -{
> -	struct i915_global *global;
> -
> -	/*
> -	 * kmem_cache_shrink() discards empty slabs and reorders partially
> -	 * filled slabs to prioritise allocating from the mostly full slabs,
> -	 * with the aim of reducing fragmentation.
> -	 */
> -	list_for_each_entry(global, &globals, link)
> -		global->shrink();
> -}
> -
> -static atomic_t active;
> -static atomic_t epoch;
> -struct park_work {
> -	struct rcu_work work;
> -	int epoch;
> -};
> -
> -static void __i915_globals_park(struct work_struct *work)
> -{
> -	struct park_work *wrk = container_of(work, typeof(*wrk), work.work);
> -
> -	/* Confirm nothing woke up in the last grace period */
> -	if (wrk->epoch == atomic_read(&epoch))
> -		i915_globals_shrink();
> -
> -	kfree(wrk);
> -}
> -
>   void i915_globals_park(void)
>   {
> -	struct park_work *wrk;
> -
>   	/*
>   	 * Defer shrinking the global slab caches (and other work) until
>   	 * after a RCU grace period has completed with no activity. This
> @@ -107,13 +102,8 @@ void i915_globals_park(void)
>   	if (!atomic_dec_and_test(&active))
>   		return;
>   
> -	wrk = kmalloc(sizeof(*wrk), GFP_KERNEL);
> -	if (!wrk)
> -		return;
> -
> -	wrk->epoch = atomic_inc_return(&epoch);
> -	INIT_RCU_WORK(&wrk->work, __i915_globals_park);
> -	queue_rcu_work(system_wq, &wrk->work);
> +	park.epoch = atomic_inc_return(&epoch);
> +	queue_rcu_work(system_wq, &park.work);
>   }
>   
>   void i915_globals_unpark(void)
> @@ -125,8 +115,8 @@ void i915_globals_unpark(void)
>   void __exit i915_globals_exit(void)
>   {
>   	/* Flush any residual park_work */
> -	rcu_barrier();
> -	flush_scheduled_work();
> +	atomic_inc(&epoch);
> +	flush_rcu_work(&park.work);
>   
>   	__i915_globals_cleanup();
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread
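
The shape of the fix may be easier to see stripped of the kernel
machinery: a single static work item plus an epoch counter replaces the
per-park kmalloc, so parking never allocates. A userspace analogue (the
RCU grace period and workqueue are stubbed out as direct calls; only
the epoch logic mirrors the patch):

#include <stdatomic.h>
#include <stdio.h>

static atomic_int active;
static atomic_int epoch;
static struct park_work {
	int epoch;	/* epoch sampled when the work was queued */
} park;

static void globals_shrink(void)
{
	puts("idle: shrinking slab caches");
}

static void park_worker(void)	/* stands in for the rcu_work callback */
{
	/* Confirm nothing woke up in the last grace period. */
	if (park.epoch == atomic_load(&epoch))
		globals_shrink();
	else
		puts("woken during grace period, shrink skipped");
}

static void globals_park(void)
{
	if (atomic_fetch_sub(&active, 1) != 1)
		return;	/* still active elsewhere */
	park.epoch = atomic_fetch_add(&epoch, 1) + 1;
	/* kernel: queue_rcu_work(system_wq, &park.work) */
}

static void globals_unpark(void)
{
	atomic_fetch_add(&epoch, 1);
	atomic_fetch_add(&active, 1);
}

int main(void)
{
	atomic_store(&active, 1);

	globals_park();		/* device idles; work queued */
	globals_unpark();	/* ...but wakes before the grace period */
	park_worker();		/* stale epoch: no shrink */

	globals_park();		/* idles again and stays idle */
	park_worker();		/* epochs match: shrink runs */
	return 0;
}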

* Re: [PATCH 04/29] drm/i915: Consolidate the timeline->barrier
  2019-04-08  9:17 ` [PATCH 04/29] drm/i915: Consolidate the timeline->barrier Chris Wilson
@ 2019-04-08 14:42   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 14:42 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> The timeline is strictly ordered, so inserting the timeline->barrier
> request into timeline->last_request naturally provides the same
> barrier. Consolidate the pair of barriers into one, as they serve the
> same purpose.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c        | 13 ++-----------
>   drivers/gpu/drm/i915/i915_request.c            |  9 ---------
>   drivers/gpu/drm/i915/i915_timeline.c           |  2 --
>   drivers/gpu/drm/i915/i915_timeline.h           | 15 ---------------
>   drivers/gpu/drm/i915/i915_timeline_types.h     | 10 ----------
>   drivers/gpu/drm/i915/selftests/mock_timeline.c |  1 -
>   6 files changed, 2 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 66b6852cb4d2..7fc34ab6df87 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -1167,7 +1167,7 @@ static int
>   gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   {
>   	struct drm_i915_private *i915 = ce->engine->i915;
> -	struct i915_request *rq, *prev;
> +	struct i915_request *rq;
>   	intel_wakeref_t wakeref;
>   	int ret;
>   
> @@ -1192,16 +1192,7 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   	}
>   
>   	/* Queue this switch after all other activity by this context. */
> -	prev = i915_active_request_raw(&ce->ring->timeline->last_request,
> -				       &i915->drm.struct_mutex);
> -	if (prev && !i915_request_completed(prev)) {
> -		ret = i915_request_await_dma_fence(rq, &prev->fence);
> -		if (ret < 0)
> -			goto out_add;
> -	}
> -
> -	/* Order all following requests to be after. */
> -	ret = i915_timeline_set_barrier(ce->ring->timeline, rq);
> +	ret = i915_active_request_set(&ce->ring->timeline->last_request, rq);
>   	if (ret)
>   		goto out_add;
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 2da0d6436a1a..96a9e8bcd805 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -584,11 +584,6 @@ i915_request_alloc_slow(struct intel_context *ce)
>   	return kmem_cache_alloc(global.slab_requests, GFP_KERNEL);
>   }
>   
> -static int add_timeline_barrier(struct i915_request *rq)
> -{
> -	return i915_request_await_active_request(rq, &rq->timeline->barrier);
> -}
> -
>   /**
>    * i915_request_alloc - allocate a request structure
>    *
> @@ -738,10 +733,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 */
>   	rq->head = rq->ring->emit;
>   
> -	ret = add_timeline_barrier(rq);
> -	if (ret)
> -		goto err_unwind;
> -
>   	ret = engine->request_alloc(rq);
>   	if (ret)
>   		goto err_unwind;
> diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
> index 2f4907364920..5fbea0892f33 100644
> --- a/drivers/gpu/drm/i915/i915_timeline.c
> +++ b/drivers/gpu/drm/i915/i915_timeline.c
> @@ -253,7 +253,6 @@ int i915_timeline_init(struct drm_i915_private *i915,
>   	spin_lock_init(&timeline->lock);
>   	mutex_init(&timeline->mutex);
>   
> -	INIT_ACTIVE_REQUEST(&timeline->barrier);
>   	INIT_ACTIVE_REQUEST(&timeline->last_request);
>   	INIT_LIST_HEAD(&timeline->requests);
>   
> @@ -326,7 +325,6 @@ void i915_timeline_fini(struct i915_timeline *timeline)
>   {
>   	GEM_BUG_ON(timeline->pin_count);
>   	GEM_BUG_ON(!list_empty(&timeline->requests));
> -	GEM_BUG_ON(i915_active_request_isset(&timeline->barrier));
>   
>   	i915_syncmap_free(&timeline->sync);
>   
> diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
> index 4ca7f80bdf6d..27668a1a69a3 100644
> --- a/drivers/gpu/drm/i915/i915_timeline.h
> +++ b/drivers/gpu/drm/i915/i915_timeline.h
> @@ -110,19 +110,4 @@ void i915_timelines_init(struct drm_i915_private *i915);
>   void i915_timelines_park(struct drm_i915_private *i915);
>   void i915_timelines_fini(struct drm_i915_private *i915);
>   
> -/**
> - * i915_timeline_set_barrier - orders submission between different timelines
> - * @timeline: timeline to set the barrier on
> - * @rq: request after which new submissions can proceed
> - *
> - * Sets the passed in request as the serialization point for all subsequent
> - * submissions on @timeline. Subsequent requests will not be submitted to GPU
> - * until the barrier has been completed.
> - */
> -static inline int
> -i915_timeline_set_barrier(struct i915_timeline *tl, struct i915_request *rq)
> -{
> -	return i915_active_request_set(&tl->barrier, rq);
> -}
> -
>   #endif
> diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
> index 1f5b55d9ffb5..5256a0b5c5f7 100644
> --- a/drivers/gpu/drm/i915/i915_timeline_types.h
> +++ b/drivers/gpu/drm/i915/i915_timeline_types.h
> @@ -61,16 +61,6 @@ struct i915_timeline {
>   	 */
>   	struct i915_syncmap *sync;
>   
> -	/**
> -	 * Barrier provides the ability to serialize ordering between different
> -	 * timelines.
> -	 *
> -	 * Users can call i915_timeline_set_barrier which will make all
> -	 * subsequent submissions to this timeline be executed only after the
> -	 * barrier has been completed.
> -	 */
> -	struct i915_active_request barrier;
> -
>   	struct list_head link;
>   	struct drm_i915_private *i915;
>   
> diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c
> index 416d85233263..e084476469ef 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_timeline.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c
> @@ -16,7 +16,6 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context)
>   	spin_lock_init(&timeline->lock);
>   	mutex_init(&timeline->mutex);
>   
> -	INIT_ACTIVE_REQUEST(&timeline->barrier);
>   	INIT_ACTIVE_REQUEST(&timeline->last_request);
>   	INIT_LIST_HEAD(&timeline->requests);
>   
> 

Modulo the persistently impenetrable sw fence await flavours, this is 
indeed the same. I guess it goes under the thinking-outside-the-box 
category: tracking a foreign context's request on a different timeline.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread
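
A toy rendering of the argument, under the strict-ordering premise the
commit message states (plain C, not i915 code): once every new request
implicitly awaits the current last_request before replacing it,
installing a request there already orders all later submissions after
it -- which is all the dedicated ->barrier slot provided:

#include <stdio.h>

struct request {
	int id;
	const struct request *await;	/* implicit timeline dependency */
};

static const struct request *last_request;

static void add_request(struct request *rq)
{
	rq->await = last_request;	/* await the old tail... */
	last_request = rq;		/* ...i915_active_request_set() */
}

int main(void)
{
	struct request barrier = { .id = 1 };	/* the would-be barrier */
	struct request later = { .id = 2 };

	add_request(&barrier);
	add_request(&later);

	/* Every later request is already ordered after the barrier. */
	printf("request %d awaits request %d\n",
	       later.id, later.await ? later.await->id : 0);
	return 0;
}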

* Re: [PATCH 05/29] drm/i915: Store the default sseu setup on the engine
  2019-04-08  9:17 ` [PATCH 05/29] drm/i915: Store the default sseu setup on the engine Chris Wilson
@ 2019-04-08 14:54   ` Tvrtko Ursulin
  2019-04-08 15:57     ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 14:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> As we push for better compartmentalisation, it is more convenient to
> copy the default sseu configuration from the engine into the derived
> logical context, than it is to dig it out from i915->runtime_info.

Why is it more convenient?

> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 +
>   drivers/gpu/drm/i915/Makefile.header-test     |   1 +
>   drivers/gpu/drm/i915/i915_drv.h               |  14 --
>   drivers/gpu/drm/i915/i915_gem_context.c       |   2 +-
>   drivers/gpu/drm/i915/i915_perf.c              |   2 +-
>   drivers/gpu/drm/i915/intel_context.c          |   4 +-
>   drivers/gpu/drm/i915/intel_context_types.h    |  11 +-
>   drivers/gpu/drm/i915/intel_device_info.h      |  28 +---
>   drivers/gpu/drm/i915/intel_engine_cs.c        |   4 +
>   drivers/gpu/drm/i915/intel_engine_types.h     |   3 +
>   drivers/gpu/drm/i915/intel_lrc.c              | 134 +----------------
>   drivers/gpu/drm/i915/intel_lrc.h              |   2 -
>   drivers/gpu/drm/i915/intel_sseu.c             | 142 ++++++++++++++++++
>   drivers/gpu/drm/i915/intel_sseu.h             |  67 +++++++++
>   .../gpu/drm/i915/selftests/i915_gem_context.c |   5 +-
>   15 files changed, 226 insertions(+), 194 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/intel_sseu.c
>   create mode 100644 drivers/gpu/drm/i915/intel_sseu.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index fbcb0904f4a8..53ff209b91bb 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -95,6 +95,7 @@ i915-y += \
>   	  intel_lrc.o \
>   	  intel_mocs.o \
>   	  intel_ringbuffer.o \
> +	  intel_sseu.o \
>   	  intel_uncore.o \
>   	  intel_wopcm.o
>   
> diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
> index c1c391816fa7..5bcc78d7ac96 100644
> --- a/drivers/gpu/drm/i915/Makefile.header-test
> +++ b/drivers/gpu/drm/i915/Makefile.header-test
> @@ -33,6 +33,7 @@ header_test := \
>   	intel_psr.h \
>   	intel_sdvo.h \
>   	intel_sprite.h \
> +	intel_sseu.h \
>   	intel_tv.h \
>   	intel_workarounds_types.h
>   
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 63eca3061d10..6e46fbc7b350 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3388,20 +3388,6 @@ mkwrite_device_info(struct drm_i915_private *dev_priv)
>   	return (struct intel_device_info *)INTEL_INFO(dev_priv);
>   }
>   
> -static inline struct intel_sseu
> -intel_device_default_sseu(struct drm_i915_private *i915)
> -{
> -	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
> -	struct intel_sseu value = {
> -		.slice_mask = sseu->slice_mask,
> -		.subslice_mask = sseu->subslice_mask[0],
> -		.min_eus_per_subslice = sseu->max_eus_per_subslice,
> -		.max_eus_per_subslice = sseu->max_eus_per_subslice,
> -	};
> -
> -	return value;
> -}
> -
>   /* modesetting */
>   extern void intel_modeset_init_hw(struct drm_device *dev);
>   extern int intel_modeset_init(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 7fc34ab6df87..0883a879d3e7 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -1156,7 +1156,7 @@ static int gen8_emit_rpcs_config(struct i915_request *rq,
>   	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
>   	*cs++ = lower_32_bits(offset);
>   	*cs++ = upper_32_bits(offset);
> -	*cs++ = gen8_make_rpcs(rq->i915, &sseu);
> +	*cs++ = intel_sseu_make_rpcs(rq->i915, &sseu);
>   
>   	intel_ring_advance(rq, cs);
>   
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 39a4804091d7..56da457bed21 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -1679,7 +1679,7 @@ gen8_update_reg_state_unlocked(struct intel_context *ce,
>   
>   	CTX_REG(reg_state,
>   		CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
> -		gen8_make_rpcs(i915, &ce->sseu));
> +		intel_sseu_make_rpcs(i915, &ce->sseu));
>   }
>   
>   /*
> diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
> index 8931e0fee873..961d1445833d 100644
> --- a/drivers/gpu/drm/i915/intel_context.c
> +++ b/drivers/gpu/drm/i915/intel_context.c
> @@ -230,15 +230,13 @@ intel_context_init(struct intel_context *ce,
>   	ce->gem_context = ctx;
>   	ce->engine = engine;
>   	ce->ops = engine->cops;
> +	ce->sseu = engine->sseu;
>   
>   	INIT_LIST_HEAD(&ce->signal_link);
>   	INIT_LIST_HEAD(&ce->signals);
>   
>   	mutex_init(&ce->pin_mutex);
>   
> -	/* Use the whole device by default */
> -	ce->sseu = intel_device_default_sseu(ctx->i915);
> -
>   	i915_active_request_init(&ce->active_tracker,
>   				 NULL, intel_context_retire);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/intel_context_types.h
> index 624729a35875..2109610e21e8 100644
> --- a/drivers/gpu/drm/i915/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/intel_context_types.h
> @@ -14,6 +14,7 @@
>   #include <linux/types.h>
>   
>   #include "i915_active_types.h"
> +#include "intel_sseu.h"
>   
>   struct i915_gem_context;
>   struct i915_vma;
> @@ -27,16 +28,6 @@ struct intel_context_ops {
>   	void (*destroy)(struct kref *kref);
>   };
>   
> -/*
> - * Powergating configuration for a particular (context,engine).
> - */
> -struct intel_sseu {
> -	u8 slice_mask;
> -	u8 subslice_mask;
> -	u8 min_eus_per_subslice;
> -	u8 max_eus_per_subslice;
> -};
> -
>   struct intel_context {
>   	struct kref ref;
>   
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 0e579f158016..3045e0dee2a1 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -29,6 +29,7 @@
>   
>   #include "intel_engine_types.h"
>   #include "intel_display.h"
> +#include "intel_sseu.h"
>   
>   struct drm_printer;
>   struct drm_i915_private;
> @@ -139,33 +140,6 @@ enum intel_ppgtt_type {
>   	func(overlay_needs_physical); \
>   	func(supports_tv);
>   
> -#define GEN_MAX_SLICES		(6) /* CNL upper bound */
> -#define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
> -
> -struct sseu_dev_info {
> -	u8 slice_mask;
> -	u8 subslice_mask[GEN_MAX_SLICES];
> -	u16 eu_total;
> -	u8 eu_per_subslice;
> -	u8 min_eu_in_pool;
> -	/* For each slice, which subslice(s) has(have) 7 EUs (bitfield)? */
> -	u8 subslice_7eu[3];
> -	u8 has_slice_pg:1;
> -	u8 has_subslice_pg:1;
> -	u8 has_eu_pg:1;
> -
> -	/* Topology fields */
> -	u8 max_slices;
> -	u8 max_subslices;
> -	u8 max_eus_per_subslice;
> -
> -	/* We don't have more than 8 eus per subslice at the moment and as we
> -	 * store eus enabled using bits, no need to multiply by eus per
> -	 * subslice.
> -	 */
> -	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
> -};
> -
>   struct intel_device_info {
>   	u16 gen_mask;
>   
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index d0427c2e3997..62f4b7cfe0df 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -588,6 +588,10 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
>   	intel_engine_init_batch_pool(engine);
>   	intel_engine_init_cmd_parser(engine);
>   
> +	/* Use the whole device by default */
> +	engine->sseu =
> +		intel_device_default_sseu(&RUNTIME_INFO(engine->i915)->sseu);
> +
>   	return 0;
>   
>   err_hwsp:
> diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
> index 6c6d8a9aca94..60cd1299a8f2 100644
> --- a/drivers/gpu/drm/i915/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/intel_engine_types.h
> @@ -17,6 +17,7 @@
>   #include "i915_priolist_types.h"
>   #include "i915_selftest.h"
>   #include "i915_timeline_types.h"
> +#include "intel_sseu.h"
>   #include "intel_workarounds_types.h"
>   
>   #include "i915_gem_batch_pool.h"
> @@ -273,6 +274,8 @@ struct intel_engine_cs {
>   	u32 context_size;
>   	u32 mmio_base;
>   
> +	struct intel_sseu sseu;
> +
>   	struct intel_ring *buffer;
>   
>   	struct i915_timeline timeline;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6931dbb2888c..fac3968d7f1b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1321,7 +1321,7 @@ __execlists_update_reg_state(struct intel_context *ce,
>   	/* RPCS */
>   	if (engine->class == RENDER_CLASS)
>   		regs[CTX_R_PWR_CLK_STATE + 1] =
> -			gen8_make_rpcs(engine->i915, &ce->sseu);
> +			intel_sseu_make_rpcs(engine->i915, &ce->sseu);
>   }
>   
>   static int
> @@ -2501,138 +2501,6 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine)
>   	return logical_ring_init(engine);
>   }
>   
> -u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu)
> -{
> -	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
> -	bool subslice_pg = sseu->has_subslice_pg;
> -	struct intel_sseu ctx_sseu;
> -	u8 slices, subslices;
> -	u32 rpcs = 0;
> -
> -	/*
> -	 * No explicit RPCS request is needed to ensure full
> -	 * slice/subslice/EU enablement prior to Gen9.
> -	*/
> -	if (INTEL_GEN(i915) < 9)
> -		return 0;
> -
> -	/*
> -	 * If i915/perf is active, we want a stable powergating configuration
> -	 * on the system.
> -	 *
> -	 * We could choose full enablement, but on ICL we know there are use
> -	 * cases which disable slices for functional, apart for performance
> -	 * reasons. So in this case we select a known stable subset.
> -	 */
> -	if (!i915->perf.oa.exclusive_stream) {
> -		ctx_sseu = *req_sseu;
> -	} else {
> -		ctx_sseu = intel_device_default_sseu(i915);
> -
> -		if (IS_GEN(i915, 11)) {
> -			/*
> -			 * We only need subslice count so it doesn't matter
> -			 * which ones we select - just turn off low bits in the
> -			 * amount of half of all available subslices per slice.
> -			 */
> -			ctx_sseu.subslice_mask =
> -				~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2));
> -			ctx_sseu.slice_mask = 0x1;
> -		}
> -	}
> -
> -	slices = hweight8(ctx_sseu.slice_mask);
> -	subslices = hweight8(ctx_sseu.subslice_mask);
> -
> -	/*
> -	 * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits
> -	 * wide and Icelake has up to eight subslices, specfial programming is
> -	 * needed in order to correctly enable all subslices.
> -	 *
> -	 * According to documentation software must consider the configuration
> -	 * as 2x4x8 and hardware will translate this to 1x8x8.
> -	 *
> -	 * Furthemore, even though SScount is three bits, maximum documented
> -	 * value for it is four. From this some rules/restrictions follow:
> -	 *
> -	 * 1.
> -	 * If enabled subslice count is greater than four, two whole slices must
> -	 * be enabled instead.
> -	 *
> -	 * 2.
> -	 * When more than one slice is enabled, hardware ignores the subslice
> -	 * count altogether.
> -	 *
> -	 * From these restrictions it follows that it is not possible to enable
> -	 * a count of subslices between the SScount maximum of four restriction,
> -	 * and the maximum available number on a particular SKU. Either all
> -	 * subslices are enabled, or a count between one and four on the first
> -	 * slice.
> -	 */
> -	if (IS_GEN(i915, 11) &&
> -	    slices == 1 &&
> -	    subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {
> -		GEM_BUG_ON(subslices & 1);
> -
> -		subslice_pg = false;
> -		slices *= 2;
> -	}
> -
> -	/*
> -	 * Starting in Gen9, render power gating can leave
> -	 * slice/subslice/EU in a partially enabled state. We
> -	 * must make an explicit request through RPCS for full
> -	 * enablement.
> -	*/
> -	if (sseu->has_slice_pg) {
> -		u32 mask, val = slices;
> -
> -		if (INTEL_GEN(i915) >= 11) {
> -			mask = GEN11_RPCS_S_CNT_MASK;
> -			val <<= GEN11_RPCS_S_CNT_SHIFT;
> -		} else {
> -			mask = GEN8_RPCS_S_CNT_MASK;
> -			val <<= GEN8_RPCS_S_CNT_SHIFT;
> -		}
> -
> -		GEM_BUG_ON(val & ~mask);
> -		val &= mask;
> -
> -		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE | val;
> -	}
> -
> -	if (subslice_pg) {
> -		u32 val = subslices;
> -
> -		val <<= GEN8_RPCS_SS_CNT_SHIFT;
> -
> -		GEM_BUG_ON(val & ~GEN8_RPCS_SS_CNT_MASK);
> -		val &= GEN8_RPCS_SS_CNT_MASK;
> -
> -		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val;
> -	}
> -
> -	if (sseu->has_eu_pg) {
> -		u32 val;
> -
> -		val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT;
> -		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK);
> -		val &= GEN8_RPCS_EU_MIN_MASK;
> -
> -		rpcs |= val;
> -
> -		val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT;
> -		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK);
> -		val &= GEN8_RPCS_EU_MAX_MASK;
> -
> -		rpcs |= val;
> -
> -		rpcs |= GEN8_RPCS_ENABLE;
> -	}
> -
> -	return rpcs;
> -}
> -
>   static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
>   {
>   	u32 indirect_ctx_offset;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 92642ab91472..72488e293d77 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -112,6 +112,4 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   							const char *prefix),
>   				   unsigned int max);
>   
> -u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
> -
>   #endif /* _INTEL_LRC_H_ */
> diff --git a/drivers/gpu/drm/i915/intel_sseu.c b/drivers/gpu/drm/i915/intel_sseu.c
> new file mode 100644
> index 000000000000..cfc80813f662
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_sseu.c
> @@ -0,0 +1,142 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_lrc_reg.h"
> +#include "intel_sseu.h"
> +
> +u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> +			 const struct intel_sseu *req_sseu)
> +{
> +	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
> +	bool subslice_pg = sseu->has_subslice_pg;
> +	struct intel_sseu ctx_sseu;
> +	u8 slices, subslices;
> +	u32 rpcs = 0;
> +
> +	/*
> +	 * No explicit RPCS request is needed to ensure full
> +	 * slice/subslice/EU enablement prior to Gen9.
> +	 */
> +	if (INTEL_GEN(i915) < 9)
> +		return 0;
> +
> +	/*
> +	 * If i915/perf is active, we want a stable powergating configuration
> +	 * on the system.
> +	 *
> +	 * We could choose full enablement, but on ICL we know there are use
> +	 * cases which disable slices for functional reasons, apart from
> +	 * performance ones. So in this case we select a known stable subset.
> +	 */
> +	if (!i915->perf.oa.exclusive_stream) {
> +		ctx_sseu = *req_sseu;
> +	} else {
> +		ctx_sseu = intel_device_default_sseu(sseu);
> +
> +		if (IS_GEN(i915, 11)) {
> +			/*
> +			 * We only need subslice count so it doesn't matter
> +			 * which ones we select - just turn off low bits in the
> +			 * amount of half of all available subslices per slice.
> +			 */
> +			ctx_sseu.subslice_mask =
> +				~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2));
> +			ctx_sseu.slice_mask = 0x1;
> +		}
> +	}
> +
> +	slices = hweight8(ctx_sseu.slice_mask);
> +	subslices = hweight8(ctx_sseu.subslice_mask);
> +
> +	/*
> +	 * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits
> +	 * wide and Icelake has up to eight subslices, special programming is
> +	 * needed in order to correctly enable all subslices.
> +	 *
> +	 * According to documentation software must consider the configuration
> +	 * as 2x4x8 and hardware will translate this to 1x8x8.
> +	 *
> +	 * Furthermore, even though SScount is three bits, maximum documented
> +	 * value for it is four. From this some rules/restrictions follow:
> +	 *
> +	 * 1.
> +	 * If enabled subslice count is greater than four, two whole slices must
> +	 * be enabled instead.
> +	 *
> +	 * 2.
> +	 * When more than one slice is enabled, hardware ignores the subslice
> +	 * count altogether.
> +	 *
> +	 * From these restrictions it follows that it is not possible to enable
> +	 * a count of subslices between the SScount maximum of four restriction,
> +	 * and the maximum available number on a particular SKU. Either all
> +	 * subslices are enabled, or a count between one and four on the first
> +	 * slice.
> +	 */
> +	if (IS_GEN(i915, 11) &&
> +	    slices == 1 &&
> +	    subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {
> +		GEM_BUG_ON(subslices & 1);
> +
> +		subslice_pg = false;
> +		slices *= 2;
> +	}
> +
> +	/*
> +	 * Starting in Gen9, render power gating can leave
> +	 * slice/subslice/EU in a partially enabled state. We
> +	 * must make an explicit request through RPCS for full
> +	 * enablement.
> +	 */
> +	if (sseu->has_slice_pg) {
> +		u32 mask, val = slices;
> +
> +		if (INTEL_GEN(i915) >= 11) {
> +			mask = GEN11_RPCS_S_CNT_MASK;
> +			val <<= GEN11_RPCS_S_CNT_SHIFT;
> +		} else {
> +			mask = GEN8_RPCS_S_CNT_MASK;
> +			val <<= GEN8_RPCS_S_CNT_SHIFT;
> +		}
> +
> +		GEM_BUG_ON(val & ~mask);
> +		val &= mask;
> +
> +		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE | val;
> +	}
> +
> +	if (subslice_pg) {
> +		u32 val = subslices;
> +
> +		val <<= GEN8_RPCS_SS_CNT_SHIFT;
> +
> +		GEM_BUG_ON(val & ~GEN8_RPCS_SS_CNT_MASK);
> +		val &= GEN8_RPCS_SS_CNT_MASK;
> +
> +		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val;
> +	}
> +
> +	if (sseu->has_eu_pg) {
> +		u32 val;
> +
> +		val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT;
> +		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK);
> +		val &= GEN8_RPCS_EU_MIN_MASK;
> +
> +		rpcs |= val;
> +
> +		val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT;
> +		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK);
> +		val &= GEN8_RPCS_EU_MAX_MASK;
> +
> +		rpcs |= val;
> +
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	return rpcs;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_sseu.h b/drivers/gpu/drm/i915/intel_sseu.h
> new file mode 100644
> index 000000000000..bf6fa019fd00
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_sseu.h
> @@ -0,0 +1,67 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __INTEL_SSEU_H__
> +#define __INTEL_SSEU_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_private;
> +
> +#define GEN_MAX_SLICES		(6) /* CNL upper bound */
> +#define GEN_MAX_SUBSLICES	(8) /* ICL upper bound */
> +
> +struct sseu_dev_info {
> +	u8 slice_mask;
> +	u8 subslice_mask[GEN_MAX_SLICES];
> +	u16 eu_total;
> +	u8 eu_per_subslice;
> +	u8 min_eu_in_pool;
> +	/* For each slice, which subslice(s) has(have) 7 EUs (bitfield)? */
> +	u8 subslice_7eu[3];
> +	u8 has_slice_pg:1;
> +	u8 has_subslice_pg:1;
> +	u8 has_eu_pg:1;
> +
> +	/* Topology fields */
> +	u8 max_slices;
> +	u8 max_subslices;
> +	u8 max_eus_per_subslice;
> +
> +	/* We don't have more than 8 eus per subslice at the moment and as we
> +	 * store eus enabled using bits, no need to multiply by eus per
> +	 * subslice.
> +	 */
> +	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
> +};
> +
> +/*
> + * Powergating configuration for a particular (context,engine).
> + */
> +struct intel_sseu {
> +	u8 slice_mask;
> +	u8 subslice_mask;
> +	u8 min_eus_per_subslice;
> +	u8 max_eus_per_subslice;
> +};
> +
> +static inline struct intel_sseu
> +intel_device_default_sseu(const struct sseu_dev_info *sseu)
> +{

I think the function parameter change is dangerous. If the caller can 
pass in any pointer then it is slightly less 
"intel_*device_default*_sseu". It is more of a convert-and-copy from the 
device sseu to intel_sseu. Something like intel_convert_device_sseu.

Although if the function just kept taking i915 then the dilemma would 
not even be there.
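
If it does keep the new parameter, naming it for what it does would at 
least make the copy semantics explicit, e.g. (name illustrative only, 
body unchanged from the patch):

static inline struct intel_sseu
intel_convert_device_sseu(const struct sseu_dev_info *sseu)
{
	struct intel_sseu value = {
		.slice_mask = sseu->slice_mask,
		.subslice_mask = sseu->subslice_mask[0],
		.min_eus_per_subslice = sseu->max_eus_per_subslice,
		.max_eus_per_subslice = sseu->max_eus_per_subslice,
	};

	return value;
}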

> +	struct intel_sseu value = {
> +		.slice_mask = sseu->slice_mask,
> +		.subslice_mask = sseu->subslice_mask[0],
> +		.min_eus_per_subslice = sseu->max_eus_per_subslice,
> +		.max_eus_per_subslice = sseu->max_eus_per_subslice,
> +	};
> +
> +	return value;
> +}
> +
> +u32 intel_sseu_make_rpcs(struct drm_i915_private *i915,
> +			 const struct intel_sseu *req_sseu);
> +
> +#endif /* __INTEL_SSEU_H__ */
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 4e1b6efc6b22..e1cb22f03e8e 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -962,8 +962,7 @@ __sseu_finish(struct drm_i915_private *i915,
>   	      unsigned int expected,
>   	      struct igt_spinner *spin)
>   {
> -	unsigned int slices =
> -		hweight32(intel_device_default_sseu(i915).slice_mask);
> +	unsigned int slices = hweight32(engine->sseu.slice_mask);
>   	u32 rpcs = 0;
>   	int ret = 0;
>   
> @@ -1047,8 +1046,8 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   	       const char *name,
>   	       unsigned int flags)
>   {
> -	struct intel_sseu default_sseu = intel_device_default_sseu(i915);
>   	struct intel_engine_cs *engine = i915->engine[RCS0];
> +	struct intel_sseu default_sseu = engine->sseu;
>   	struct drm_i915_gem_object *obj;
>   	struct i915_gem_context *ctx;
>   	struct intel_sseu pg_sseu;
> 

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 09/29] drm/i915: Pull the GEM powermanagement coupling into its own file
  2019-04-08  9:17 ` [PATCH 09/29] drm/i915: Pull the GEM powermanagement coupling into its own file Chris Wilson
@ 2019-04-08 14:56   ` Tvrtko Ursulin
  2019-04-08 16:00     ` Chris Wilson
  2019-04-10  9:57   ` Tvrtko Ursulin
  1 sibling, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 14:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Split out the powermanagement portion (GT wakeref, suspend/resume) of
> GEM from i915_gem.c into its own file.

Is this the same patch I gave r-b on before? Judging from the title and 
commit message I'd say yes, but I know it is not, don't worry.

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 +
>   drivers/gpu/drm/i915/Makefile.header-test     |   1 +
>   drivers/gpu/drm/i915/i915_debugfs.c           |   4 +-
>   drivers/gpu/drm/i915/i915_drv.h               |  12 +-
>   drivers/gpu/drm/i915/i915_gem.c               | 363 +----------------
>   drivers/gpu/drm/i915/i915_gem_pm.c            | 365 ++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_pm.h            |  28 ++
>   .../gpu/drm/i915/selftests/i915_gem_context.c |   2 +-
>   .../gpu/drm/i915/selftests/i915_gem_object.c  |   8 +-
>   .../gpu/drm/i915/selftests/mock_gem_device.c  |  10 +-
>   10 files changed, 418 insertions(+), 376 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.c
>   create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 233bad5e361f..858642c7bc40 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -97,6 +97,7 @@ i915-y += \
>   	  i915_gem_internal.o \
>   	  i915_gem.o \
>   	  i915_gem_object.o \
> +	  i915_gem_pm.o \
>   	  i915_gem_render_state.o \
>   	  i915_gem_shrinker.o \
>   	  i915_gem_stolen.o \
> diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
> index e6b3e7588860..702e3a7ade4c 100644
> --- a/drivers/gpu/drm/i915/Makefile.header-test
> +++ b/drivers/gpu/drm/i915/Makefile.header-test
> @@ -5,6 +5,7 @@
>   header_test := \
>   	i915_active_types.h \
>   	i915_gem_context_types.h \
> +	i915_gem_pm.h \
>   	i915_priolist_types.h \
>   	i915_scheduler_types.h \
>   	i915_timeline_types.h \
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index eac4095278b2..e3eca88614b2 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3941,8 +3941,8 @@ i915_drop_caches_set(void *data, u64 val)
>   	if (val & DROP_IDLE) {
>   		do {
>   			if (READ_ONCE(i915->gt.active_requests))
> -				flush_delayed_work(&i915->gt.retire_work);
> -			drain_delayed_work(&i915->gt.idle_work);
> +				flush_delayed_work(&i915->gem.retire_work);
> +			drain_delayed_work(&i915->gem.idle_work);
>   		} while (READ_ONCE(i915->gt.awake));
>   	}
>   
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 62a7e91acd7f..cbae9be052e0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2020,6 +2020,12 @@ struct drm_i915_private {
>   		 */
>   		intel_wakeref_t awake;
>   
> +		ktime_t last_init_time;
> +
> +		struct i915_vma *scratch;
> +	} gt;
> +
> +	struct {
>   		/**
>   		 * We leave the user IRQ off as much as possible,
>   		 * but this means that requests will finish and never
> @@ -2037,11 +2043,7 @@ struct drm_i915_private {
>   		 * off the idle_work.
>   		 */
>   		struct delayed_work idle_work;
> -
> -		ktime_t last_init_time;
> -
> -		struct i915_vma *scratch;
> -	} gt;
> +	} gem;
>   
>   	/* For i945gm vblank irq vs. C3 workaround */
>   	struct {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a8b0aa05288b..c56633e87c32 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -46,7 +46,7 @@
>   #include "i915_drv.h"
>   #include "i915_gem_clflush.h"
>   #include "i915_gemfs.h"
> -#include "i915_globals.h"
> +#include "i915_gem_pm.h"
>   #include "i915_trace.h"
>   #include "i915_vgpu.h"
>   
> @@ -103,105 +103,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
>   	spin_unlock(&dev_priv->mm.object_stat_lock);
>   }
>   
> -static void __i915_gem_park(struct drm_i915_private *i915)
> -{
> -	intel_wakeref_t wakeref;
> -
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> -
> -	if (!i915->gt.awake)
> -		return;
> -
> -	/*
> -	 * Be paranoid and flush a concurrent interrupt to make sure
> -	 * we don't reactivate any irq tasklets after parking.
> -	 *
> -	 * FIXME: Note that even though we have waited for execlists to be idle,
> -	 * there may still be an in-flight interrupt even though the CSB
> -	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> -	 * is completed before we continue, but it doesn't prevent the HW from
> -	 * raising a spurious interrupt later. To complete the shield we should
> -	 * coordinate disabling the CS irq with flushing the interrupts.
> -	 */
> -	synchronize_irq(i915->drm.irq);
> -
> -	intel_engines_park(i915);
> -	i915_timelines_park(i915);
> -
> -	i915_pmu_gt_parked(i915);
> -	i915_vma_parked(i915);
> -
> -	wakeref = fetch_and_zero(&i915->gt.awake);
> -	GEM_BUG_ON(!wakeref);
> -
> -	if (INTEL_GEN(i915) >= 6)
> -		gen6_rps_idle(i915);
> -
> -	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> -
> -	i915_globals_park();
> -}
> -
> -void i915_gem_park(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -
> -	if (!i915->gt.awake)
> -		return;
> -
> -	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> -	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> -}
> -
> -void i915_gem_unpark(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(!i915->gt.active_requests);
> -	assert_rpm_wakelock_held(i915);
> -
> -	if (i915->gt.awake)
> -		return;
> -
> -	/*
> -	 * It seems that the DMC likes to transition between the DC states a lot
> -	 * when there are no connected displays (no active power domains) during
> -	 * command submission.
> -	 *
> -	 * This activity has negative impact on the performance of the chip with
> -	 * huge latencies observed in the interrupt handler and elsewhere.
> -	 *
> -	 * Work around it by grabbing a GT IRQ power domain whilst there is any
> -	 * GT activity, preventing any DC state transitions.
> -	 */
> -	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> -	GEM_BUG_ON(!i915->gt.awake);
> -
> -	i915_globals_unpark();
> -
> -	intel_enable_gt_powersave(i915);
> -	i915_update_gfx_val(i915);
> -	if (INTEL_GEN(i915) >= 6)
> -		gen6_rps_busy(i915);
> -	i915_pmu_gt_unparked(i915);
> -
> -	intel_engines_unpark(i915);
> -
> -	i915_queue_hangcheck(i915);
> -
> -	queue_delayed_work(i915->wq,
> -			   &i915->gt.retire_work,
> -			   round_jiffies_up_relative(HZ));
> -}
> -
>   int
>   i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>   			    struct drm_file *file)
> @@ -2088,7 +1989,7 @@ static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
>   		if (!err)
>   			break;
>   
> -	} while (flush_delayed_work(&dev_priv->gt.retire_work));
> +	} while (flush_delayed_work(&dev_priv->gem.retire_work));
>   
>   	return err;
>   }
> @@ -2871,132 +2772,6 @@ i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
>   	return 0;
>   }
>   
> -static void
> -i915_gem_retire_work_handler(struct work_struct *work)
> -{
> -	struct drm_i915_private *dev_priv =
> -		container_of(work, typeof(*dev_priv), gt.retire_work.work);
> -	struct drm_device *dev = &dev_priv->drm;
> -
> -	/* Come back later if the device is busy... */
> -	if (mutex_trylock(&dev->struct_mutex)) {
> -		i915_retire_requests(dev_priv);
> -		mutex_unlock(&dev->struct_mutex);
> -	}
> -
> -	/*
> -	 * Keep the retire handler running until we are finally idle.
> -	 * We do not need to do this test under locking as in the worst-case
> -	 * we queue the retire worker once too often.
> -	 */
> -	if (READ_ONCE(dev_priv->gt.awake))
> -		queue_delayed_work(dev_priv->wq,
> -				   &dev_priv->gt.retire_work,
> -				   round_jiffies_up_relative(HZ));
> -}
> -
> -static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> -					  unsigned long mask)
> -{
> -	bool result = true;
> -
> -	/*
> -	 * Even if we fail to switch, give whatever is running a small chance
> -	 * to save itself before we report the failure. Yes, this may be a
> -	 * false positive due to e.g. ENOMEM, caveat emptor!
> -	 */
> -	if (i915_gem_switch_to_kernel_context(i915, mask))
> -		result = false;
> -
> -	if (i915_gem_wait_for_idle(i915,
> -				   I915_WAIT_LOCKED |
> -				   I915_WAIT_FOR_IDLE_BOOST,
> -				   I915_GEM_IDLE_TIMEOUT))
> -		result = false;
> -
> -	if (!result) {
> -		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> -			dev_err(i915->drm.dev,
> -				"Failed to idle engines, declaring wedged!\n");
> -			GEM_TRACE_DUMP();
> -		}
> -
> -		/* Forcibly cancel outstanding work and leave the gpu quiet. */
> -		i915_gem_set_wedged(i915);
> -	}
> -
> -	i915_retire_requests(i915); /* ensure we flush after wedging */
> -	return result;
> -}
> -
> -static bool load_power_context(struct drm_i915_private *i915)
> -{
> -	/* Force loading the kernel context on all engines */
> -	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> -		return false;
> -
> -	/*
> -	 * Immediately park the GPU so that we enable powersaving and
> -	 * treat it as idle. The next time we issue a request, we will
> -	 * unpark and start using the engine->pinned_default_state, otherwise
> -	 * it is in limbo and an early reset may fail.
> -	 */
> -	__i915_gem_park(i915);
> -
> -	return true;
> -}
> -
> -static void
> -i915_gem_idle_work_handler(struct work_struct *work)
> -{
> -	struct drm_i915_private *i915 =
> -		container_of(work, typeof(*i915), gt.idle_work.work);
> -	bool rearm_hangcheck;
> -
> -	if (!READ_ONCE(i915->gt.awake))
> -		return;
> -
> -	if (READ_ONCE(i915->gt.active_requests))
> -		return;
> -
> -	rearm_hangcheck =
> -		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> -
> -	if (!mutex_trylock(&i915->drm.struct_mutex)) {
> -		/* Currently busy, come back later */
> -		mod_delayed_work(i915->wq,
> -				 &i915->gt.idle_work,
> -				 msecs_to_jiffies(50));
> -		goto out_rearm;
> -	}
> -
> -	/*
> -	 * Flush out the last user context, leaving only the pinned
> -	 * kernel context resident. Should anything unfortunate happen
> -	 * while we are idle (such as the GPU being power cycled), no users
> -	 * will be harmed.
> -	 */
> -	if (!work_pending(&i915->gt.idle_work.work) &&
> -	    !i915->gt.active_requests) {
> -		++i915->gt.active_requests; /* don't requeue idle */
> -
> -		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -		if (!--i915->gt.active_requests) {
> -			__i915_gem_park(i915);
> -			rearm_hangcheck = false;
> -		}
> -	}
> -
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
> -out_rearm:
> -	if (rearm_hangcheck) {
> -		GEM_BUG_ON(!i915->gt.awake);
> -		i915_queue_hangcheck(i915);
> -	}
> -}
> -
>   void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
>   {
>   	struct drm_i915_private *i915 = to_i915(gem->dev);
> @@ -4412,133 +4187,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
>   	mutex_unlock(&i915->drm.struct_mutex);
>   }
>   
> -void i915_gem_suspend(struct drm_i915_private *i915)
> -{
> -	intel_wakeref_t wakeref;
> -
> -	GEM_TRACE("\n");
> -
> -	wakeref = intel_runtime_pm_get(i915);
> -
> -	flush_workqueue(i915->wq);
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -
> -	/*
> -	 * We have to flush all the executing contexts to main memory so
> -	 * that they can saved in the hibernation image. To ensure the last
> -	 * context image is coherent, we have to switch away from it. That
> -	 * leaves the i915->kernel_context still active when
> -	 * we actually suspend, and its image in memory may not match the GPU
> -	 * state. Fortunately, the kernel_context is disposable and we do
> -	 * not rely on its state.
> -	 */
> -	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -	mutex_unlock(&i915->drm.struct_mutex);
> -	i915_reset_flush(i915);
> -
> -	drain_delayed_work(&i915->gt.retire_work);
> -
> -	/*
> -	 * As the idle_work is rearming if it detects a race, play safe and
> -	 * repeat the flush until it is definitely idle.
> -	 */
> -	drain_delayed_work(&i915->gt.idle_work);
> -
> -	/*
> -	 * Assert that we successfully flushed all the work and
> -	 * reset the GPU back to its idle, low power state.
> -	 */
> -	GEM_BUG_ON(i915->gt.awake);
> -
> -	intel_uc_suspend(i915);
> -
> -	intel_runtime_pm_put(i915, wakeref);
> -}
> -
> -void i915_gem_suspend_late(struct drm_i915_private *i915)
> -{
> -	struct drm_i915_gem_object *obj;
> -	struct list_head *phases[] = {
> -		&i915->mm.unbound_list,
> -		&i915->mm.bound_list,
> -		NULL
> -	}, **phase;
> -
> -	/*
> -	 * Neither the BIOS, ourselves or any other kernel
> -	 * expects the system to be in execlists mode on startup,
> -	 * so we need to reset the GPU back to legacy mode. And the only
> -	 * known way to disable logical contexts is through a GPU reset.
> -	 *
> -	 * So in order to leave the system in a known default configuration,
> -	 * always reset the GPU upon unload and suspend. Afterwards we then
> -	 * clean up the GEM state tracking, flushing off the requests and
> -	 * leaving the system in a known idle state.
> -	 *
> -	 * Note that is of the upmost importance that the GPU is idle and
> -	 * all stray writes are flushed *before* we dismantle the backing
> -	 * storage for the pinned objects.
> -	 *
> -	 * However, since we are uncertain that resetting the GPU on older
> -	 * machines is a good idea, we don't - just in case it leaves the
> -	 * machine in an unusable condition.
> -	 */
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -	for (phase = phases; *phase; phase++) {
> -		list_for_each_entry(obj, *phase, mm.link)
> -			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> -	}
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
> -	intel_uc_sanitize(i915);
> -	i915_gem_sanitize(i915);
> -}
> -
> -void i915_gem_resume(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	WARN_ON(i915->gt.awake);
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> -
> -	i915_gem_restore_gtt_mappings(i915);
> -	i915_gem_restore_fences(i915);
> -
> -	/*
> -	 * As we didn't flush the kernel context before suspend, we cannot
> -	 * guarantee that the context image is complete. So let's just reset
> -	 * it and start again.
> -	 */
> -	intel_gt_resume(i915);
> -
> -	if (i915_gem_init_hw(i915))
> -		goto err_wedged;
> -
> -	intel_uc_resume(i915);
> -
> -	/* Always reload a context for powersaving. */
> -	if (!load_power_context(i915))
> -		goto err_wedged;
> -
> -out_unlock:
> -	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> -	mutex_unlock(&i915->drm.struct_mutex);
> -	return;
> -
> -err_wedged:
> -	if (!i915_reset_failed(i915)) {
> -		dev_err(i915->drm.dev,
> -			"Failed to re-initialize GPU, declaring it wedged!\n");
> -		i915_gem_set_wedged(i915);
> -	}
> -	goto out_unlock;
> -}
> -
>   void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
>   {
>   	if (INTEL_GEN(dev_priv) < 5 ||
> @@ -4721,7 +4369,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	}
>   
>   	/* Flush the default context image to memory, and enable powersaving. */
> -	if (!load_power_context(i915)) {
> +	if (!i915_gem_load_power_context(i915)) {
>   		err = -EIO;
>   		goto err_active;
>   	}
> @@ -5115,11 +4763,8 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
>   	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
>   
>   	i915_gem_init__mm(dev_priv);
> +	i915_gem_init__pm(dev_priv);
>   
> -	INIT_DELAYED_WORK(&dev_priv->gt.retire_work,
> -			  i915_gem_retire_work_handler);
> -	INIT_DELAYED_WORK(&dev_priv->gt.idle_work,
> -			  i915_gem_idle_work_handler);
>   	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
>   	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
>   	mutex_init(&dev_priv->gpu_error.wedge_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> new file mode 100644
> index 000000000000..bc0663cac7b6
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -0,0 +1,365 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "i915_gem_pm.h"
> +#include "i915_globals.h"
> +#include "intel_pm.h"
> +
> +static void __i915_gem_park(struct drm_i915_private *i915)
> +{
> +	intel_wakeref_t wakeref;
> +
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(i915->gt.active_requests);
> +	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> +
> +	if (!i915->gt.awake)
> +		return;
> +
> +	/*
> +	 * Be paranoid and flush a concurrent interrupt to make sure
> +	 * we don't reactivate any irq tasklets after parking.
> +	 *
> +	 * FIXME: Note that even though we have waited for execlists to be idle,
> +	 * there may still be an in-flight interrupt even though the CSB
> +	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> +	 * is completed before we continue, but it doesn't prevent the HW from
> +	 * raising a spurious interrupt later. To complete the shield we should
> +	 * coordinate disabling the CS irq with flushing the interrupts.
> +	 */
> +	synchronize_irq(i915->drm.irq);
> +
> +	intel_engines_park(i915);
> +	i915_timelines_park(i915);
> +
> +	i915_pmu_gt_parked(i915);
> +	i915_vma_parked(i915);
> +
> +	wakeref = fetch_and_zero(&i915->gt.awake);
> +	GEM_BUG_ON(!wakeref);
> +
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_idle(i915);
> +
> +	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> +
> +	i915_globals_park();
> +}
> +
> +static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> +					  unsigned long mask)
> +{
> +	bool result = true;
> +
> +	/*
> +	 * Even if we fail to switch, give whatever is running a small chance
> +	 * to save itself before we report the failure. Yes, this may be a
> +	 * false positive due to e.g. ENOMEM, caveat emptor!
> +	 */
> +	if (i915_gem_switch_to_kernel_context(i915, mask))
> +		result = false;
> +
> +	if (i915_gem_wait_for_idle(i915,
> +				   I915_WAIT_LOCKED |
> +				   I915_WAIT_FOR_IDLE_BOOST,
> +				   I915_GEM_IDLE_TIMEOUT))
> +		result = false;
> +
> +	if (!result) {
> +		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> +			dev_err(i915->drm.dev,
> +				"Failed to idle engines, declaring wedged!\n");
> +			GEM_TRACE_DUMP();
> +		}
> +
> +		/* Forcibly cancel outstanding work and leave the gpu quiet. */
> +		i915_gem_set_wedged(i915);
> +	}
> +
> +	i915_retire_requests(i915); /* ensure we flush after wedging */
> +	return result;
> +}
> +
> +static void idle_work_handler(struct work_struct *work)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(work, typeof(*i915), gem.idle_work.work);
> +	bool rearm_hangcheck;
> +
> +	if (!READ_ONCE(i915->gt.awake))
> +		return;
> +
> +	if (READ_ONCE(i915->gt.active_requests))
> +		return;
> +
> +	rearm_hangcheck =
> +		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> +
> +	if (!mutex_trylock(&i915->drm.struct_mutex)) {
> +		/* Currently busy, come back later */
> +		mod_delayed_work(i915->wq,
> +				 &i915->gem.idle_work,
> +				 msecs_to_jiffies(50));
> +		goto out_rearm;
> +	}
> +
> +	/*
> +	 * Flush out the last user context, leaving only the pinned
> +	 * kernel context resident. Should anything unfortunate happen
> +	 * while we are idle (such as the GPU being power cycled), no users
> +	 * will be harmed.
> +	 */
> +	if (!work_pending(&i915->gem.idle_work.work) &&
> +	    !i915->gt.active_requests) {
> +		++i915->gt.active_requests; /* don't requeue idle */
> +
> +		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +		if (!--i915->gt.active_requests) {
> +			__i915_gem_park(i915);
> +			rearm_hangcheck = false;
> +		}
> +	}
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +out_rearm:
> +	if (rearm_hangcheck) {
> +		GEM_BUG_ON(!i915->gt.awake);
> +		i915_queue_hangcheck(i915);
> +	}
> +}
> +
> +static void retire_work_handler(struct work_struct *work)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(work, typeof(*i915), gem.retire_work.work);
> +
> +	/* Come back later if the device is busy... */
> +	if (mutex_trylock(&i915->drm.struct_mutex)) {
> +		i915_retire_requests(i915);
> +		mutex_unlock(&i915->drm.struct_mutex);
> +	}
> +
> +	/*
> +	 * Keep the retire handler running until we are finally idle.
> +	 * We do not need to do this test under locking as in the worst-case
> +	 * we queue the retire worker once too often.
> +	 */
> +	if (READ_ONCE(i915->gt.awake))
> +		queue_delayed_work(i915->wq,
> +				   &i915->gem.retire_work,
> +				   round_jiffies_up_relative(HZ));
> +}
> +
> +void i915_gem_park(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(i915->gt.active_requests);
> +
> +	if (!i915->gt.awake)
> +		return;
> +
> +	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> +	mod_delayed_work(i915->wq, &i915->gem.idle_work, msecs_to_jiffies(100));
> +}
> +
> +void i915_gem_unpark(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(!i915->gt.active_requests);
> +	assert_rpm_wakelock_held(i915);
> +
> +	if (i915->gt.awake)
> +		return;
> +
> +	/*
> +	 * It seems that the DMC likes to transition between the DC states a lot
> +	 * when there are no connected displays (no active power domains) during
> +	 * command submission.
> +	 *
> +	 * This activity has negative impact on the performance of the chip with
> +	 * huge latencies observed in the interrupt handler and elsewhere.
> +	 *
> +	 * Work around it by grabbing a GT IRQ power domain whilst there is any
> +	 * GT activity, preventing any DC state transitions.
> +	 */
> +	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +	GEM_BUG_ON(!i915->gt.awake);
> +
> +	i915_globals_unpark();
> +
> +	intel_enable_gt_powersave(i915);
> +	i915_update_gfx_val(i915);
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_busy(i915);
> +	i915_pmu_gt_unparked(i915);
> +
> +	intel_engines_unpark(i915);
> +
> +	i915_queue_hangcheck(i915);
> +
> +	queue_delayed_work(i915->wq,
> +			   &i915->gem.retire_work,
> +			   round_jiffies_up_relative(HZ));
> +}
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915)
> +{
> +	/* Force loading the kernel context on all engines */
> +	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> +		return false;
> +
> +	/*
> +	 * Immediately park the GPU so that we enable powersaving and
> +	 * treat it as idle. The next time we issue a request, we will
> +	 * unpark and start using the engine->pinned_default_state, otherwise
> +	 * it is in limbo and an early reset may fail.
> +	 */
> +	__i915_gem_park(i915);
> +
> +	return true;
> +}
> +
> +void i915_gem_suspend(struct drm_i915_private *i915)
> +{
> +	intel_wakeref_t wakeref;
> +
> +	GEM_TRACE("\n");
> +
> +	wakeref = intel_runtime_pm_get(i915);
> +
> +	flush_workqueue(i915->wq);
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +
> +	/*
> +	 * We have to flush all the executing contexts to main memory so
> +	 * that they can be saved in the hibernation image. To ensure the last
> +	 * context image is coherent, we have to switch away from it. That
> +	 * leaves the i915->kernel_context still active when
> +	 * we actually suspend, and its image in memory may not match the GPU
> +	 * state. Fortunately, the kernel_context is disposable and we do
> +	 * not rely on its state.
> +	 */
> +	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_reset_flush(i915);
> +
> +	drain_delayed_work(&i915->gem.retire_work);
> +
> +	/*
> +	 * As the idle_work is rearming if it detects a race, play safe and
> +	 * repeat the flush until it is definitely idle.
> +	 */
> +	drain_delayed_work(&i915->gem.idle_work);
> +
> +	/*
> +	 * Assert that we successfully flushed all the work and
> +	 * reset the GPU back to its idle, low power state.
> +	 */
> +	GEM_BUG_ON(i915->gt.awake);
> +
> +	intel_uc_suspend(i915);
> +
> +	intel_runtime_pm_put(i915, wakeref);
> +}
> +
> +void i915_gem_suspend_late(struct drm_i915_private *i915)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct list_head *phases[] = {
> +		&i915->mm.unbound_list,
> +		&i915->mm.bound_list,
> +		NULL
> +	}, **phase;
> +
> +	/*
> +	 * Neither the BIOS, ourselves nor any other kernel
> +	 * expects the system to be in execlists mode on startup,
> +	 * so we need to reset the GPU back to legacy mode. And the only
> +	 * known way to disable logical contexts is through a GPU reset.
> +	 *
> +	 * So in order to leave the system in a known default configuration,
> +	 * always reset the GPU upon unload and suspend. Afterwards we then
> +	 * clean up the GEM state tracking, flushing off the requests and
> +	 * leaving the system in a known idle state.
> +	 *
> +	 * Note that it is of the utmost importance that the GPU is idle and
> +	 * all stray writes are flushed *before* we dismantle the backing
> +	 * storage for the pinned objects.
> +	 *
> +	 * However, since we are uncertain that resetting the GPU on older
> +	 * machines is a good idea, we don't - just in case it leaves the
> +	 * machine in an unusable condition.
> +	 */
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	for (phase = phases; *phase; phase++) {
> +		list_for_each_entry(obj, *phase, mm.link)
> +			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> +	}
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +	intel_uc_sanitize(i915);
> +	i915_gem_sanitize(i915);
> +}
> +
> +void i915_gem_resume(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	WARN_ON(i915->gt.awake);
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> +
> +	i915_gem_restore_gtt_mappings(i915);
> +	i915_gem_restore_fences(i915);
> +
> +	/*
> +	 * As we didn't flush the kernel context before suspend, we cannot
> +	 * guarantee that the context image is complete. So let's just reset
> +	 * it and start again.
> +	 */
> +	intel_gt_resume(i915);
> +
> +	if (i915_gem_init_hw(i915))
> +		goto err_wedged;
> +
> +	intel_uc_resume(i915);
> +
> +	/* Always reload a context for powersaving. */
> +	if (!i915_gem_load_power_context(i915))
> +		goto err_wedged;
> +
> +out_unlock:
> +	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	return;
> +
> +err_wedged:
> +	if (!i915_reset_failed(i915)) {
> +		dev_err(i915->drm.dev,
> +			"Failed to re-initialize GPU, declaring it wedged!\n");
> +		i915_gem_set_wedged(i915);
> +	}
> +	goto out_unlock;
> +}
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915)
> +{
> +	INIT_DELAYED_WORK(&i915->gem.idle_work, idle_work_handler);
> +	INIT_DELAYED_WORK(&i915->gem.retire_work, retire_work_handler);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
> new file mode 100644
> index 000000000000..52f65e3f06b5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.h
> @@ -0,0 +1,28 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_PM_H__
> +#define __I915_GEM_PM_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_private;
> +struct work_struct;
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915);
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915);
> +void i915_gem_resume(struct drm_i915_private *i915);
> +
> +void i915_gem_unpark(struct drm_i915_private *i915);
> +void i915_gem_park(struct drm_i915_private *i915);
> +
> +void i915_gem_idle_work_handler(struct work_struct *work);
> +
> +void i915_gem_suspend(struct drm_i915_private *i915);
> +void i915_gem_suspend_late(struct drm_i915_private *i915);
> +
> +#endif /* __I915_GEM_PM_H__ */
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 6f52ca881173..9d646fa1b74e 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -1658,7 +1658,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
>   		/* XXX Bonus points for proving we are the kernel context! */
>   
>   		mutex_unlock(&i915->drm.struct_mutex);
> -		drain_delayed_work(&i915->gt.idle_work);
> +		drain_delayed_work(&i915->gem.idle_work);
>   		mutex_lock(&i915->drm.struct_mutex);
>   	}
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index 971148fbe6f5..12fc53c694a6 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -514,8 +514,8 @@ static void disable_retire_worker(struct drm_i915_private *i915)
>   	}
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	cancel_delayed_work_sync(&i915->gt.retire_work);
> -	cancel_delayed_work_sync(&i915->gt.idle_work);
> +	cancel_delayed_work_sync(&i915->gem.retire_work);
> +	cancel_delayed_work_sync(&i915->gem.idle_work);
>   }
>   
>   static int igt_mmap_offset_exhaustion(void *arg)
> @@ -617,9 +617,9 @@ static int igt_mmap_offset_exhaustion(void *arg)
>   out_park:
>   	mutex_lock(&i915->drm.struct_mutex);
>   	if (--i915->gt.active_requests)
> -		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
> +		queue_delayed_work(i915->wq, &i915->gem.retire_work, 0);
>   	else
> -		queue_delayed_work(i915->wq, &i915->gt.idle_work, 0);
> +		queue_delayed_work(i915->wq, &i915->gem.idle_work, 0);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   	i915_gem_shrinker_register(i915);
>   	return err;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index f444ee5add27..fb677b4019a0 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -59,8 +59,8 @@ static void mock_device_release(struct drm_device *dev)
>   	i915_gem_contexts_lost(i915);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	drain_delayed_work(&i915->gt.retire_work);
> -	drain_delayed_work(&i915->gt.idle_work);
> +	drain_delayed_work(&i915->gem.retire_work);
> +	drain_delayed_work(&i915->gem.idle_work);
>   	i915_gem_drain_workqueue(i915);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> @@ -111,7 +111,7 @@ static void mock_retire_work_handler(struct work_struct *work)
>   static void mock_idle_work_handler(struct work_struct *work)
>   {
>   	struct drm_i915_private *i915 =
> -		container_of(work, typeof(*i915), gt.idle_work.work);
> +		container_of(work, typeof(*i915), gem.idle_work.work);
>   
>   	i915->gt.active_engines = 0;
>   }
> @@ -197,8 +197,8 @@ struct drm_i915_private *mock_gem_device(void)
>   
>   	mock_init_contexts(i915);
>   
> -	INIT_DELAYED_WORK(&i915->gt.retire_work, mock_retire_work_handler);
> -	INIT_DELAYED_WORK(&i915->gt.idle_work, mock_idle_work_handler);
> +	INIT_DELAYED_WORK(&i915->gem.retire_work, mock_retire_work_handler);
> +	INIT_DELAYED_WORK(&i915->gem.idle_work, mock_idle_work_handler);
>   
>   	i915->gt.awake = true;
>   
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 15/29] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context
  2019-04-08  9:17 ` [PATCH 15/29] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
@ 2019-04-08 14:57   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 14:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Circumvent the dance we currently perform to find the preempt_client and
> look up its HW context for this engine, as we know we have already pinned
> the preempt_context on the engine.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_guc_submission.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 4361d052a571..7fbfcb3d63e0 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -568,7 +568,7 @@ static void inject_preempt_context(struct work_struct *work)
>   					     preempt_work[engine->id]);
>   	struct intel_guc_client *client = guc->preempt_client;
>   	struct guc_stage_desc *stage_desc = __get_stage_desc(client);
> -	struct intel_context *ce = intel_context_lookup(client->owner, engine);
> +	struct intel_context *ce = engine->preempt_context;
>   	u32 data[7];
>   
>   	if (!ce->ring->emit) { /* recreate upon load/resume */
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 17/29] drm/i915/selftests: Use the real kernel context for sseu isolation tests
  2019-04-08  9:17 ` [PATCH 17/29] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
@ 2019-04-08 15:00   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 15:00 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Simplify the setup slightly for the sseu selftests to use the actual
> kernel_context.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   .../gpu/drm/i915/selftests/i915_gem_context.c   | 17 ++++-------------
>   1 file changed, 4 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 71d896bbade2..807644ae6877 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -957,7 +957,6 @@ __sseu_finish(struct drm_i915_private *i915,
>   	      const char *name,
>   	      unsigned int flags,
>   	      struct i915_gem_context *ctx,
> -	      struct i915_gem_context *kctx,
>   	      struct intel_engine_cs *engine,
>   	      struct drm_i915_gem_object *obj,
>   	      unsigned int expected,
> @@ -979,7 +978,8 @@ __sseu_finish(struct drm_i915_private *i915,
>   	if (ret)
>   		goto out;
>   
> -	ret = __read_slice_count(i915, kctx, engine, obj, NULL, &rpcs);
> +	ret = __read_slice_count(i915, i915->kernel_context, engine, obj,
> +				 NULL, &rpcs);
>   	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
>   
>   out:
> @@ -1011,22 +1011,17 @@ __sseu_test(struct drm_i915_private *i915,
>   	    struct intel_sseu sseu)
>   {
>   	struct igt_spinner *spin = NULL;
> -	struct i915_gem_context *kctx;
>   	int ret;
>   
> -	kctx = kernel_context(i915);
> -	if (IS_ERR(kctx))
> -		return PTR_ERR(kctx);
> -
>   	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
>   	if (ret)
> -		goto out_context;
> +		return ret;
>   
>   	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
>   	if (ret)
>   		goto out_spin;
>   
> -	ret = __sseu_finish(i915, name, flags, ctx, kctx, engine, obj,
> +	ret = __sseu_finish(i915, name, flags, ctx, engine, obj,
>   			    hweight32(sseu.slice_mask), spin);
>   
>   out_spin:
> @@ -1035,10 +1030,6 @@ __sseu_test(struct drm_i915_private *i915,
>   		igt_spinner_fini(spin);
>   		kfree(spin);
>   	}
> -
> -out_context:
> -	kernel_context_close(kctx);
> -
>   	return ret;
>   }
>   
> 


Looks okay. I can't remember why I thought I needed a separate context. 
Maybe just to avoid using the kernel context for emitting the read 
request etc.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 14/29] drm/i915: Explicitly pin the logical context for execbuf
  2019-04-08  9:17 ` [PATCH 14/29] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
@ 2019-04-08 15:17   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 15:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> In order to separate the reservation phase of building a request from
> its emission phase, we need to pull some of the request alloc activities
> from deep inside i915_request to the surface, GEM_EXECBUFFER.
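
(For context, the calling convention this moves execbuf to is roughly

	ce = intel_context_pin(gem_context, engine);	/* reserve */
	...
	rq = i915_request_create(ce);			/* emit */
	...
	intel_context_unpin(ce);

in place of the single i915_request_alloc(engine, ctx) call; see
eb_pin_context()/eb_unpin_context() in the hunks below.)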

I had some comments and gave a conditional r-b on this one in the 
previous round.

Just preserving one comment basically: please cache eb->i915 in a local 
variable in eb_select_engine().
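
Something along these lines (sketch only, against the hunk below):

	struct drm_i915_private *i915 = eb->i915;
	...
	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(i915, VCS1)) {
		...
		engine = i915->engine[_VCS(bsd_idx)];
	} else {
		engine = i915->engine[user_ring_map[user_ring_id]];
	}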

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 110 +++++++++++++--------
>   drivers/gpu/drm/i915/i915_request.c        |   9 --
>   2 files changed, 70 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3d672c9edb94..35732c2ae17f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -34,8 +34,11 @@
>   #include <drm/drm_syncobj.h>
>   #include <drm/i915_drm.h>
>   
> +#include "gt/intel_gt_pm.h"
> +
>   #include "i915_drv.h"
>   #include "i915_gem_clflush.h"
> +#include "i915_gem_pm.h"
>   #include "i915_trace.h"
>   #include "intel_drv.h"
>   #include "intel_frontbuffer.h"
> @@ -236,7 +239,8 @@ struct i915_execbuffer {
>   	unsigned int *flags;
>   
>   	struct intel_engine_cs *engine; /** engine to queue the request to */
> -	struct i915_gem_context *ctx; /** context for building the request */
> +	struct intel_context *context; /* logical state for the request */
> +	struct i915_gem_context *gem_context; /** caller's context */
>   	struct i915_address_space *vm; /** GTT and vma for the request */
>   
>   	struct i915_request *request; /** our request to build */
> @@ -738,7 +742,7 @@ static int eb_select_context(struct i915_execbuffer *eb)
>   	if (unlikely(!ctx))
>   		return -ENOENT;
>   
> -	eb->ctx = ctx;
> +	eb->gem_context = ctx;
>   	if (ctx->ppgtt) {
>   		eb->vm = &ctx->ppgtt->vm;
>   		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
> @@ -761,7 +765,7 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
>   	 * Completely unscientific finger-in-the-air estimates for suitable
>   	 * maximum user request size (to avoid blocking) and then backoff.
>   	 */
> -	if (intel_ring_update_space(ring) >= PAGE_SIZE)
> +	if (!ring || intel_ring_update_space(ring) >= PAGE_SIZE)
>   		return NULL;
>   
>   	/*
> @@ -784,7 +788,6 @@ static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
>   
>   static int eb_wait_for_ring(const struct i915_execbuffer *eb)
>   {
> -	const struct intel_context *ce;
>   	struct i915_request *rq;
>   	int ret = 0;
>   
> @@ -794,11 +797,7 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
>   	 * keeping all of their resources pinned.
>   	 */
>   
> -	ce = intel_context_lookup(eb->ctx, eb->engine);
> -	if (!ce || !ce->ring) /* first use, assume empty! */
> -		return 0;
> -
> -	rq = __eb_wait_for_ring(ce->ring);
> +	rq = __eb_wait_for_ring(eb->context->ring);
>   	if (rq) {
>   		mutex_unlock(&eb->i915->drm.struct_mutex);
>   
> @@ -817,15 +816,15 @@ static int eb_wait_for_ring(const struct i915_execbuffer *eb)
>   
>   static int eb_lookup_vmas(struct i915_execbuffer *eb)
>   {
> -	struct radix_tree_root *handles_vma = &eb->ctx->handles_vma;
> +	struct radix_tree_root *handles_vma = &eb->gem_context->handles_vma;
>   	struct drm_i915_gem_object *obj;
>   	unsigned int i, batch;
>   	int err;
>   
> -	if (unlikely(i915_gem_context_is_closed(eb->ctx)))
> +	if (unlikely(i915_gem_context_is_closed(eb->gem_context)))
>   		return -ENOENT;
>   
> -	if (unlikely(i915_gem_context_is_banned(eb->ctx)))
> +	if (unlikely(i915_gem_context_is_banned(eb->gem_context)))
>   		return -EIO;
>   
>   	INIT_LIST_HEAD(&eb->relocs);
> @@ -870,8 +869,8 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
>   		if (!vma->open_count++)
>   			i915_vma_reopen(vma);
>   		list_add(&lut->obj_link, &obj->lut_list);
> -		list_add(&lut->ctx_link, &eb->ctx->handles_list);
> -		lut->ctx = eb->ctx;
> +		list_add(&lut->ctx_link, &eb->gem_context->handles_list);
> +		lut->ctx = eb->gem_context;
>   		lut->handle = handle;
>   
>   add_vma:
> @@ -1227,7 +1226,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>   	if (err)
>   		goto err_unmap;
>   
> -	rq = i915_request_alloc(eb->engine, eb->ctx);
> +	rq = i915_request_create(eb->context);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_unpin;
> @@ -2088,8 +2087,41 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
>   	[I915_EXEC_VEBOX]	= VECS0
>   };
>   
> -static struct intel_engine_cs *
> -eb_select_engine(struct drm_i915_private *dev_priv,
> +static int eb_pin_context(struct i915_execbuffer *eb,
> +			  struct intel_engine_cs *engine)
> +{
> +	struct intel_context *ce;
> +	int err;
> +
> +	/*
> +	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> +	 * EIO if the GPU is already wedged.
> +	 */
> +	err = i915_terminally_wedged(eb->i915);
> +	if (err)
> +		return err;
> +
> +	/*
> +	 * Pinning the contexts may generate requests in order to acquire
> +	 * GGTT space, so do this first before we reserve a seqno for
> +	 * ourselves.
> +	 */
> +	ce = intel_context_pin(eb->gem_context, engine);
> +	if (IS_ERR(ce))
> +		return PTR_ERR(ce);
> +
> +	eb->engine = engine;
> +	eb->context = ce;
> +	return 0;
> +}
> +
> +static void eb_unpin_context(struct i915_execbuffer *eb)
> +{
> +	intel_context_unpin(eb->context);
> +}
> +
> +static int
> +eb_select_engine(struct i915_execbuffer *eb,
>   		 struct drm_file *file,
>   		 struct drm_i915_gem_execbuffer2 *args)
>   {
> @@ -2098,21 +2130,21 @@ eb_select_engine(struct drm_i915_private *dev_priv,
>   
>   	if (user_ring_id > I915_USER_RINGS) {
>   		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
> -		return NULL;
> +		return -EINVAL;
>   	}
>   
>   	if ((user_ring_id != I915_EXEC_BSD) &&
>   	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
>   		DRM_DEBUG("execbuf with non bsd ring but with invalid "
>   			  "bsd dispatch flags: %d\n", (int)(args->flags));
> -		return NULL;
> +		return -EINVAL;
>   	}
>   
> -	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(dev_priv, VCS1)) {
> +	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
>   		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
>   
>   		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
> -			bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
> +			bsd_idx = gen8_dispatch_bsd_engine(eb->i915, file);
>   		} else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
>   			   bsd_idx <= I915_EXEC_BSD_RING2) {
>   			bsd_idx >>= I915_EXEC_BSD_SHIFT;
> @@ -2120,20 +2152,20 @@ eb_select_engine(struct drm_i915_private *dev_priv,
>   		} else {
>   			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
>   				  bsd_idx);
> -			return NULL;
> +			return -EINVAL;
>   		}
>   
> -		engine = dev_priv->engine[_VCS(bsd_idx)];
> +		engine = eb->i915->engine[_VCS(bsd_idx)];
>   	} else {
> -		engine = dev_priv->engine[user_ring_map[user_ring_id]];
> +		engine = eb->i915->engine[user_ring_map[user_ring_id]];
>   	}
>   
>   	if (!engine) {
>   		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
> -		return NULL;
> +		return -EINVAL;
>   	}
>   
> -	return engine;
> +	return eb_pin_context(eb, engine);
>   }
>   
>   static void
> @@ -2275,7 +2307,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	struct i915_execbuffer eb;
>   	struct dma_fence *in_fence = NULL;
>   	struct sync_file *out_fence = NULL;
> -	intel_wakeref_t wakeref;
>   	int out_fence_fd = -1;
>   	int err;
>   
> @@ -2335,12 +2366,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	if (unlikely(err))
>   		goto err_destroy;
>   
> -	eb.engine = eb_select_engine(eb.i915, file, args);
> -	if (!eb.engine) {
> -		err = -EINVAL;
> -		goto err_engine;
> -	}
> -
>   	/*
>   	 * Take a local wakeref for preparing to dispatch the execbuf as
>   	 * we expect to access the hardware fairly frequently in the
> @@ -2348,16 +2373,20 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	 * wakeref that we hold until the GPU has been idle for at least
>   	 * 100ms.
>   	 */
> -	wakeref = intel_runtime_pm_get(eb.i915);
> +	intel_gt_pm_get(eb.i915);
>   
>   	err = i915_mutex_lock_interruptible(dev);
>   	if (err)
>   		goto err_rpm;
>   
> -	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
> +	err = eb_select_engine(&eb, file, args);
>   	if (unlikely(err))
>   		goto err_unlock;
>   
> +	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
> +	if (unlikely(err))
> +		goto err_engine;
> +
>   	err = eb_relocate(&eb);
>   	if (err) {
>   		/*
> @@ -2441,7 +2470,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	GEM_BUG_ON(eb.reloc_cache.rq);
>   
>   	/* Allocate a request for this batch buffer nice and early. */
> -	eb.request = i915_request_alloc(eb.engine, eb.ctx);
> +	eb.request = i915_request_create(eb.context);
>   	if (IS_ERR(eb.request)) {
>   		err = PTR_ERR(eb.request);
>   		goto err_batch_unpin;
> @@ -2479,8 +2508,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	trace_i915_request_queue(eb.request, eb.batch_flags);
>   	err = eb_submit(&eb);
>   err_request:
> -	i915_request_add(eb.request);
>   	add_to_client(eb.request, file);
> +	i915_request_add(eb.request);
>   
>   	if (fences)
>   		signal_fence_array(&eb, fences);
> @@ -2502,12 +2531,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   err_vma:
>   	if (eb.exec)
>   		eb_release_vmas(&eb);
> +err_engine:
> +	eb_unpin_context(&eb);
>   err_unlock:
>   	mutex_unlock(&dev->struct_mutex);
>   err_rpm:
> -	intel_runtime_pm_put(eb.i915, wakeref);
> -err_engine:
> -	i915_gem_context_put(eb.ctx);
> +	intel_gt_pm_put(eb.i915);
> +	i915_gem_context_put(eb.gem_context);
>   err_destroy:
>   	eb_destroy(&eb);
>   err_out_fence:
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7b973b3c6ecf..7932d1209247 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -753,7 +753,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	struct drm_i915_private *i915 = engine->i915;
>   	struct intel_context *ce;
>   	struct i915_request *rq;
> -	int ret;
>   
>   	/*
>   	 * Preempt contexts are reserved for exclusive use to inject a
> @@ -762,14 +761,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 */
>   	GEM_BUG_ON(ctx == i915->preempt_context);
>   
> -	/*
> -	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> -	 * EIO if the GPU is already wedged.
> -	 */
> -	ret = i915_terminally_wedged(i915);
> -	if (ret)
> -		return ERR_PTR(ret);
> -
>   	/*
>   	 * Pinning the contexts may generate requests in order to acquire
>   	 * GGTT space, so do this first before we reserve a seqno for
> 
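
For readers following along, the net effect on the execbuf setup path
is roughly the sketch below (hypothetical wrapper name; error labels
simplified from the diff above). Engine selection now also pins the
context, so it moves under struct_mutex and gains an unwind step:

static int example_execbuf_setup(struct i915_execbuffer *eb,
				 struct drm_file *file,
				 struct drm_i915_gem_execbuffer2 *args)
{
	int err;

	/* Resolves the engine and pins eb->context on success. */
	err = eb_select_engine(eb, file, args);
	if (err)
		return err;

	/* May temporarily drop struct_mutex while waiting. */
	err = eb_wait_for_ring(eb);
	if (err)
		goto err_engine;

	return 0;

err_engine:
	eb_unpin_context(eb);
	return err;
}

Two related details: the terminally-wedged check moves out of
i915_request_alloc() into eb_pin_context(), preserving the ABI promise
that execbuf reports EIO on a wedged GPU, and add_to_client() now runs
before i915_request_add(), presumably so the client bookkeeping is in
place before the request can be submitted and retired.
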
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 05/29] drm/i915: Store the default sseu setup on the engine
  2019-04-08 14:54   ` Tvrtko Ursulin
@ 2019-04-08 15:57     ` Chris Wilson
  2019-04-08 16:04       ` Tvrtko Ursulin
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-08 15:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-08 15:54:01)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > As we push for better compartmentalisation, it is more convenient to
> > copy the default sseu configuration from the engine into the derived
> > logical context, than it is to dig it out from i915->runtime_info.
> 
> Why is it more convenient?

Read on in the series.

> > +static inline struct intel_sseu
> > +intel_device_default_sseu(const struct sseu_dev_info *sseu)
> > +{
> 
> I think the function parameter change is dangerous. If the caller can
> pass in any pointer then it is slightly less "intel_*device_default*_sseu".
> It is more a convert-and-copy from the device sseu to intel_sseu, like
> intel_convert_device_sseu.
> 
> Although if the function just kept taking i915 then the dilemma would 
> not even be there.

What i915? How about if it took the parameter that described the device
and converted it into the internal format?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 09/29] drm/i915: Pull the GEM powermanagement coupling into its own file
  2019-04-08 14:56   ` Tvrtko Ursulin
@ 2019-04-08 16:00     ` Chris Wilson
  0 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-08 16:00 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-08 15:56:39)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > Split out the powermanagement portion (GT wakeref, suspend/resume) of
> > GEM from i915_gem.c into its own file.
> 
> This is the same patch I gave r-b on before? Judging from the title and 
> commit message I'd say yes. But I know it is not, don't worry.

We are in a different series where this step may or may not make sense.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 05/29] drm/i915: Store the default sseu setup on the engine
  2019-04-08 15:57     ` Chris Wilson
@ 2019-04-08 16:04       ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-08 16:04 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 16:57, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-08 15:54:01)
>>
>> On 08/04/2019 10:17, Chris Wilson wrote:
>>> As we push for better compartmentalisation, it is more convenient to
>>> copy the default sseu configuration from the engine into the derived
>>> logical context, than it is to dig it out from i915->runtime_info.
>>
>> Why it is more convenient?
> 
> Read on in the series.
> 
>>> +static inline struct intel_sseu
>>> +intel_device_default_sseu(const struct sseu_dev_info *sseu)
>>> +{
>>
>> I think function parameter change is dangerous. If caller can pass in
>> any pointer then it is slightly less "intel_*device_default*_sseu. It is
>> more a convert and copy from device sseu to intel_sseu. Like
>> intel_convert_device_sseu.
>>
>> Although if the function just kept taking i915 then the dilemma would
>> not even be there.
> 
> What i915? How about if it took the parameter that described the device
> and converted it into the internal format?

I don't know what i915 - the function name says "default". If the caller 
passes in a pointer it is not a default but could be anything. So you 
can go with my first alternative, which is to give the function a better 
name, if you want to stick with the parameter change.
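
For illustration, something like this sketch is what I mean (conversion
body assumed to stay as in the patch; field names from the tree at this
point):

static inline struct intel_sseu
intel_convert_device_sseu(const struct sseu_dev_info *sseu)
{
	return (struct intel_sseu) {
		.slice_mask = sseu->slice_mask,
		.subslice_mask = sseu->subslice_mask[0],
		.min_eus_per_subslice = sseu->max_eus_per_subslice,
		.max_eus_per_subslice = sseu->max_eus_per_subslice,
	};
}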

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 07/29] drm/i915: Only reset the pinned kernel contexts on resume
  2019-04-08  9:17 ` [PATCH 07/29] drm/i915: Only reset the pinned kernel contexts on resume Chris Wilson
@ 2019-04-10  9:39   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10  9:39 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> On resume, we know that the only pinned contexts in danger of seeing
> corruption are the kernel context, and so we do not need to walk the
> list of all GEM contexts as we tracked them on each engine.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
>   drivers/gpu/drm/i915/gt/intel_engine.h        |  3 +-
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 18 +++++++
>   drivers/gpu/drm/i915/gt/intel_lrc.c           | 49 +++++++++----------
>   drivers/gpu/drm/i915/gt/intel_lrc.h           |  1 -
>   drivers/gpu/drm/i915/gt/intel_ringbuffer.c    | 17 +++----
>   drivers/gpu/drm/i915/i915_drv.h               |  1 -
>   drivers/gpu/drm/i915/i915_gem.c               |  9 ++--
>   8 files changed, 54 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index 2109610e21e8..9ec4f787c908 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -25,6 +25,7 @@ struct intel_context_ops {
>   	int (*pin)(struct intel_context *ce);
>   	void (*unpin)(struct intel_context *ce);
>   
> +	void (*reset)(struct intel_context *ce);
>   	void (*destroy)(struct kref *kref);
>   };
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index e58d6f04177b..0dea6c7fd438 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -268,8 +268,6 @@ static inline void intel_ring_put(struct intel_ring *ring)
>   void intel_engine_stop(struct intel_engine_cs *engine);
>   void intel_engine_cleanup(struct intel_engine_cs *engine);
>   
> -void intel_legacy_submission_resume(struct drm_i915_private *dev_priv);
> -
>   int __must_check intel_ring_cacheline_align(struct i915_request *rq);
>   
>   u32 __must_check *intel_ring_begin(struct i915_request *rq, unsigned int n);
> @@ -463,6 +461,7 @@ static inline void intel_engine_reset(struct intel_engine_cs *engine,
>   }
>   
>   void intel_engines_sanitize(struct drm_i915_private *i915, bool force);
> +void intel_gt_resume(struct drm_i915_private *i915);
>   
>   bool intel_engine_is_idle(struct intel_engine_cs *engine);
>   bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 68eebf292314..fdfce11e4944 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -758,6 +758,24 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
>   	return ret;
>   }
>   
> +void intel_gt_resume(struct drm_i915_private *i915)
> +{
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +
> +	/*
> +	 * After resume, we may need to poke into the pinned kernel
> +	 * contexts to paper over any damage caused by the sudden suspend.
> +	 * Only the kernel contexts should remain pinned over suspend,
> +	 * allowing us to fixup the user contexts on their first pin.
> +	 */
> +	for_each_engine(engine, i915, id) {
> +		struct intel_context *ce = engine->kernel_context;
> +
> +		ce->ops->reset(ce);
> +	}
> +}
> +
>   /**
>    * intel_engines_cleanup_common - cleans up the engine state created by
>    *                                the common initializers.
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 85bdae1db76c..1565b11b15f6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1378,9 +1378,33 @@ static int execlists_context_pin(struct intel_context *ce)
>   	return __execlists_context_pin(ce, ce->engine);
>   }
>   
> +static void execlists_context_reset(struct intel_context *ce)
> +{
> +	/*
> +	 * Because we emit WA_TAIL_DWORDS there may be a disparity
> +	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
> +	 * that stored in context. As we only write new commands from
> +	 * ce->ring->tail onwards, everything before that is junk. If the GPU
> +	 * starts reading from its RING_HEAD from the context, it may try to
> +	 * execute that junk and die.
> +	 *
> +	 * The contexts that are still pinned on resume belong to the
> +	 * kernel, and are local to each engine. All other contexts will
> +	 * have their head/tail sanitized upon pinning before use, so they
> +	 * will never see garbage.
> +	 *
> +	 * So to avoid that we reset the context images upon resume. For
> +	 * simplicity, we just zero everything out.
> +	 */
> +	intel_ring_reset(ce->ring, 0);
> +	__execlists_update_reg_state(ce, ce->engine);
> +}
> +
>   static const struct intel_context_ops execlists_context_ops = {
>   	.pin = execlists_context_pin,
>   	.unpin = execlists_context_unpin,
> +
> +	.reset = execlists_context_reset,
>   	.destroy = execlists_context_destroy,
>   };
>   
> @@ -2762,31 +2786,6 @@ static int execlists_context_deferred_alloc(struct intel_context *ce,
>   	return ret;
>   }
>   
> -void intel_lr_context_resume(struct drm_i915_private *i915)
> -{
> -	struct i915_gem_context *ctx;
> -	struct intel_context *ce;
> -
> -	/*
> -	 * Because we emit WA_TAIL_DWORDS there may be a disparity
> -	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
> -	 * that stored in context. As we only write new commands from
> -	 * ce->ring->tail onwards, everything before that is junk. If the GPU
> -	 * starts reading from its RING_HEAD from the context, it may try to
> -	 * execute that junk and die.
> -	 *
> -	 * So to avoid that we reset the context images upon resume. For
> -	 * simplicity, we just zero everything out.
> -	 */
> -	list_for_each_entry(ctx, &i915->contexts.list, link) {
> -		list_for_each_entry(ce, &ctx->active_engines, active_link) {
> -			GEM_BUG_ON(!ce->ring);
> -			intel_ring_reset(ce->ring, 0);
> -			__execlists_update_reg_state(ce, ce->engine);
> -		}
> -	}
> -}
> -
>   void intel_execlists_show_requests(struct intel_engine_cs *engine,
>   				   struct drm_printer *m,
>   				   void (*show_request)(struct drm_printer *m,
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
> index b3274e707423..a598f2d56de3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
> @@ -100,7 +100,6 @@ struct drm_printer;
>   
>   struct drm_i915_private;
>   
> -void intel_lr_context_resume(struct drm_i915_private *dev_priv);
>   void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
>   
>   void intel_execlists_show_requests(struct intel_engine_cs *engine,
> diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> index be83cb8167f3..bcc842b8bbe7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> @@ -1507,9 +1507,16 @@ static int ring_context_pin(struct intel_context *ce)
>   	return err;
>   }
>   
> +static void ring_context_reset(struct intel_context *ce)
> +{
> +	intel_ring_reset(ce->ring, 0);
> +}
> +
>   static const struct intel_context_ops ring_context_ops = {
>   	.pin = ring_context_pin,
>   	.unpin = ring_context_unpin,
> +
> +	.reset = ring_context_reset,
>   	.destroy = ring_context_destroy,
>   };
>   
> @@ -1580,16 +1587,6 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
>   	kfree(engine);
>   }
>   
> -void intel_legacy_submission_resume(struct drm_i915_private *dev_priv)
> -{
> -	struct intel_engine_cs *engine;
> -	enum intel_engine_id id;
> -
> -	/* Restart from the beginning of the rings for convenience */
> -	for_each_engine(engine, dev_priv, id)
> -		intel_ring_reset(engine->buffer, 0);
> -}
> -
>   static int load_pd_dir(struct i915_request *rq,
>   		       const struct i915_hw_ppgtt *ppgtt)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 312278723712..fad5306f07da 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1996,7 +1996,6 @@ struct drm_i915_private {
>   
>   	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
>   	struct {
> -		void (*resume)(struct drm_i915_private *);
>   		void (*cleanup_engine)(struct intel_engine_cs *engine);
>   
>   		struct i915_gt_timelines {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0aa28d622d2a..a8b0aa05288b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4514,7 +4514,7 @@ void i915_gem_resume(struct drm_i915_private *i915)
>   	 * guarantee that the context image is complete. So let's just reset
>   	 * it and start again.
>   	 */
> -	i915->gt.resume(i915);
> +	intel_gt_resume(i915);
>   
>   	if (i915_gem_init_hw(i915))
>   		goto err_wedged;
> @@ -4854,13 +4854,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>   
>   	dev_priv->mm.unordered_timeline = dma_fence_context_alloc(1);
>   
> -	if (HAS_LOGICAL_RING_CONTEXTS(dev_priv)) {
> -		dev_priv->gt.resume = intel_lr_context_resume;
> +	if (HAS_LOGICAL_RING_CONTEXTS(dev_priv))
>   		dev_priv->gt.cleanup_engine = intel_logical_ring_cleanup;
> -	} else {
> -		dev_priv->gt.resume = intel_legacy_submission_resume;
> +	else
>   		dev_priv->gt.cleanup_engine = intel_engine_cleanup;
> -	}
>   
>   	i915_timelines_init(dev_priv);
>   
> 
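
To make the head/tail disparity concrete (numbers invented purely for
illustration):

/*
 * Bookkeeping says the next command is emitted at ce->ring->tail =
 * 0x100, while the WA_TAIL_DWORDS padding left RING_TAIL in the
 * context image at 0x110. Nothing before ce->ring->tail is guaranteed
 * valid after suspend, so if the GPU restarts from the image's stale
 * RING_HEAD it may execute junk. Zeroing the ring via
 * intel_ring_reset(ce->ring, 0) and rewriting the register state
 * removes the disparity for the still-pinned kernel contexts.
 */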

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 08/29] drm/i915: Introduce struct intel_wakeref
  2019-04-08  9:17 ` [PATCH 08/29] drm/i915: Introduce struct intel_wakeref Chris Wilson
@ 2019-04-10  9:49   ` Tvrtko Ursulin
  2019-04-10 10:01     ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10  9:49 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> For controlling runtime pm of the GT and engines, we would like to have
> a callback to do extra work the first time we wake up and the last time
> we drop the wakeref. This first/last access needs serialisation and so
> we encompass a mutex with the regular intel_wakeref_t tracker.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile             |  1 +
>   drivers/gpu/drm/i915/Makefile.header-test |  3 +-
>   drivers/gpu/drm/i915/i915_drv.h           |  3 +-
>   drivers/gpu/drm/i915/intel_wakeref.c      | 51 +++++++++++++++
>   drivers/gpu/drm/i915/intel_wakeref.h      | 77 +++++++++++++++++++++++
>   5 files changed, 132 insertions(+), 3 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/intel_wakeref.c
>   create mode 100644 drivers/gpu/drm/i915/intel_wakeref.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 40130cf5c003..233bad5e361f 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -50,6 +50,7 @@ i915-y += i915_drv.o \
>   	  intel_device_info.o \
>   	  intel_pm.o \
>   	  intel_runtime_pm.o \
> +	  intel_wakeref.o \
>   	  intel_uncore.o
>   
>   # core library code
> diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
> index 96a5d90629ec..e6b3e7588860 100644
> --- a/drivers/gpu/drm/i915/Makefile.header-test
> +++ b/drivers/gpu/drm/i915/Makefile.header-test
> @@ -31,7 +31,8 @@ header_test := \
>   	intel_psr.h \
>   	intel_sdvo.h \
>   	intel_sprite.h \
> -	intel_tv.h
> +	intel_tv.h \
> +	intel_wakeref.h
>   
>   quiet_cmd_header_test = HDRTEST $@
>         cmd_header_test = echo "\#include \"$(<F)\"" > $@
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index fad5306f07da..62a7e91acd7f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -74,6 +74,7 @@
>   #include "intel_opregion.h"
>   #include "intel_uc.h"
>   #include "intel_uncore.h"
> +#include "intel_wakeref.h"
>   #include "intel_wopcm.h"
>   
>   #include "i915_gem.h"
> @@ -134,8 +135,6 @@ bool i915_error_injected(void);
>   	__i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
>   		      fmt, ##__VA_ARGS__)
>   
> -typedef depot_stack_handle_t intel_wakeref_t;
> -
>   enum hpd_pin {
>   	HPD_NONE = 0,
>   	HPD_TV = HPD_NONE,     /* TV is known to be unreliable */
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
> new file mode 100644
> index 000000000000..c133c91cf277
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> @@ -0,0 +1,51 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "intel_drv.h"
> +#include "intel_wakeref.h"
> +
> +void __intel_wakeref_get_once(struct drm_i915_private *i915,
> +			      struct intel_wakeref_count *wc,
> +			      bool (*fn)(struct intel_wakeref_count *wc))
> +{
> +	/*
> +	 * Treat get/put as different subclasses, as we may need to run
> +	 * the put callback from under the shrinker and do not want to
> +	 * cross-contaminate that callback with any extra work performed
> +	 * upon acquiring the wakeref.
> +	 */
> +	mutex_lock_nested(&wc->mutex, SINGLE_DEPTH_NESTING);
> +	if (!atomic_read(&wc->count)) {
> +		wc->wakeref = intel_runtime_pm_get(i915);
> +		if (unlikely(!fn(wc))) {
> +			intel_runtime_pm_put(i915, wc->wakeref);
> +			mutex_unlock(&wc->mutex);
> +			return;

Feels like a bad API that the callback's failure gets silently ignored. 
I mean, from the caller's perspective - how do they know whether the 
wakeref get succeeded or not?

> +		}
> +
> +		smp_mb__before_atomic(); /* release wc->count */
> +	}
> +	atomic_inc(&wc->count);
> +	mutex_unlock(&wc->mutex);
> +}
> +
> +void __intel_wakeref_put_once(struct drm_i915_private *i915,
> +			      struct intel_wakeref_count *wc,
> +			      bool (*fn)(struct intel_wakeref_count *wc))
> +{
> +	if (likely(fn(wc)))
> +		intel_runtime_pm_put(i915, wc->wakeref);
> +	else
> +		atomic_inc(&wc->count);
> +	mutex_unlock(&wc->mutex);
> +}
> +
> +void __intel_wakeref_init(struct intel_wakeref_count *wc,
> +			  struct lock_class_key *key)
> +{
> +	__mutex_init(&wc->mutex, "wakeref", key);
> +	atomic_set(&wc->count, 0);
> +}
> diff --git a/drivers/gpu/drm/i915/intel_wakeref.h b/drivers/gpu/drm/i915/intel_wakeref.h
> new file mode 100644
> index 000000000000..cd4d8f57e502
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_wakeref.h
> @@ -0,0 +1,77 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef INTEL_WAKEREF_H
> +#define INTEL_WAKEREF_H
> +
> +#include <linux/atomic.h>
> +#include <linux/mutex.h>
> +#include <linux/stackdepot.h>
> +
> +struct drm_i915_private;
> +
> +typedef depot_stack_handle_t intel_wakeref_t;
> +
> +struct intel_wakeref_count {
> +	atomic_t count;
> +	struct mutex mutex;
> +	intel_wakeref_t wakeref;
> +};
> +
> +void __intel_wakeref_init(struct intel_wakeref_count *wc,
> +			  struct lock_class_key *key);
> +#define intel_wakeref_init(wc) do {					\
> +	static struct lock_class_key __key;				\
> +									\
> +	__intel_wakeref_init((wc), &__key);				\
> +} while (0)
> +
> +void __intel_wakeref_get_once(struct drm_i915_private *i915,
> +			      struct intel_wakeref_count *wc,
> +			      bool (*fn)(struct intel_wakeref_count *wc));
> +void __intel_wakeref_put_once(struct drm_i915_private *i915,
> +			      struct intel_wakeref_count *wc,
> +			      bool (*fn)(struct intel_wakeref_count *wc));
> +
> +static inline void
> +intel_wakeref_get_once(struct drm_i915_private *i915,
> +		       struct intel_wakeref_count *wc,
> +		       bool (*fn)(struct intel_wakeref_count *wc))

What does the "once" suffix refer to here? That it will only call the 
callback once? Does that need to be in the function name, and couldn't 
it be documented in kerneldoc instead, given that there is no 
non-suffixed version?

> +{
> +	if (unlikely(!atomic_inc_not_zero(&wc->count)))
> +		__intel_wakeref_get_once(i915, wc, fn);
> +}
> +
> +static inline void
> +intel_wakeref_put_once(struct drm_i915_private *i915,
> +		       struct intel_wakeref_count *wc,
> +		       bool (*fn)(struct intel_wakeref_count *wc))
> +{
> +	if (atomic_dec_and_mutex_lock(&wc->count, &wc->mutex))
> +		__intel_wakeref_put_once(i915, wc, fn);
> +}
> +
> +static inline void
> +intel_wakeref_lock(struct intel_wakeref_count *wc)
> +	__acquires(wc->mutex)
> +{
> +	mutex_lock(&wc->mutex);
> +}
> +
> +static inline void
> +intel_wakeref_unlock(struct intel_wakeref_count *wc)
> +	__releases(wc->mutex)
> +{
> +	mutex_unlock(&wc->mutex);
> +}
> +
> +static inline bool
> +intel_wakeref_active(struct intel_wakeref_count *wc)
> +{
> +	return atomic_read(&wc->count);
> +}
> +
> +#endif /* INTEL_WAKEREF_H */
> 
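
For reference, my reading of the intended usage from the header above
(sketch only; the callback names are invented):

static bool example_unpark(struct intel_wakeref_count *wc)
{
	/* Runs once on 0 -> 1, under wc->mutex, with rpm already held. */
	return true; /* returning false silently aborts the get */
}

static bool example_park(struct intel_wakeref_count *wc)
{
	/* Runs once on 1 -> 0, under wc->mutex. */
	return true; /* returning false keeps the wakeref alive */
}

static void example_use(struct drm_i915_private *i915,
			struct intel_wakeref_count *wc)
{
	intel_wakeref_get_once(i915, wc, example_unpark);
	/* ... hardware access, guaranteed awake ... */
	intel_wakeref_put_once(i915, wc, example_park);
}

Note the fast path: atomic_inc_not_zero() takes a reference without the
mutex whenever the count is already non-zero; only the 0 <-> 1
transitions serialise on wc->mutex. But with the shape above, a failed
example_unpark() leaves the caller none the wiser, which is my point.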

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 09/29] drm/i915: Pull the GEM powermanagement coupling into its own file
  2019-04-08  9:17 ` [PATCH 09/29] drm/i915: Pull the GEM powermangement coupling into its own file Chris Wilson
  2019-04-08 14:56   ` Tvrtko Ursulin
@ 2019-04-10  9:57   ` Tvrtko Ursulin
  1 sibling, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10  9:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Split out the powermanagement portion (GT wakeref, suspend/resume) of
> GEM from i915_gem.c into its own file.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 +
>   drivers/gpu/drm/i915/Makefile.header-test     |   1 +
>   drivers/gpu/drm/i915/i915_debugfs.c           |   4 +-
>   drivers/gpu/drm/i915/i915_drv.h               |  12 +-
>   drivers/gpu/drm/i915/i915_gem.c               | 363 +----------------
>   drivers/gpu/drm/i915/i915_gem_pm.c            | 365 ++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_pm.h            |  28 ++
>   .../gpu/drm/i915/selftests/i915_gem_context.c |   2 +-
>   .../gpu/drm/i915/selftests/i915_gem_object.c  |   8 +-
>   .../gpu/drm/i915/selftests/mock_gem_device.c  |  10 +-
>   10 files changed, 418 insertions(+), 376 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.c
>   create mode 100644 drivers/gpu/drm/i915/i915_gem_pm.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 233bad5e361f..858642c7bc40 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -97,6 +97,7 @@ i915-y += \
>   	  i915_gem_internal.o \
>   	  i915_gem.o \
>   	  i915_gem_object.o \
> +	  i915_gem_pm.o \
>   	  i915_gem_render_state.o \
>   	  i915_gem_shrinker.o \
>   	  i915_gem_stolen.o \
> diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
> index e6b3e7588860..702e3a7ade4c 100644
> --- a/drivers/gpu/drm/i915/Makefile.header-test
> +++ b/drivers/gpu/drm/i915/Makefile.header-test
> @@ -5,6 +5,7 @@
>   header_test := \
>   	i915_active_types.h \
>   	i915_gem_context_types.h \
> +	i915_gem_pm.h \
>   	i915_priolist_types.h \
>   	i915_scheduler_types.h \
>   	i915_timeline_types.h \
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index eac4095278b2..e3eca88614b2 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3941,8 +3941,8 @@ i915_drop_caches_set(void *data, u64 val)
>   	if (val & DROP_IDLE) {
>   		do {
>   			if (READ_ONCE(i915->gt.active_requests))
> -				flush_delayed_work(&i915->gt.retire_work);
> -			drain_delayed_work(&i915->gt.idle_work);
> +				flush_delayed_work(&i915->gem.retire_work);
> +			drain_delayed_work(&i915->gem.idle_work);
>   		} while (READ_ONCE(i915->gt.awake));
>   	}
>   
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 62a7e91acd7f..cbae9be052e0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2020,6 +2020,12 @@ struct drm_i915_private {
>   		 */
>   		intel_wakeref_t awake;
>   
> +		ktime_t last_init_time;
> +
> +		struct i915_vma *scratch;
> +	} gt;
> +
> +	struct {
>   		/**
>   		 * We leave the user IRQ off as much as possible,
>   		 * but this means that requests will finish and never
> @@ -2037,11 +2043,7 @@ struct drm_i915_private {
>   		 * off the idle_work.
>   		 */
>   		struct delayed_work idle_work;
> -
> -		ktime_t last_init_time;
> -
> -		struct i915_vma *scratch;
> -	} gt;
> +	} gem;
>   
>   	/* For i945gm vblank irq vs. C3 workaround */
>   	struct {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a8b0aa05288b..c56633e87c32 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -46,7 +46,7 @@
>   #include "i915_drv.h"
>   #include "i915_gem_clflush.h"
>   #include "i915_gemfs.h"
> -#include "i915_globals.h"
> +#include "i915_gem_pm.h"
>   #include "i915_trace.h"
>   #include "i915_vgpu.h"
>   
> @@ -103,105 +103,6 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
>   	spin_unlock(&dev_priv->mm.object_stat_lock);
>   }
>   
> -static void __i915_gem_park(struct drm_i915_private *i915)
> -{
> -	intel_wakeref_t wakeref;
> -
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> -
> -	if (!i915->gt.awake)
> -		return;
> -
> -	/*
> -	 * Be paranoid and flush a concurrent interrupt to make sure
> -	 * we don't reactivate any irq tasklets after parking.
> -	 *
> -	 * FIXME: Note that even though we have waited for execlists to be idle,
> -	 * there may still be an in-flight interrupt even though the CSB
> -	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> -	 * is completed before we continue, but it doesn't prevent the HW from
> -	 * raising a spurious interrupt later. To complete the shield we should
> -	 * coordinate disabling the CS irq with flushing the interrupts.
> -	 */
> -	synchronize_irq(i915->drm.irq);
> -
> -	intel_engines_park(i915);
> -	i915_timelines_park(i915);
> -
> -	i915_pmu_gt_parked(i915);
> -	i915_vma_parked(i915);
> -
> -	wakeref = fetch_and_zero(&i915->gt.awake);
> -	GEM_BUG_ON(!wakeref);
> -
> -	if (INTEL_GEN(i915) >= 6)
> -		gen6_rps_idle(i915);
> -
> -	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> -
> -	i915_globals_park();
> -}
> -
> -void i915_gem_park(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(i915->gt.active_requests);
> -
> -	if (!i915->gt.awake)
> -		return;
> -
> -	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> -	mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
> -}
> -
> -void i915_gem_unpark(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -	GEM_BUG_ON(!i915->gt.active_requests);
> -	assert_rpm_wakelock_held(i915);
> -
> -	if (i915->gt.awake)
> -		return;
> -
> -	/*
> -	 * It seems that the DMC likes to transition between the DC states a lot
> -	 * when there are no connected displays (no active power domains) during
> -	 * command submission.
> -	 *
> -	 * This activity has negative impact on the performance of the chip with
> -	 * huge latencies observed in the interrupt handler and elsewhere.
> -	 *
> -	 * Work around it by grabbing a GT IRQ power domain whilst there is any
> -	 * GT activity, preventing any DC state transitions.
> -	 */
> -	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> -	GEM_BUG_ON(!i915->gt.awake);
> -
> -	i915_globals_unpark();
> -
> -	intel_enable_gt_powersave(i915);
> -	i915_update_gfx_val(i915);
> -	if (INTEL_GEN(i915) >= 6)
> -		gen6_rps_busy(i915);
> -	i915_pmu_gt_unparked(i915);
> -
> -	intel_engines_unpark(i915);
> -
> -	i915_queue_hangcheck(i915);
> -
> -	queue_delayed_work(i915->wq,
> -			   &i915->gt.retire_work,
> -			   round_jiffies_up_relative(HZ));
> -}
> -
>   int
>   i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>   			    struct drm_file *file)
> @@ -2088,7 +1989,7 @@ static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
>   		if (!err)
>   			break;
>   
> -	} while (flush_delayed_work(&dev_priv->gt.retire_work));
> +	} while (flush_delayed_work(&dev_priv->gem.retire_work));
>   
>   	return err;
>   }
> @@ -2871,132 +2772,6 @@ i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
>   	return 0;
>   }
>   
> -static void
> -i915_gem_retire_work_handler(struct work_struct *work)
> -{
> -	struct drm_i915_private *dev_priv =
> -		container_of(work, typeof(*dev_priv), gt.retire_work.work);
> -	struct drm_device *dev = &dev_priv->drm;
> -
> -	/* Come back later if the device is busy... */
> -	if (mutex_trylock(&dev->struct_mutex)) {
> -		i915_retire_requests(dev_priv);
> -		mutex_unlock(&dev->struct_mutex);
> -	}
> -
> -	/*
> -	 * Keep the retire handler running until we are finally idle.
> -	 * We do not need to do this test under locking as in the worst-case
> -	 * we queue the retire worker once too often.
> -	 */
> -	if (READ_ONCE(dev_priv->gt.awake))
> -		queue_delayed_work(dev_priv->wq,
> -				   &dev_priv->gt.retire_work,
> -				   round_jiffies_up_relative(HZ));
> -}
> -
> -static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> -					  unsigned long mask)
> -{
> -	bool result = true;
> -
> -	/*
> -	 * Even if we fail to switch, give whatever is running a small chance
> -	 * to save itself before we report the failure. Yes, this may be a
> -	 * false positive due to e.g. ENOMEM, caveat emptor!
> -	 */
> -	if (i915_gem_switch_to_kernel_context(i915, mask))
> -		result = false;
> -
> -	if (i915_gem_wait_for_idle(i915,
> -				   I915_WAIT_LOCKED |
> -				   I915_WAIT_FOR_IDLE_BOOST,
> -				   I915_GEM_IDLE_TIMEOUT))
> -		result = false;
> -
> -	if (!result) {
> -		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> -			dev_err(i915->drm.dev,
> -				"Failed to idle engines, declaring wedged!\n");
> -			GEM_TRACE_DUMP();
> -		}
> -
> -		/* Forcibly cancel outstanding work and leave the gpu quiet. */
> -		i915_gem_set_wedged(i915);
> -	}
> -
> -	i915_retire_requests(i915); /* ensure we flush after wedging */
> -	return result;
> -}
> -
> -static bool load_power_context(struct drm_i915_private *i915)
> -{
> -	/* Force loading the kernel context on all engines */
> -	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> -		return false;
> -
> -	/*
> -	 * Immediately park the GPU so that we enable powersaving and
> -	 * treat it as idle. The next time we issue a request, we will
> -	 * unpark and start using the engine->pinned_default_state, otherwise
> -	 * it is in limbo and an early reset may fail.
> -	 */
> -	__i915_gem_park(i915);
> -
> -	return true;
> -}
> -
> -static void
> -i915_gem_idle_work_handler(struct work_struct *work)
> -{
> -	struct drm_i915_private *i915 =
> -		container_of(work, typeof(*i915), gt.idle_work.work);
> -	bool rearm_hangcheck;
> -
> -	if (!READ_ONCE(i915->gt.awake))
> -		return;
> -
> -	if (READ_ONCE(i915->gt.active_requests))
> -		return;
> -
> -	rearm_hangcheck =
> -		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> -
> -	if (!mutex_trylock(&i915->drm.struct_mutex)) {
> -		/* Currently busy, come back later */
> -		mod_delayed_work(i915->wq,
> -				 &i915->gt.idle_work,
> -				 msecs_to_jiffies(50));
> -		goto out_rearm;
> -	}
> -
> -	/*
> -	 * Flush out the last user context, leaving only the pinned
> -	 * kernel context resident. Should anything unfortunate happen
> -	 * while we are idle (such as the GPU being power cycled), no users
> -	 * will be harmed.
> -	 */
> -	if (!work_pending(&i915->gt.idle_work.work) &&
> -	    !i915->gt.active_requests) {
> -		++i915->gt.active_requests; /* don't requeue idle */
> -
> -		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -		if (!--i915->gt.active_requests) {
> -			__i915_gem_park(i915);
> -			rearm_hangcheck = false;
> -		}
> -	}
> -
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
> -out_rearm:
> -	if (rearm_hangcheck) {
> -		GEM_BUG_ON(!i915->gt.awake);
> -		i915_queue_hangcheck(i915);
> -	}
> -}
> -
>   void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
>   {
>   	struct drm_i915_private *i915 = to_i915(gem->dev);
> @@ -4412,133 +4187,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
>   	mutex_unlock(&i915->drm.struct_mutex);
>   }
>   
> -void i915_gem_suspend(struct drm_i915_private *i915)
> -{
> -	intel_wakeref_t wakeref;
> -
> -	GEM_TRACE("\n");
> -
> -	wakeref = intel_runtime_pm_get(i915);
> -
> -	flush_workqueue(i915->wq);
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -
> -	/*
> -	 * We have to flush all the executing contexts to main memory so
> -	 * that they can be saved in the hibernation image. To ensure the last
> -	 * context image is coherent, we have to switch away from it. That
> -	 * leaves the i915->kernel_context still active when
> -	 * we actually suspend, and its image in memory may not match the GPU
> -	 * state. Fortunately, the kernel_context is disposable and we do
> -	 * not rely on its state.
> -	 */
> -	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> -
> -	mutex_unlock(&i915->drm.struct_mutex);
> -	i915_reset_flush(i915);
> -
> -	drain_delayed_work(&i915->gt.retire_work);
> -
> -	/*
> -	 * As the idle_work is rearming if it detects a race, play safe and
> -	 * repeat the flush until it is definitely idle.
> -	 */
> -	drain_delayed_work(&i915->gt.idle_work);
> -
> -	/*
> -	 * Assert that we successfully flushed all the work and
> -	 * reset the GPU back to its idle, low power state.
> -	 */
> -	GEM_BUG_ON(i915->gt.awake);
> -
> -	intel_uc_suspend(i915);
> -
> -	intel_runtime_pm_put(i915, wakeref);
> -}
> -
> -void i915_gem_suspend_late(struct drm_i915_private *i915)
> -{
> -	struct drm_i915_gem_object *obj;
> -	struct list_head *phases[] = {
> -		&i915->mm.unbound_list,
> -		&i915->mm.bound_list,
> -		NULL
> -	}, **phase;
> -
> -	/*
> -	 * Neither the BIOS, ourselves, nor any other kernel
> -	 * expects the system to be in execlists mode on startup,
> -	 * so we need to reset the GPU back to legacy mode. And the only
> -	 * known way to disable logical contexts is through a GPU reset.
> -	 *
> -	 * So in order to leave the system in a known default configuration,
> -	 * always reset the GPU upon unload and suspend. Afterwards we then
> -	 * clean up the GEM state tracking, flushing off the requests and
> -	 * leaving the system in a known idle state.
> -	 *
> -	 * Note that it is of the utmost importance that the GPU is idle and
> -	 * all stray writes are flushed *before* we dismantle the backing
> -	 * storage for the pinned objects.
> -	 *
> -	 * However, since we are uncertain that resetting the GPU on older
> -	 * machines is a good idea, we don't - just in case it leaves the
> -	 * machine in an unusable condition.
> -	 */
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -	for (phase = phases; *phase; phase++) {
> -		list_for_each_entry(obj, *phase, mm.link)
> -			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> -	}
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
> -	intel_uc_sanitize(i915);
> -	i915_gem_sanitize(i915);
> -}
> -
> -void i915_gem_resume(struct drm_i915_private *i915)
> -{
> -	GEM_TRACE("\n");
> -
> -	WARN_ON(i915->gt.awake);
> -
> -	mutex_lock(&i915->drm.struct_mutex);
> -	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> -
> -	i915_gem_restore_gtt_mappings(i915);
> -	i915_gem_restore_fences(i915);
> -
> -	/*
> -	 * As we didn't flush the kernel context before suspend, we cannot
> -	 * guarantee that the context image is complete. So let's just reset
> -	 * it and start again.
> -	 */
> -	intel_gt_resume(i915);
> -
> -	if (i915_gem_init_hw(i915))
> -		goto err_wedged;
> -
> -	intel_uc_resume(i915);
> -
> -	/* Always reload a context for powersaving. */
> -	if (!load_power_context(i915))
> -		goto err_wedged;
> -
> -out_unlock:
> -	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> -	mutex_unlock(&i915->drm.struct_mutex);
> -	return;
> -
> -err_wedged:
> -	if (!i915_reset_failed(i915)) {
> -		dev_err(i915->drm.dev,
> -			"Failed to re-initialize GPU, declaring it wedged!\n");
> -		i915_gem_set_wedged(i915);
> -	}
> -	goto out_unlock;
> -}
> -
>   void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
>   {
>   	if (INTEL_GEN(dev_priv) < 5 ||
> @@ -4721,7 +4369,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	}
>   
>   	/* Flush the default context image to memory, and enable powersaving. */
> -	if (!load_power_context(i915)) {
> +	if (!i915_gem_load_power_context(i915)) {
>   		err = -EIO;
>   		goto err_active;
>   	}
> @@ -5115,11 +4763,8 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
>   	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
>   
>   	i915_gem_init__mm(dev_priv);
> +	i915_gem_init__pm(dev_priv);
>   
> -	INIT_DELAYED_WORK(&dev_priv->gt.retire_work,
> -			  i915_gem_retire_work_handler);
> -	INIT_DELAYED_WORK(&dev_priv->gt.idle_work,
> -			  i915_gem_idle_work_handler);
>   	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
>   	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
>   	mutex_init(&dev_priv->gpu_error.wedge_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.c b/drivers/gpu/drm/i915/i915_gem_pm.c
> new file mode 100644
> index 000000000000..bc0663cac7b6
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.c
> @@ -0,0 +1,365 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "i915_gem_pm.h"
> +#include "i915_globals.h"
> +#include "intel_pm.h"
> +
> +static void __i915_gem_park(struct drm_i915_private *i915)
> +{
> +	intel_wakeref_t wakeref;
> +
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(i915->gt.active_requests);
> +	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
> +
> +	if (!i915->gt.awake)
> +		return;
> +
> +	/*
> +	 * Be paranoid and flush a concurrent interrupt to make sure
> +	 * we don't reactivate any irq tasklets after parking.
> +	 *
> +	 * FIXME: Note that even though we have waited for execlists to be idle,
> +	 * there may still be an in-flight interrupt even though the CSB
> +	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> +	 * is completed before we continue, but it doesn't prevent the HW from
> +	 * raising a spurious interrupt later. To complete the shield we should
> +	 * coordinate disabling the CS irq with flushing the interrupts.
> +	 */
> +	synchronize_irq(i915->drm.irq);
> +
> +	intel_engines_park(i915);
> +	i915_timelines_park(i915);
> +
> +	i915_pmu_gt_parked(i915);
> +	i915_vma_parked(i915);
> +
> +	wakeref = fetch_and_zero(&i915->gt.awake);
> +	GEM_BUG_ON(!wakeref);
> +
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_idle(i915);
> +
> +	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
> +
> +	i915_globals_park();
> +}
> +
> +static bool switch_to_kernel_context_sync(struct drm_i915_private *i915,
> +					  unsigned long mask)
> +{
> +	bool result = true;
> +
> +	/*
> +	 * Even if we fail to switch, give whatever is running a small chance
> +	 * to save itself before we report the failure. Yes, this may be a
> +	 * false positive due to e.g. ENOMEM, caveat emptor!
> +	 */
> +	if (i915_gem_switch_to_kernel_context(i915, mask))
> +		result = false;
> +
> +	if (i915_gem_wait_for_idle(i915,
> +				   I915_WAIT_LOCKED |
> +				   I915_WAIT_FOR_IDLE_BOOST,
> +				   I915_GEM_IDLE_TIMEOUT))
> +		result = false;
> +
> +	if (!result) {
> +		if (i915_modparams.reset) { /* XXX hide warning from gem_eio */
> +			dev_err(i915->drm.dev,
> +				"Failed to idle engines, declaring wedged!\n");
> +			GEM_TRACE_DUMP();
> +		}
> +
> +		/* Forcibly cancel outstanding work and leave the gpu quiet. */
> +		i915_gem_set_wedged(i915);
> +	}
> +
> +	i915_retire_requests(i915); /* ensure we flush after wedging */
> +	return result;
> +}
> +
> +static void idle_work_handler(struct work_struct *work)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(work, typeof(*i915), gem.idle_work.work);
> +	bool rearm_hangcheck;
> +
> +	if (!READ_ONCE(i915->gt.awake))
> +		return;
> +
> +	if (READ_ONCE(i915->gt.active_requests))
> +		return;
> +
> +	rearm_hangcheck =
> +		cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> +
> +	if (!mutex_trylock(&i915->drm.struct_mutex)) {
> +		/* Currently busy, come back later */
> +		mod_delayed_work(i915->wq,
> +				 &i915->gem.idle_work,
> +				 msecs_to_jiffies(50));
> +		goto out_rearm;
> +	}
> +
> +	/*
> +	 * Flush out the last user context, leaving only the pinned
> +	 * kernel context resident. Should anything unfortunate happen
> +	 * while we are idle (such as the GPU being power cycled), no users
> +	 * will be harmed.
> +	 */
> +	if (!work_pending(&i915->gem.idle_work.work) &&
> +	    !i915->gt.active_requests) {
> +		++i915->gt.active_requests; /* don't requeue idle */
> +
> +		switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +		if (!--i915->gt.active_requests) {
> +			__i915_gem_park(i915);
> +			rearm_hangcheck = false;
> +		}
> +	}
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +out_rearm:
> +	if (rearm_hangcheck) {
> +		GEM_BUG_ON(!i915->gt.awake);
> +		i915_queue_hangcheck(i915);
> +	}
> +}
> +
> +static void retire_work_handler(struct work_struct *work)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(work, typeof(*i915), gem.retire_work.work);
> +
> +	/* Come back later if the device is busy... */
> +	if (mutex_trylock(&i915->drm.struct_mutex)) {
> +		i915_retire_requests(i915);
> +		mutex_unlock(&i915->drm.struct_mutex);
> +	}
> +
> +	/*
> +	 * Keep the retire handler running until we are finally idle.
> +	 * We do not need to do this test under locking as in the worst-case
> +	 * we queue the retire worker once too often.
> +	 */
> +	if (READ_ONCE(i915->gt.awake))
> +		queue_delayed_work(i915->wq,
> +				   &i915->gem.retire_work,
> +				   round_jiffies_up_relative(HZ));
> +}
> +
> +void i915_gem_park(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(i915->gt.active_requests);
> +
> +	if (!i915->gt.awake)
> +		return;
> +
> +	/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
> +	mod_delayed_work(i915->wq, &i915->gem.idle_work, msecs_to_jiffies(100));
> +}
> +
> +void i915_gem_unpark(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(!i915->gt.active_requests);
> +	assert_rpm_wakelock_held(i915);
> +
> +	if (i915->gt.awake)
> +		return;
> +
> +	/*
> +	 * It seems that the DMC likes to transition between the DC states a lot
> +	 * when there are no connected displays (no active power domains) during
> +	 * command submission.
> +	 *
> +	 * This activity has negative impact on the performance of the chip with
> +	 * huge latencies observed in the interrupt handler and elsewhere.
> +	 *
> +	 * Work around it by grabbing a GT IRQ power domain whilst there is any
> +	 * GT activity, preventing any DC state transitions.
> +	 */
> +	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +	GEM_BUG_ON(!i915->gt.awake);
> +
> +	i915_globals_unpark();
> +
> +	intel_enable_gt_powersave(i915);
> +	i915_update_gfx_val(i915);
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_busy(i915);
> +	i915_pmu_gt_unparked(i915);
> +
> +	intel_engines_unpark(i915);
> +
> +	i915_queue_hangcheck(i915);
> +
> +	queue_delayed_work(i915->wq,
> +			   &i915->gem.retire_work,
> +			   round_jiffies_up_relative(HZ));
> +}
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915)
> +{
> +	/* Force loading the kernel context on all engines */
> +	if (!switch_to_kernel_context_sync(i915, ALL_ENGINES))
> +		return false;
> +
> +	/*
> +	 * Immediately park the GPU so that we enable powersaving and
> +	 * treat it as idle. The next time we issue a request, we will
> +	 * unpark and start using the engine->pinned_default_state, otherwise
> +	 * it is in limbo and an early reset may fail.
> +	 */
> +	__i915_gem_park(i915);
> +
> +	return true;
> +}
> +
> +void i915_gem_suspend(struct drm_i915_private *i915)
> +{
> +	intel_wakeref_t wakeref;
> +
> +	GEM_TRACE("\n");
> +
> +	wakeref = intel_runtime_pm_get(i915);
> +
> +	flush_workqueue(i915->wq);
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +
> +	/*
> +	 * We have to flush all the executing contexts to main memory so
> +	 * that they can be saved in the hibernation image. To ensure the last
> +	 * context image is coherent, we have to switch away from it. That
> +	 * leaves the i915->kernel_context still active when
> +	 * we actually suspend, and its image in memory may not match the GPU
> +	 * state. Fortunately, the kernel_context is disposable and we do
> +	 * not rely on its state.
> +	 */
> +	switch_to_kernel_context_sync(i915, i915->gt.active_engines);
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	i915_reset_flush(i915);
> +
> +	drain_delayed_work(&i915->gem.retire_work);
> +
> +	/*
> +	 * As the idle_work is rearming if it detects a race, play safe and
> +	 * repeat the flush until it is definitely idle.
> +	 */
> +	drain_delayed_work(&i915->gem.idle_work);
> +
> +	/*
> +	 * Assert that we successfully flushed all the work and
> +	 * reset the GPU back to its idle, low power state.
> +	 */
> +	GEM_BUG_ON(i915->gt.awake);
> +
> +	intel_uc_suspend(i915);
> +
> +	intel_runtime_pm_put(i915, wakeref);
> +}
> +
> +void i915_gem_suspend_late(struct drm_i915_private *i915)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct list_head *phases[] = {
> +		&i915->mm.unbound_list,
> +		&i915->mm.bound_list,
> +		NULL
> +	}, **phase;
> +
> +	/*
> +	 * Neither the BIOS, ourselves, nor any other kernel
> +	 * expects the system to be in execlists mode on startup,
> +	 * so we need to reset the GPU back to legacy mode. And the only
> +	 * known way to disable logical contexts is through a GPU reset.
> +	 *
> +	 * So in order to leave the system in a known default configuration,
> +	 * always reset the GPU upon unload and suspend. Afterwards we then
> +	 * clean up the GEM state tracking, flushing off the requests and
> +	 * leaving the system in a known idle state.
> +	 *
> +	 * Note that it is of the utmost importance that the GPU is idle and
> +	 * all stray writes are flushed *before* we dismantle the backing
> +	 * storage for the pinned objects.
> +	 *
> +	 * However, since we are uncertain that resetting the GPU on older
> +	 * machines is a good idea, we don't - just in case it leaves the
> +	 * machine in an unusable condition.
> +	 */
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	for (phase = phases; *phase; phase++) {
> +		list_for_each_entry(obj, *phase, mm.link)
> +			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
> +	}
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +	intel_uc_sanitize(i915);
> +	i915_gem_sanitize(i915);
> +}
> +
> +void i915_gem_resume(struct drm_i915_private *i915)
> +{
> +	GEM_TRACE("\n");
> +
> +	WARN_ON(i915->gt.awake);
> +
> +	mutex_lock(&i915->drm.struct_mutex);
> +	intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
> +
> +	i915_gem_restore_gtt_mappings(i915);
> +	i915_gem_restore_fences(i915);
> +
> +	/*
> +	 * As we didn't flush the kernel context before suspend, we cannot
> +	 * guarantee that the context image is complete. So let's just reset
> +	 * it and start again.
> +	 */
> +	intel_gt_resume(i915);
> +
> +	if (i915_gem_init_hw(i915))
> +		goto err_wedged;
> +
> +	intel_uc_resume(i915);
> +
> +	/* Always reload a context for powersaving. */
> +	if (!i915_gem_load_power_context(i915))
> +		goto err_wedged;
> +
> +out_unlock:
> +	intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	return;
> +
> +err_wedged:
> +	if (!i915_reset_failed(i915)) {
> +		dev_err(i915->drm.dev,
> +			"Failed to re-initialize GPU, declaring it wedged!\n");
> +		i915_gem_set_wedged(i915);
> +	}
> +	goto out_unlock;
> +}
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915)
> +{
> +	INIT_DELAYED_WORK(&i915->gem.idle_work, idle_work_handler);
> +	INIT_DELAYED_WORK(&i915->gem.retire_work, retire_work_handler);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_gem_pm.h b/drivers/gpu/drm/i915/i915_gem_pm.h
> new file mode 100644
> index 000000000000..52f65e3f06b5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_pm.h
> @@ -0,0 +1,28 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_PM_H__
> +#define __I915_GEM_PM_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_private;
> +struct work_struct;
> +
> +void i915_gem_init__pm(struct drm_i915_private *i915);
> +
> +bool i915_gem_load_power_context(struct drm_i915_private *i915);
> +void i915_gem_resume(struct drm_i915_private *i915);
> +
> +void i915_gem_unpark(struct drm_i915_private *i915);
> +void i915_gem_park(struct drm_i915_private *i915);
> +
> +void i915_gem_idle_work_handler(struct work_struct *work);
> +
> +void i915_gem_suspend(struct drm_i915_private *i915);
> +void i915_gem_suspend_late(struct drm_i915_private *i915);
> +
> +#endif /* __I915_GEM_PM_H__ */
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 6f52ca881173..9d646fa1b74e 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -1658,7 +1658,7 @@ static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
>   		/* XXX Bonus points for proving we are the kernel context! */
>   
>   		mutex_unlock(&i915->drm.struct_mutex);
> -		drain_delayed_work(&i915->gt.idle_work);
> +		drain_delayed_work(&i915->gem.idle_work);
>   		mutex_lock(&i915->drm.struct_mutex);
>   	}
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index 971148fbe6f5..12fc53c694a6 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -514,8 +514,8 @@ static void disable_retire_worker(struct drm_i915_private *i915)
>   	}
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	cancel_delayed_work_sync(&i915->gt.retire_work);
> -	cancel_delayed_work_sync(&i915->gt.idle_work);
> +	cancel_delayed_work_sync(&i915->gem.retire_work);
> +	cancel_delayed_work_sync(&i915->gem.idle_work);
>   }
>   
>   static int igt_mmap_offset_exhaustion(void *arg)
> @@ -617,9 +617,9 @@ static int igt_mmap_offset_exhaustion(void *arg)
>   out_park:
>   	mutex_lock(&i915->drm.struct_mutex);
>   	if (--i915->gt.active_requests)
> -		queue_delayed_work(i915->wq, &i915->gt.retire_work, 0);
> +		queue_delayed_work(i915->wq, &i915->gem.retire_work, 0);
>   	else
> -		queue_delayed_work(i915->wq, &i915->gt.idle_work, 0);
> +		queue_delayed_work(i915->wq, &i915->gem.idle_work, 0);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   	i915_gem_shrinker_register(i915);
>   	return err;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index f444ee5add27..fb677b4019a0 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -59,8 +59,8 @@ static void mock_device_release(struct drm_device *dev)
>   	i915_gem_contexts_lost(i915);
>   	mutex_unlock(&i915->drm.struct_mutex);
>   
> -	drain_delayed_work(&i915->gt.retire_work);
> -	drain_delayed_work(&i915->gt.idle_work);
> +	drain_delayed_work(&i915->gem.retire_work);
> +	drain_delayed_work(&i915->gem.idle_work);
>   	i915_gem_drain_workqueue(i915);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> @@ -111,7 +111,7 @@ static void mock_retire_work_handler(struct work_struct *work)
>   static void mock_idle_work_handler(struct work_struct *work)
>   {
>   	struct drm_i915_private *i915 =
> -		container_of(work, typeof(*i915), gt.idle_work.work);
> +		container_of(work, typeof(*i915), gem.idle_work.work);
>   
>   	i915->gt.active_engines = 0;
>   }
> @@ -197,8 +197,8 @@ struct drm_i915_private *mock_gem_device(void)
>   
>   	mock_init_contexts(i915);
>   
> -	INIT_DELAYED_WORK(&i915->gt.retire_work, mock_retire_work_handler);
> -	INIT_DELAYED_WORK(&i915->gt.idle_work, mock_idle_work_handler);
> +	INIT_DELAYED_WORK(&i915->gem.retire_work, mock_retire_work_handler);
> +	INIT_DELAYED_WORK(&i915->gem.idle_work, mock_idle_work_handler);
>   
>   	i915->gt.awake = true;
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 08/29] drm/i915: Introduce struct intel_wakeref
  2019-04-10  9:49   ` Tvrtko Ursulin
@ 2019-04-10 10:01     ` Chris Wilson
  2019-04-10 10:07       ` Tvrtko Ursulin
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 10:01 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-10 10:49:35)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > For controlling runtime pm of the GT and engines, we would like to have
> > a callback to do extra work the first time we wake up and the last time
> > we drop the wakeref. This first/last access needs serialisation and so
> > we encompass a mutex with the regular intel_wakeref_t tracker.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/Makefile             |  1 +
> >   drivers/gpu/drm/i915/Makefile.header-test |  3 +-
> >   drivers/gpu/drm/i915/i915_drv.h           |  3 +-
> >   drivers/gpu/drm/i915/intel_wakeref.c      | 51 +++++++++++++++
> >   drivers/gpu/drm/i915/intel_wakeref.h      | 77 +++++++++++++++++++++++
> >   5 files changed, 132 insertions(+), 3 deletions(-)
> >   create mode 100644 drivers/gpu/drm/i915/intel_wakeref.c
> >   create mode 100644 drivers/gpu/drm/i915/intel_wakeref.h
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index 40130cf5c003..233bad5e361f 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -50,6 +50,7 @@ i915-y += i915_drv.o \
> >         intel_device_info.o \
> >         intel_pm.o \
> >         intel_runtime_pm.o \
> > +       intel_wakeref.o \
> >         intel_uncore.o
> >   
> >   # core library code
> > diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
> > index 96a5d90629ec..e6b3e7588860 100644
> > --- a/drivers/gpu/drm/i915/Makefile.header-test
> > +++ b/drivers/gpu/drm/i915/Makefile.header-test
> > @@ -31,7 +31,8 @@ header_test := \
> >       intel_psr.h \
> >       intel_sdvo.h \
> >       intel_sprite.h \
> > -     intel_tv.h
> > +     intel_tv.h \
> > +     intel_wakeref.h
> >   
> >   quiet_cmd_header_test = HDRTEST $@
> >         cmd_header_test = echo "\#include \"$(<F)\"" > $@
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index fad5306f07da..62a7e91acd7f 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -74,6 +74,7 @@
> >   #include "intel_opregion.h"
> >   #include "intel_uc.h"
> >   #include "intel_uncore.h"
> > +#include "intel_wakeref.h"
> >   #include "intel_wopcm.h"
> >   
> >   #include "i915_gem.h"
> > @@ -134,8 +135,6 @@ bool i915_error_injected(void);
> >       __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
> >                     fmt, ##__VA_ARGS__)
> >   
> > -typedef depot_stack_handle_t intel_wakeref_t;
> > -
> >   enum hpd_pin {
> >       HPD_NONE = 0,
> >       HPD_TV = HPD_NONE,     /* TV is known to be unreliable */
> > diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
> > new file mode 100644
> > index 000000000000..c133c91cf277
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> > @@ -0,0 +1,51 @@
> > +/*
> > + * SPDX-License-Identifier: MIT
> > + *
> > + * Copyright © 2019 Intel Corporation
> > + */
> > +
> > +#include "intel_drv.h"
> > +#include "intel_wakeref.h"
> > +
> > +void __intel_wakeref_get_once(struct drm_i915_private *i915,
> > +                           struct intel_wakeref_count *wc,
> > +                           bool (*fn)(struct intel_wakeref_count *wc))
> > +{
> > +     /*
> > +      * Treat get/put as different subclasses, as we may need to run
> > +      * the put callback from under the shrinker and do not want to
> > +      * cross-contaminate that callback with any extra work performed
> > +      * upon acquiring the wakeref.
> > +      */
> > +     mutex_lock_nested(&wc->mutex, SINGLE_DEPTH_NESTING);
> > +     if (!atomic_read(&wc->count)) {
> > +             wc->wakeref = intel_runtime_pm_get(i915);
> > +             if (unlikely(!fn(wc))) {
> > +                     intel_runtime_pm_put(i915, wc->wakeref);
> > +                     mutex_unlock(&wc->mutex);
> > +                     return;
> 
> Feels like a bad API that the callback's return error gets ignored. I mean
> from the caller's perspective - how do they know if the wakeref get
> succeeded or not?

Didn't think the get-failing case through, as I just added it to try and
make it look similar to the put case (which is the one being used to
postpone the wakeref release).

Propagating seems reasonable. For put, we'll just ignore the failure as we
try again later, eventually wedging the device as a get-out-of-jail. For a
failing get, we should abort the operation in the caller and return an
error to the user.
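
Something like this (untested sketch, assuming we switch to an
int-returning callback):

int __intel_wakeref_get_once(struct drm_i915_private *i915,
			     struct intel_wakeref_count *wc,
			     int (*fn)(struct intel_wakeref_count *wc))
{
	int err = 0;

	mutex_lock_nested(&wc->mutex, SINGLE_DEPTH_NESTING);
	if (!atomic_read(&wc->count)) {
		wc->wakeref = intel_runtime_pm_get(i915);
		err = fn(wc);
		if (unlikely(err)) {
			/* Unwind the wakeref and report back to the caller */
			intel_runtime_pm_put(i915, wc->wakeref);
			goto unlock;
		}
	}
	atomic_inc(&wc->count); /* count bump assumed, not shown in the hunk */
unlock:
	mutex_unlock(&wc->mutex);
	return err;
}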

> > +static inline void
> > +intel_wakeref_get_once(struct drm_i915_private *i915,
> > +                    struct intel_wakeref_count *wc,
> > +                    bool (*fn)(struct intel_wakeref_count *wc))
> 
> What does the "once" suffix refer to here? That it will only call the 
> callback once? Is that needed in the function name and couldn't be 
> documented in kerneldoc, given how there is no non-suffixed version?

I was thinking of 'works like pthread_once()' in that we only
acquire/release the wakeref once. Yeah, that may be overthinking it and
makes it look clumsy. We certainly don't do 'pin_once', and just accept
that the first/last pinning are the heavy ones. The only difference here
is that we are wrapping a callback.
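
Perhaps just (sketch of the non-suffixed name; the atomic_inc_not_zero
fast path is assumed here, it is not part of the hunk above):

static inline void
intel_wakeref_get(struct drm_i915_private *i915,
		  struct intel_wakeref_count *wc,
		  bool (*fn)(struct intel_wakeref_count *wc))
{
	if (atomic_inc_not_zero(&wc->count))
		return;

	__intel_wakeref_get_once(i915, wc, fn);
}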
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit()
  2019-04-08  9:17 ` [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit() Chris Wilson
@ 2019-04-10 10:05   ` Tvrtko Ursulin
  2019-04-10 10:13     ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 10:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> We wish to start segregating the power management into different control
> domains, both with respect to the hardware and the user interface. The
> first step is that at the lowest level flow of requests, we want to
> process a context event (and not a global GEM operation). In this patch,
> we introduce the context callbacks that in future patches will be
> redirected to per-engine interfaces leading to global operations as
> required.
> 
> The intent is that this will be guarded by the timeline->mutex, except
> that retiring has not quite finished transitioning over from being
> guarded by struct_mutex. So at the moment it is protected by
> struct_mutex with a reminder to switch.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c       | 17 ++++++++++++++
>   drivers/gpu/drm/i915/gt/intel_context.h       | 20 +++++++++++++++++
>   drivers/gpu/drm/i915/gt/intel_context_types.h |  5 +++++
>   drivers/gpu/drm/i915/gt/intel_lrc.c           |  3 +++
>   drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |  3 +++
>   drivers/gpu/drm/i915/gt/mock_engine.c         |  3 +++
>   drivers/gpu/drm/i915/i915_request.c           | 22 ++++---------------
>   7 files changed, 55 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index ebd1e5919a4a..bf16f00a10a4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -266,3 +266,20 @@ int __init i915_global_context_init(void)
>   	i915_global_register(&global.base);
>   	return 0;
>   }
> +
> +void __intel_context_enter(struct intel_context *ce)
> +{
> +	struct drm_i915_private *i915 = ce->gem_context->i915;
> +
> +	if (!i915->gt.active_requests++)
> +		i915_gem_unpark(i915);
> +}
> +
> +void __intel_context_exit(struct intel_context *ce)
> +{
> +	struct drm_i915_private *i915 = ce->gem_context->i915;
> +
> +	GEM_BUG_ON(!i915->gt.active_requests);
> +	if (!--i915->gt.active_requests)
> +		i915_gem_park(i915);
> +}

In our normal nomenclature __intel_context_enter would be expected to be 
directly called by intel_context_enter on the 0 -> 1 refcnt transition. 
Here they are in fact common helpers, so one level of indirection, via a 
different layer. I found this a bit confusing at first. My usual remedy 
here is to prescribe a better name. __intel_gt_context_enter/exit? To 
designate the glue between backend and GT.

> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index ebc861b1a49e..95b1fdc5826a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -73,6 +73,26 @@ static inline void __intel_context_pin(struct intel_context *ce)
>   
>   void intel_context_unpin(struct intel_context *ce);
>   
> +void __intel_context_enter(struct intel_context *ce);
> +void __intel_context_exit(struct intel_context *ce);
> +
> +static inline void intel_context_enter(struct intel_context *ce)
> +{
> +	if (!ce->active_count++)
> +		ce->ops->enter(ce);
> +}
> +
> +static inline void intel_context_mark_active(struct intel_context *ce)
> +{

GEM_BUG_ON(ce->active_count == 0) ?

And some lockdep_assert_held in all three?
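
Something like this sketch, assuming timeline->mutex is the guard (the
field comment later in the patch says "notionally protected by
timeline->mutex"):

static inline void intel_context_mark_active(struct intel_context *ce)
{
	lockdep_assert_held(&ce->ring->timeline->mutex);
	GEM_BUG_ON(!ce->active_count);
	++ce->active_count;
}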

> +	++ce->active_count;
> +}
> +
> +static inline void intel_context_exit(struct intel_context *ce)
> +{
> +	if (!--ce->active_count)
> +		ce->ops->exit(ce);
> +}
> +
>   static inline struct intel_context *intel_context_get(struct intel_context *ce)
>   {
>   	kref_get(&ce->ref);
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index 9ec4f787c908..f02d27734e3b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -25,6 +25,9 @@ struct intel_context_ops {
>   	int (*pin)(struct intel_context *ce);
>   	void (*unpin)(struct intel_context *ce);
>   
> +	void (*enter)(struct intel_context *ce);
> +	void (*exit)(struct intel_context *ce);
> +
>   	void (*reset)(struct intel_context *ce);
>   	void (*destroy)(struct kref *kref);
>   };
> @@ -46,6 +49,8 @@ struct intel_context {
>   	u32 *lrc_reg_state;
>   	u64 lrc_desc;
>   
> +	unsigned int active_count; /* notionally protected by timeline->mutex */
> +
>   	atomic_t pin_count;
>   	struct mutex pin_mutex; /* guards pinning and associated on-gpuing */
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 1565b11b15f6..7da5e49dd385 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1404,6 +1404,9 @@ static const struct intel_context_ops execlists_context_ops = {
>   	.pin = execlists_context_pin,
>   	.unpin = execlists_context_unpin,
>   
> +	.enter = __intel_context_enter,
> +	.exit = __intel_context_exit,
> +
>   	.reset = execlists_context_reset,
>   	.destroy = execlists_context_destroy,
>   };
> diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> index bcc842b8bbe7..5788e5ab818f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> @@ -1516,6 +1516,9 @@ static const struct intel_context_ops ring_context_ops = {
>   	.pin = ring_context_pin,
>   	.unpin = ring_context_unpin,
>   
> +	.enter = __intel_context_enter,
> +	.exit = __intel_context_exit,
> +
>   	.reset = ring_context_reset,
>   	.destroy = ring_context_destroy,
>   };
> diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
> index 414afd2f27fe..2e84ffb46489 100644
> --- a/drivers/gpu/drm/i915/gt/mock_engine.c
> +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
> @@ -157,6 +157,9 @@ static const struct intel_context_ops mock_context_ops = {
>   	.pin = mock_context_pin,
>   	.unpin = mock_context_unpin,
>   
> +	.enter = __intel_context_enter,
> +	.exit = __intel_context_exit,
> +
>   	.destroy = mock_context_destroy,
>   };
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index f2b9e0e1910b..e72e4087bbfe 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -130,19 +130,6 @@ i915_request_remove_from_client(struct i915_request *request)
>   	spin_unlock(&file_priv->mm.lock);
>   }
>   
> -static void reserve_gt(struct drm_i915_private *i915)
> -{
> -	if (!i915->gt.active_requests++)
> -		i915_gem_unpark(i915);
> -}
> -
> -static void unreserve_gt(struct drm_i915_private *i915)
> -{
> -	GEM_BUG_ON(!i915->gt.active_requests);
> -	if (!--i915->gt.active_requests)
> -		i915_gem_park(i915);
> -}
> -
>   static void advance_ring(struct i915_request *request)
>   {
>   	struct intel_ring *ring = request->ring;
> @@ -300,11 +287,10 @@ static void i915_request_retire(struct i915_request *request)
>   
>   	i915_request_remove_from_client(request);
>   
> -	intel_context_unpin(request->hw_context);
> -
>   	__retire_engine_upto(request->engine, request);
>   
> -	unreserve_gt(request->i915);
> +	intel_context_unpin(request->hw_context);
> +	intel_context_exit(request->hw_context);
>   
>   	i915_sched_node_fini(&request->sched);
>   	i915_request_put(request);
> @@ -628,8 +614,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	if (IS_ERR(ce))
>   		return ERR_CAST(ce);
>   
> -	reserve_gt(i915);
>   	mutex_lock(&ce->ring->timeline->mutex);
> +	intel_context_enter(ce);
>   
>   	/* Move our oldest request to the slab-cache (if not in use!) */
>   	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
> @@ -759,8 +745,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   err_free:
>   	kmem_cache_free(global.slab_requests, rq);
>   err_unreserve:
> +	intel_context_exit(ce);
>   	mutex_unlock(&ce->ring->timeline->mutex);
> -	unreserve_gt(i915);
>   	intel_context_unpin(ce);
>   	return ERR_PTR(ret);
>   }
> 

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 08/29] drm/i915: Introduce struct intel_wakeref
  2019-04-10 10:01     ` Chris Wilson
@ 2019-04-10 10:07       ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 10:07 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 10/04/2019 11:01, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-10 10:49:35)
>>
>> On 08/04/2019 10:17, Chris Wilson wrote:
>>> For controlling runtime pm of the GT and engines, we would like to have
>>> a callback to do extra work the first time we wake up and the last time
>>> we drop the wakeref. This first/last access needs serialisation and so
>>> we encompass a mutex with the regular intel_wakeref_t tracker.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/Makefile             |  1 +
>>>    drivers/gpu/drm/i915/Makefile.header-test |  3 +-
>>>    drivers/gpu/drm/i915/i915_drv.h           |  3 +-
>>>    drivers/gpu/drm/i915/intel_wakeref.c      | 51 +++++++++++++++
>>>    drivers/gpu/drm/i915/intel_wakeref.h      | 77 +++++++++++++++++++++++
>>>    5 files changed, 132 insertions(+), 3 deletions(-)
>>>    create mode 100644 drivers/gpu/drm/i915/intel_wakeref.c
>>>    create mode 100644 drivers/gpu/drm/i915/intel_wakeref.h
>>>
>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>>> index 40130cf5c003..233bad5e361f 100644
>>> --- a/drivers/gpu/drm/i915/Makefile
>>> +++ b/drivers/gpu/drm/i915/Makefile
>>> @@ -50,6 +50,7 @@ i915-y += i915_drv.o \
>>>          intel_device_info.o \
>>>          intel_pm.o \
>>>          intel_runtime_pm.o \
>>> +       intel_wakeref.o \
>>>          intel_uncore.o
>>>    
>>>    # core library code
>>> diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
>>> index 96a5d90629ec..e6b3e7588860 100644
>>> --- a/drivers/gpu/drm/i915/Makefile.header-test
>>> +++ b/drivers/gpu/drm/i915/Makefile.header-test
>>> @@ -31,7 +31,8 @@ header_test := \
>>>        intel_psr.h \
>>>        intel_sdvo.h \
>>>        intel_sprite.h \
>>> -     intel_tv.h
>>> +     intel_tv.h \
>>> +     intel_wakeref.h
>>>    
>>>    quiet_cmd_header_test = HDRTEST $@
>>>          cmd_header_test = echo "\#include \"$(<F)\"" > $@
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index fad5306f07da..62a7e91acd7f 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -74,6 +74,7 @@
>>>    #include "intel_opregion.h"
>>>    #include "intel_uc.h"
>>>    #include "intel_uncore.h"
>>> +#include "intel_wakeref.h"
>>>    #include "intel_wopcm.h"
>>>    
>>>    #include "i915_gem.h"
>>> @@ -134,8 +135,6 @@ bool i915_error_injected(void);
>>>        __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
>>>                      fmt, ##__VA_ARGS__)
>>>    
>>> -typedef depot_stack_handle_t intel_wakeref_t;
>>> -
>>>    enum hpd_pin {
>>>        HPD_NONE = 0,
>>>        HPD_TV = HPD_NONE,     /* TV is known to be unreliable */
>>> diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
>>> new file mode 100644
>>> index 000000000000..c133c91cf277
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/i915/intel_wakeref.c
>>> @@ -0,0 +1,51 @@
>>> +/*
>>> + * SPDX-License-Identifier: MIT
>>> + *
>>> + * Copyright © 2019 Intel Corporation
>>> + */
>>> +
>>> +#include "intel_drv.h"
>>> +#include "intel_wakeref.h"
>>> +
>>> +void __intel_wakeref_get_once(struct drm_i915_private *i915,
>>> +                           struct intel_wakeref_count *wc,
>>> +                           bool (*fn)(struct intel_wakeref_count *wc))
>>> +{
>>> +     /*
>>> +      * Treat get/put as different subclasses, as we may need to run
>>> +      * the put callback from under the shrinker and do not want to
>>> +      * cross-contaminate that callback with any extra work performed
>>> +      * upon acquiring the wakeref.
>>> +      */
>>> +     mutex_lock_nested(&wc->mutex, SINGLE_DEPTH_NESTING);
>>> +     if (!atomic_read(&wc->count)) {
>>> +             wc->wakeref = intel_runtime_pm_get(i915);
>>> +             if (unlikely(!fn(wc))) {
>>> +                     intel_runtime_pm_put(i915, wc->wakeref);
>>> +                     mutex_unlock(&wc->mutex);
>>> +                     return;
>>
>> Feels like a bad API that the callback's return error gets ignored. I mean
>> from the caller's perspective - how do they know if the wakeref get
>> succeeded or not?
> 
> Didn't think the get-failing case through, as I just added it to try and
> make it look similar to the put case (which is the one being used to
> postpone the wakeref release).
> 
> Propagating seems reasonable. For put, we'll just ignore the failure as we
> try again later, eventually wedging the device as a get-out-of-jail. For a
> failing get, we should abort the operation in the caller and return an
> error to the user.

If you need the ability to fail, sure. I just did not see a failing case
in the one later patch which uses it, so if there aren't any others, maybe
even make the callback have no return value and simplify the callers.
Either works for me.
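
i.e. something like (untested sketch, failure path dropped):

void __intel_wakeref_get_once(struct drm_i915_private *i915,
			      struct intel_wakeref_count *wc,
			      void (*fn)(struct intel_wakeref_count *wc))
{
	mutex_lock_nested(&wc->mutex, SINGLE_DEPTH_NESTING);
	if (!atomic_read(&wc->count)) {
		wc->wakeref = intel_runtime_pm_get(i915);
		fn(wc); /* cannot fail, so no unwind needed */
	}
	atomic_inc(&wc->count); /* count bump assumed, as in the get sketch */
	mutex_unlock(&wc->mutex);
}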

Regards,

Tvrtko

>>> +static inline void
>>> +intel_wakeref_get_once(struct drm_i915_private *i915,
>>> +                    struct intel_wakeref_count *wc,
>>> +                    bool (*fn)(struct intel_wakeref_count *wc))
>>
>> What does the "once" suffix refer to here? That it will only call the
>> callback once? Is that needed in the function name and couldn't be
>> documented in kerneldoc, given how there is no non-suffixed version?
> 
> I was thinking of 'works like pthread_once()' in that we only
> acquire/release the wakeref once. Yeah, that may be overthinking it and
> makes it look clumsy. We certainly don't do 'pin_once', and just accept
> that the first/last pinning are the heavy ones. The only difference here
> is that we are wrapping a callback.
> -Chris
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit()
  2019-04-10 10:05   ` Tvrtko Ursulin
@ 2019-04-10 10:13     ` Chris Wilson
  2019-04-10 11:06       ` Tvrtko Ursulin
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 10:13 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-10 11:05:13)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > +void __intel_context_enter(struct intel_context *ce)
> > +{
> > +     struct drm_i915_private *i915 = ce->gem_context->i915;
> > +
> > +     if (!i915->gt.active_requests++)
> > +             i915_gem_unpark(i915);
> > +}
> > +
> > +void __intel_context_exit(struct intel_context *ce)
> > +{
> > +     struct drm_i915_private *i915 = ce->gem_context->i915;
> > +
> > +     GEM_BUG_ON(!i915->gt.active_requests);
> > +     if (!--i915->gt.active_requests)
> > +             i915_gem_park(i915);
> > +}
> 
> In our normal nomenclature __intel_context_enter would be expected to be 
> directly called by intel_context_enter on the 0 -> 1 refcnt transition. 
> Here they are in fact common helpers, so one level of indirection, via a 
> different layer. I found this a bit confusing at first. My usual remedy 
> here is to prescribe a better name. __intel_gt_context_enter/exit? To 
> designate the glue between backend and GT.

The use of gt/gem here is a placeholder. Here we will be iterating over
the engines (so just the one for the usual contexts, and num_siblings for
virtual).

intel_context_enter_engine() ?

And virtual_context_enter remains distinct.
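
i.e. (sketch of the rename, body as in this patch):

void intel_context_enter_engine(struct intel_context *ce)
{
	struct drm_i915_private *i915 = ce->gem_context->i915;

	if (!i915->gt.active_requests++)
		i915_gem_unpark(i915);
}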

> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > index ebc861b1a49e..95b1fdc5826a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > @@ -73,6 +73,26 @@ static inline void __intel_context_pin(struct intel_context *ce)
> >   
> >   void intel_context_unpin(struct intel_context *ce);
> >   
> > +void __intel_context_enter(struct intel_context *ce);
> > +void __intel_context_exit(struct intel_context *ce);
> > +
> > +static inline void intel_context_enter(struct intel_context *ce)
> > +{
> > +     if (!ce->active_count++)
> > +             ce->ops->enter(ce);
> > +}
> > +
> > +static inline void intel_context_mark_active(struct intel_context *ce)
> > +{
> 
> GEM_BUG_ON(ce->active_count == 0) ?

We're no longer in Kansas. I was wondering if I could start GT_BUG_ON
and include "intel_gt.h" with a goal of doing a search and replace within
gt/. That gives us some separation from GEM, and maybe we could even
start filtering based on need.
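
(If it happens, it could bootstrap as a plain alias in a hypothetical
intel_gt.h, sketch:

#define GT_BUG_ON(expr) GEM_BUG_ON(expr)

and gt/ can then diverge or filter as needed.)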
 
> And some lockdep_assert_held in all three?

Read on :(

The plan is for intel_context_enter/_exit to be under the
timeline->mutex, but that isn't realised for about another 30 patches.

mark_active has special protection because it gets used from the
serialised portion of intel_engine_park.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 11/29] drm/i915: Pass intel_context to i915_request_create()
  2019-04-08  9:17 ` [PATCH 11/29] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
@ 2019-04-10 10:38   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 10:38 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Start acquiring the logical intel_context and using that as our primary
> means for request allocation. This is the initial step to allow us to
> avoid requiring struct_mutex for request allocation along the
> perma-pinned kernel context, but it also provides a foundation for
> breaking up the complex request allocation to handle different scenarios
> inside execbuf.
> 
> For the purpose of emitting a request from inside retirement (see the
> next patch for engine power management), we also need to lift control
> over the timeline mutex to the caller.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.h       |  12 +
>   drivers/gpu/drm/i915/gt/intel_reset.c         |   2 +-
>   drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |   3 -
>   drivers/gpu/drm/i915/i915_gem_context.c       |   4 +-
>   drivers/gpu/drm/i915/i915_perf.c              |   2 +-
>   drivers/gpu/drm/i915/i915_request.c           | 245 ++++++++++--------
>   drivers/gpu/drm/i915/i915_request.h           |   7 +
>   drivers/gpu/drm/i915/intel_overlay.c          |   5 +-
>   drivers/gpu/drm/i915/selftests/i915_active.c  |   2 +-
>   .../drm/i915/selftests/i915_gem_coherency.c   |   2 +-
>   .../gpu/drm/i915/selftests/i915_gem_object.c  |   2 +-
>   drivers/gpu/drm/i915/selftests/i915_request.c |   9 +-
>   .../gpu/drm/i915/selftests/i915_timeline.c    |   4 +-
>   13 files changed, 175 insertions(+), 124 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 95b1fdc5826a..9aeef88176b9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -104,4 +104,16 @@ static inline void intel_context_put(struct intel_context *ce)
>   	kref_put(&ce->ref, ce->ops->destroy);
>   }
>   
> +static inline void intel_context_timeline_lock(struct intel_context *ce)
> +	__acquires(&ce->ring->timeline->mutex)
> +{
> +	mutex_lock(&ce->ring->timeline->mutex);
> +}
> +
> +static inline void intel_context_timeline_unlock(struct intel_context *ce)
> +	__releases(&ce->ring->timeline->mutex)
> +{
> +	mutex_unlock(&ce->ring->timeline->mutex);
> +}
> +
>   #endif /* __INTEL_CONTEXT_H__ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index dce49b8b1118..a5501b136f0c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -770,7 +770,7 @@ static void restart_work(struct work_struct *work)
>   		if (!intel_engine_is_idle(engine))
>   			continue;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (!IS_ERR(rq))
>   			i915_request_add(rq);
>   	}
> diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> index 5788e5ab818f..63bf092d737b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> @@ -1778,7 +1778,6 @@ static int switch_context(struct i915_request *rq)
>   	u32 hw_flags = 0;
>   	int ret, i;
>   
> -	lockdep_assert_held(&rq->i915->drm.struct_mutex);
>   	GEM_BUG_ON(HAS_EXECLISTS(rq->i915));
>   
>   	if (ppgtt) {
> @@ -1908,8 +1907,6 @@ static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes)
>   	struct i915_request *target;
>   	long timeout;
>   
> -	lockdep_assert_held(&ring->vma->vm->i915->drm.struct_mutex);
> -
>   	if (intel_ring_update_space(ring) >= bytes)
>   		return 0;
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 170500d49049..de3404479bc3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -942,7 +942,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
>   		struct intel_ring *ring;
>   		struct i915_request *rq;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(rq))
>   			return PTR_ERR(rq);
>   
> @@ -1188,7 +1188,7 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   	/* Submitting requests etc needs the hw awake. */
>   	wakeref = intel_runtime_pm_get(i915);
>   
> -	rq = i915_request_alloc(ce->engine, i915->kernel_context);
> +	rq = i915_request_create(ce->engine->kernel_context);
>   	if (IS_ERR(rq)) {
>   		ret = PTR_ERR(rq);
>   		goto out_put;
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index a87f790335c1..328a740e72cb 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -1762,7 +1762,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   	 * Apply the configuration by doing one context restore of the edited
>   	 * context image.
>   	 */
> -	rq = i915_request_alloc(engine, dev_priv->kernel_context);
> +	rq = i915_request_create(engine->kernel_context);
>   	if (IS_ERR(rq))
>   		return PTR_ERR(rq);
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index e72e4087bbfe..ca4c200ee7ec 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -550,7 +550,7 @@ static void ring_retire_requests(struct intel_ring *ring)
>   }
>   
>   static noinline struct i915_request *
> -i915_request_alloc_slow(struct intel_context *ce)
> +request_alloc_slow(struct intel_context *ce, gfp_t gfp)
>   {
>   	struct intel_ring *ring = ce->ring;
>   	struct i915_request *rq;
> @@ -558,6 +558,9 @@ i915_request_alloc_slow(struct intel_context *ce)
>   	if (list_empty(&ring->request_list))
>   		goto out;
>   
> +	if (!gfpflags_allow_blocking(gfp))
> +		goto out;
> +
>   	/* Ratelimit ourselves to prevent oom from malicious clients */
>   	rq = list_last_entry(&ring->request_list, typeof(*rq), ring_link);
>   	cond_synchronize_rcu(rq->rcustate);
> @@ -566,62 +569,21 @@ i915_request_alloc_slow(struct intel_context *ce)
>   	ring_retire_requests(ring);
>   
>   out:
> -	return kmem_cache_alloc(global.slab_requests, GFP_KERNEL);
> +	return kmem_cache_alloc(global.slab_requests, gfp);
>   }
>   
> -/**
> - * i915_request_alloc - allocate a request structure
> - *
> - * @engine: engine that we wish to issue the request on.
> - * @ctx: context that the request will be associated with.
> - *
> - * Returns a pointer to the allocated request if successful,
> - * or an error code if not.
> - */
>   struct i915_request *
> -i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> +__i915_request_create(struct intel_context *ce, gfp_t gfp)
>   {
> -	struct drm_i915_private *i915 = engine->i915;
> -	struct intel_context *ce;
> -	struct i915_timeline *tl;
> +	struct i915_timeline *tl = ce->ring->timeline;
>   	struct i915_request *rq;
>   	u32 seqno;
>   	int ret;
>   
> -	lockdep_assert_held(&i915->drm.struct_mutex);
> -
> -	/*
> -	 * Preempt contexts are reserved for exclusive use to inject a
> -	 * preemption context switch. They are never to be used for any trivial
> -	 * request!
> -	 */
> -	GEM_BUG_ON(ctx == i915->preempt_context);
> -
> -	/*
> -	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> -	 * EIO if the GPU is already wedged.
> -	 */
> -	ret = i915_terminally_wedged(i915);
> -	if (ret)
> -		return ERR_PTR(ret);
> -
> -	/*
> -	 * Pinning the contexts may generate requests in order to acquire
> -	 * GGTT space, so do this first before we reserve a seqno for
> -	 * ourselves.
> -	 */
> -	ce = intel_context_pin(ctx, engine);
> -	if (IS_ERR(ce))
> -		return ERR_CAST(ce);
> -
> -	mutex_lock(&ce->ring->timeline->mutex);
> -	intel_context_enter(ce);
> +	might_sleep_if(gfpflags_allow_blocking(gfp));
>   
> -	/* Move our oldest request to the slab-cache (if not in use!) */
> -	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
> -	if (!list_is_last(&rq->ring_link, &ce->ring->request_list) &&
> -	    i915_request_completed(rq))
> -		i915_request_retire(rq);
> +	/* Check that the caller provided an already pinned context */
> +	__intel_context_pin(ce);
>   
>   	/*
>   	 * Beware: Dragons be flying overhead.
> @@ -653,30 +615,26 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 * Do not use kmem_cache_zalloc() here!
>   	 */
>   	rq = kmem_cache_alloc(global.slab_requests,
> -			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
> +			      gfp | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
>   	if (unlikely(!rq)) {
> -		rq = i915_request_alloc_slow(ce);
> +		rq = request_alloc_slow(ce, gfp);
>   		if (!rq) {
>   			ret = -ENOMEM;
>   			goto err_unreserve;
>   		}
>   	}
>   
> -	INIT_LIST_HEAD(&rq->active_list);
> -	INIT_LIST_HEAD(&rq->execute_cb);
> -
> -	tl = ce->ring->timeline;
>   	ret = i915_timeline_get_seqno(tl, rq, &seqno);
>   	if (ret)
>   		goto err_free;
>   
> -	rq->i915 = i915;
> -	rq->engine = engine;
> -	rq->gem_context = ctx;
> +	rq->i915 = ce->engine->i915;
>   	rq->hw_context = ce;
> +	rq->gem_context = ce->gem_context;
> +	rq->engine = ce->engine;
>   	rq->ring = ce->ring;
>   	rq->timeline = tl;
> -	GEM_BUG_ON(rq->timeline == &engine->timeline);
> +	GEM_BUG_ON(rq->timeline == &ce->engine->timeline);
>   	rq->hwsp_seqno = tl->hwsp_seqno;
>   	rq->hwsp_cacheline = tl->hwsp_cacheline;
>   	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
> @@ -696,6 +654,9 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	rq->capture_list = NULL;
>   	rq->waitboost = false;
>   
> +	INIT_LIST_HEAD(&rq->active_list);
> +	INIT_LIST_HEAD(&rq->execute_cb);
> +
>   	/*
>   	 * Reserve space in the ring buffer for all the commands required to
>   	 * eventually emit this request. This is to guarantee that the
> @@ -708,7 +669,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 * around inside i915_request_add() there is sufficient space at
>   	 * the beginning of the ring as well.
>   	 */
> -	rq->reserved_space = 2 * engine->emit_fini_breadcrumb_dw * sizeof(u32);
> +	rq->reserved_space =
> +		2 * rq->engine->emit_fini_breadcrumb_dw * sizeof(u32);
>   
>   	/*
>   	 * Record the position of the start of the request so that
> @@ -718,20 +680,16 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 */
>   	rq->head = rq->ring->emit;
>   
> -	ret = engine->request_alloc(rq);
> +	ret = rq->engine->request_alloc(rq);
>   	if (ret)
>   		goto err_unwind;
>   
> -	/* Keep a second pin for the dual retirement along engine and ring */
> -	__intel_context_pin(ce);
> -
>   	rq->infix = rq->ring->emit; /* end of header; start of user payload */
>   
> -	/* Check that we didn't interrupt ourselves with a new request */
> -	lockdep_assert_held(&rq->timeline->mutex);
> -	GEM_BUG_ON(rq->timeline->seqno != rq->fence.seqno);
> -	rq->cookie = lockdep_pin_lock(&rq->timeline->mutex);
> +	/* Keep a second pin for the dual retirement along engine and ring */
> +	__intel_context_pin(ce);
>   
> +	intel_context_mark_active(ce);
>   	return rq;
>   
>   err_unwind:
> @@ -745,12 +703,86 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   err_free:
>   	kmem_cache_free(global.slab_requests, rq);
>   err_unreserve:
> -	intel_context_exit(ce);
> -	mutex_unlock(&ce->ring->timeline->mutex);
>   	intel_context_unpin(ce);
>   	return ERR_PTR(ret);
>   }
>   
> +struct i915_request *
> +i915_request_create(struct intel_context *ce)
> +{
> +	struct i915_request *rq;
> +
> +	intel_context_timeline_lock(ce);
> +
> +	/* Move our oldest request to the slab-cache (if not in use!) */
> +	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
> +	if (!list_is_last(&rq->ring_link, &ce->ring->request_list) &&
> +	    i915_request_completed(rq))
> +		i915_request_retire(rq);
> +
> +	intel_context_enter(ce);
> +	rq = __i915_request_create(ce, GFP_KERNEL);
> +	intel_context_exit(ce);

I think a comment here, saying __i915_request_create now holds its own
reference to the context so we can drop the local one, would be helpful.
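
Something like (sketch of the suggested comment):

	intel_context_enter(ce);
	rq = __i915_request_create(ce, GFP_KERNEL);
	/* __i915_request_create has taken its own active reference */
	intel_context_exit(ce);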

> +	if (IS_ERR(rq))
> +		goto err_unlock;
> +
> +	/* Check that we do not interrupt ourselves with a new request */
> +	rq->cookie = lockdep_pin_lock(&ce->ring->timeline->mutex);
> +
> +	return rq;
> +
> +err_unlock:
> +	intel_context_timeline_unlock(ce);
> +	return rq;
> +}
> +
> +/**
> + * i915_request_alloc - allocate a request structure
> + *
> + * @engine: engine that we wish to issue the request on.
> + * @ctx: context that the request will be associated with.
> + *
> + * Returns a pointer to the allocated request if successful,
> + * or an error code if not.
> + */
> +struct i915_request *
> +i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> +{
> +	struct drm_i915_private *i915 = engine->i915;
> +	struct intel_context *ce;
> +	struct i915_request *rq;
> +	int ret;
> +
> +	/*
> +	 * Preempt contexts are reserved for exclusive use to inject a
> +	 * preemption context switch. They are never to be used for any trivial
> +	 * request!
> +	 */
> +	GEM_BUG_ON(ctx == i915->preempt_context);
> +
> +	/*
> +	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> +	 * EIO if the GPU is already wedged.
> +	 */
> +	ret = i915_terminally_wedged(i915);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	/*
> +	 * Pinning the contexts may generate requests in order to acquire
> +	 * GGTT space, so do this first before we reserve a seqno for
> +	 * ourselves.
> +	 */
> +	ce = intel_context_pin(ctx, engine);
> +	if (IS_ERR(ce))
> +		return ERR_CAST(ce);
> +
> +	rq = i915_request_create(ce);
> +	intel_context_unpin(ce);
> +
> +	return rq;
> +}
> +
>   static int
>   emit_semaphore_wait(struct i915_request *to,
>   		    struct i915_request *from,
> @@ -1005,8 +1037,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
>   	 * precludes optimising to use semaphores serialisation of a single
>   	 * timeline across engines.
>   	 */
> -	prev = i915_active_request_raw(&timeline->last_request,
> -				       &rq->i915->drm.struct_mutex);
> +	prev = rcu_dereference_protected(timeline->last_request.request, 1);
>   	if (prev && !i915_request_completed(prev)) {
>   		if (is_power_of_2(prev->engine->mask | rq->engine->mask))
>   			i915_sw_fence_await_sw_fence(&rq->submit,
> @@ -1027,6 +1058,11 @@ __i915_request_add_to_timeline(struct i915_request *rq)
>   	list_add_tail(&rq->link, &timeline->requests);
>   	spin_unlock_irq(&timeline->lock);
>   
> +	/*
> +	 * Make sure that no request gazumped us - if it was allocated after
> +	 * our i915_request_alloc() and called __i915_request_add() before
> +	 * us, the timeline will hold its seqno which is later than ours.
> +	 */
>   	GEM_BUG_ON(timeline->seqno != rq->fence.seqno);
>   	__i915_active_request_set(&timeline->last_request, rq);
>   
> @@ -1038,36 +1074,23 @@ __i915_request_add_to_timeline(struct i915_request *rq)
>    * request is not being tracked for completion but the work itself is
>    * going to happen on the hardware. This would be a Bad Thing(tm).
>    */
> -void i915_request_add(struct i915_request *request)
> +struct i915_request *__i915_request_commit(struct i915_request *rq)
>   {
> -	struct intel_engine_cs *engine = request->engine;
> -	struct i915_timeline *timeline = request->timeline;
> -	struct intel_ring *ring = request->ring;
> +	struct intel_engine_cs *engine = rq->engine;
> +	struct intel_ring *ring = rq->ring;
>   	struct i915_request *prev;
>   	u32 *cs;
>   
>   	GEM_TRACE("%s fence %llx:%lld\n",
> -		  engine->name, request->fence.context, request->fence.seqno);
> -
> -	lockdep_assert_held(&request->timeline->mutex);
> -	lockdep_unpin_lock(&request->timeline->mutex, request->cookie);
> -
> -	trace_i915_request_add(request);
> -
> -	/*
> -	 * Make sure that no request gazumped us - if it was allocated after
> -	 * our i915_request_alloc() and called __i915_request_add() before
> -	 * us, the timeline will hold its seqno which is later than ours.
> -	 */
> -	GEM_BUG_ON(timeline->seqno != request->fence.seqno);
> +		  engine->name, rq->fence.context, rq->fence.seqno);
>   
>   	/*
>   	 * To ensure that this call will not fail, space for its emissions
>   	 * should already have been reserved in the ring buffer. Let the ring
>   	 * know that it is time to use that space up.
>   	 */
> -	GEM_BUG_ON(request->reserved_space > request->ring->space);
> -	request->reserved_space = 0;
> +	GEM_BUG_ON(rq->reserved_space > ring->space);
> +	rq->reserved_space = 0;
>   
>   	/*
>   	 * Record the position of the start of the breadcrumb so that
> @@ -1075,17 +1098,17 @@ void i915_request_add(struct i915_request *request)
>   	 * GPU processing the request, we never over-estimate the
>   	 * position of the ring's HEAD.
>   	 */
> -	cs = intel_ring_begin(request, engine->emit_fini_breadcrumb_dw);
> +	cs = intel_ring_begin(rq, engine->emit_fini_breadcrumb_dw);
>   	GEM_BUG_ON(IS_ERR(cs));
> -	request->postfix = intel_ring_offset(request, cs);
> +	rq->postfix = intel_ring_offset(rq, cs);
>   
> -	prev = __i915_request_add_to_timeline(request);
> +	prev = __i915_request_add_to_timeline(rq);
>   
> -	list_add_tail(&request->ring_link, &ring->request_list);
> -	if (list_is_first(&request->ring_link, &ring->request_list))
> -		list_add(&ring->active_link, &request->i915->gt.active_rings);
> -	request->i915->gt.active_engines |= request->engine->mask;
> -	request->emitted_jiffies = jiffies;
> +	list_add_tail(&rq->ring_link, &ring->request_list);
> +	if (list_is_first(&rq->ring_link, &ring->request_list))
> +		list_add(&ring->active_link, &rq->i915->gt.active_rings);
> +	rq->i915->gt.active_engines |= rq->engine->mask;
> +	rq->emitted_jiffies = jiffies;
>   
>   	/*
>   	 * Let the backend know a new request has arrived that may need
> @@ -1101,7 +1124,7 @@ void i915_request_add(struct i915_request *request)
>   	local_bh_disable();
>   	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
>   	if (engine->schedule) {
> -		struct i915_sched_attr attr = request->gem_context->sched;
> +		struct i915_sched_attr attr = rq->gem_context->sched;
>   
>   		/*
>   		 * Boost actual workloads past semaphores!
> @@ -1115,7 +1138,7 @@ void i915_request_add(struct i915_request *request)
>   		 * far in the distance past over useful work, we keep a history
>   		 * of any semaphore use along our dependency chain.
>   		 */
> -		if (!(request->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
> +		if (!(rq->sched.flags & I915_SCHED_HAS_SEMAPHORE_CHAIN))
>   			attr.priority |= I915_PRIORITY_NOSEMAPHORE;
>   
>   		/*
> @@ -1124,15 +1147,29 @@ void i915_request_add(struct i915_request *request)
>   		 * Allow interactive/synchronous clients to jump ahead of
>   		 * the bulk clients. (FQ_CODEL)
>   		 */
> -		if (list_empty(&request->sched.signalers_list))
> +		if (list_empty(&rq->sched.signalers_list))
>   			attr.priority |= I915_PRIORITY_NEWCLIENT;
>   
> -		engine->schedule(request, &attr);
> +		engine->schedule(rq, &attr);
>   	}
>   	rcu_read_unlock();
> -	i915_sw_fence_commit(&request->submit);
> +	i915_sw_fence_commit(&rq->submit);
>   	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
>   
> +	return prev;
> +}
> +
> +void i915_request_add(struct i915_request *rq)
> +{
> +	struct i915_request *prev;
> +
> +	lockdep_assert_held(&rq->timeline->mutex);
> +	lockdep_unpin_lock(&rq->timeline->mutex, rq->cookie);
> +
> +	trace_i915_request_add(rq);
> +
> +	prev = __i915_request_commit(rq);
> +
>   	/*
>   	 * In typical scenarios, we do not expect the previous request on
>   	 * the timeline to be still tracked by timeline->last_request if it
> @@ -1153,7 +1190,7 @@ void i915_request_add(struct i915_request *request)
>   	if (prev && i915_request_completed(prev))
>   		i915_request_retire_upto(prev);
>   
> -	mutex_unlock(&request->timeline->mutex);
> +	mutex_unlock(&rq->timeline->mutex);
>   }
>   
>   static unsigned long local_clock_us(unsigned int *cpu)
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 875be6f71412..9a50072171f8 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -238,6 +238,13 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence)
>   	return fence->ops == &i915_fence_ops;
>   }
>   
> +struct i915_request * __must_check
> +__i915_request_create(struct intel_context *ce, gfp_t gfp);
> +struct i915_request * __must_check
> +i915_request_create(struct intel_context *ce);
> +
> +struct i915_request *__i915_request_commit(struct i915_request *request);
> +
>   struct i915_request * __must_check
>   i915_request_alloc(struct intel_engine_cs *engine,
>   		   struct i915_gem_context *ctx);
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index a882b8d42bd9..fea8c5869da9 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -235,10 +235,9 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
>   
>   static struct i915_request *alloc_request(struct intel_overlay *overlay)
>   {
> -	struct drm_i915_private *dev_priv = overlay->i915;
> -	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
> +	struct intel_engine_cs *engine = overlay->i915->engine[RCS0];
>   
> -	return i915_request_alloc(engine, dev_priv->kernel_context);
> +	return i915_request_create(engine->kernel_context);
>   }
>   
>   /* overlay needs to be disable in OCMD reg */
> diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
> index 27d8f853111b..eee838dc0634 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_active.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_active.c
> @@ -46,7 +46,7 @@ static int __live_active_setup(struct drm_i915_private *i915,
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *rq;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			break;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> index e43630b40fce..046a38743152 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
> @@ -202,7 +202,7 @@ static int gpu_set(struct drm_i915_gem_object *obj,
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> -	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
> +	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
>   	if (IS_ERR(rq)) {
>   		i915_vma_unpin(vma);
>   		return PTR_ERR(rq);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> index 12fc53c694a6..12203d665a4e 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
> @@ -468,7 +468,7 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
>   	if (err)
>   		return err;
>   
> -	rq = i915_request_alloc(i915->engine[RCS0], i915->kernel_context);
> +	rq = i915_request_create(i915->engine[RCS0]->kernel_context);
>   	if (IS_ERR(rq)) {
>   		i915_vma_unpin(vma);
>   		return PTR_ERR(rq);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
> index e6ffe2240126..098d7b3aa131 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -551,8 +551,7 @@ static int live_nop_request(void *arg)
>   			times[1] = ktime_get_raw();
>   
>   			for (n = 0; n < prime; n++) {
> -				request = i915_request_alloc(engine,
> -							     i915->kernel_context);
> +				request = i915_request_create(engine->kernel_context);
>   				if (IS_ERR(request)) {
>   					err = PTR_ERR(request);
>   					goto out_unlock;
> @@ -649,7 +648,7 @@ empty_request(struct intel_engine_cs *engine,
>   	struct i915_request *request;
>   	int err;
>   
> -	request = i915_request_alloc(engine, engine->i915->kernel_context);
> +	request = i915_request_create(engine->kernel_context);
>   	if (IS_ERR(request))
>   		return request;
>   
> @@ -853,7 +852,7 @@ static int live_all_engines(void *arg)
>   	}
>   
>   	for_each_engine(engine, i915, id) {
> -		request[id] = i915_request_alloc(engine, i915->kernel_context);
> +		request[id] = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(request[id])) {
>   			err = PTR_ERR(request[id]);
>   			pr_err("%s: Request allocation failed with err=%d\n",
> @@ -962,7 +961,7 @@ static int live_sequential_engines(void *arg)
>   			goto out_unlock;
>   		}
>   
> -		request[id] = i915_request_alloc(engine, i915->kernel_context);
> +		request[id] = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(request[id])) {
>   			err = PTR_ERR(request[id]);
>   			pr_err("%s: Request allocation failed for %s with err=%d\n",
> diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
> index 8e7bcaa1eb66..227479349c61 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
> @@ -454,7 +454,7 @@ tl_write(struct i915_timeline *tl, struct intel_engine_cs *engine, u32 value)
>   		goto out;
>   	}
>   
> -	rq = i915_request_alloc(engine, engine->i915->kernel_context);
> +	rq = i915_request_create(engine->kernel_context);
>   	if (IS_ERR(rq))
>   		goto out_unpin;
>   
> @@ -678,7 +678,7 @@ static int live_hwsp_wrap(void *arg)
>   		if (!intel_engine_can_store_dword(engine))
>   			continue;
>   
> -		rq = i915_request_alloc(engine, i915->kernel_context);
> +		rq = i915_request_create(engine->kernel_context);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			goto out;
> 

With the comment added:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit()
  2019-04-10 10:13     ` Chris Wilson
@ 2019-04-10 11:06       ` Tvrtko Ursulin
  2019-04-10 19:19         ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 11:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 10/04/2019 11:13, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-10 11:05:13)
>>
>> On 08/04/2019 10:17, Chris Wilson wrote:
>>> +void __intel_context_enter(struct intel_context *ce)
>>> +{
>>> +     struct drm_i915_private *i915 = ce->gem_context->i915;
>>> +
>>> +     if (!i915->gt.active_requests++)
>>> +             i915_gem_unpark(i915);
>>> +}
>>> +
>>> +void __intel_context_exit(struct intel_context *ce)
>>> +{
>>> +     struct drm_i915_private *i915 = ce->gem_context->i915;
>>> +
>>> +     GEM_BUG_ON(!i915->gt.active_requests);
>>> +     if (!--i915->gt.active_requests)
>>> +             i915_gem_park(i915);
>>> +}
>>
>> In our normal nomenclature __intel_context_enter would be expected to be
>> directly called by intel_context_enter on the 0 -> 1 refcnt transition.
>> Here they are in fact common helpers, so one level of indirection, via a
>> different layer. I found this a bit confusing at first. My usual remedy
>> here is to prescribe a better name. __intel_gt_context_enter/exit? To
>> designate the glue between backend and GT.
> 
> The use of gt/gem here is a placeholder. Here we will be iterating over
> the engines (so just the one for the usual contexts, and num_siblings for
> virtual).
> 
> intel_context_enter_engine() ?
> 
> And virtual_context_enter remains distinct.

I did not find it at first since from the commit message I expected you
would be assigning backend-specific enter/exit callbacks to ce->ops. I
see now that in "drm/i915: Invert the GEM wakeref hierarchy" you in fact
just change the bodies of __intel_context_enter/exit. The only vfunc
override comes with the virtual engine, as you wrote.

intel_context_enter_engine sounds okay considering all this.

>>> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
>>> index ebc861b1a49e..95b1fdc5826a 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_context.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
>>> @@ -73,6 +73,26 @@ static inline void __intel_context_pin(struct intel_context *ce)
>>>    
>>>    void intel_context_unpin(struct intel_context *ce);
>>>    
>>> +void __intel_context_enter(struct intel_context *ce);
>>> +void __intel_context_exit(struct intel_context *ce);
>>> +
>>> +static inline void intel_context_enter(struct intel_context *ce)
>>> +{
>>> +     if (!ce->active_count++)
>>> +             ce->ops->enter(ce);
>>> +}
>>> +
>>> +static inline void intel_context_mark_active(struct intel_context *ce)
>>> +{
>>
>> GEM_BUG_ON(ce->active_count == 0) ?
> 
> We're no longer in Kansas. I was wondering if I could start GT_BUG_ON
> and include "intel_gt.h" with a goal of doing a search and replace within
> gt/. That gives us some separation from GEM, and maybe we could even
> start filtering based on need.

I fear it would be too much for the time being.

>   
>> And some lockdep_assert_held in all three?
> 
> Read on :(
> 
> The plan is for intel_context_enter/_exit to be under the
> timeline->mutex, but that isn't realised for about another 30 patches.
> 
> mark_active has special protection because it gets used from the
> serialised portion of intel_engine_park.

Ok, but a GEM_BUG_ON for zero active_count makes sense, right? Maybe even
name it intel_context_pin_active to signify that it can only act on an
already active context?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 16/29] drm/i915: Export intel_context_instance()
  2019-04-08  9:17 ` [PATCH 16/29] drm/i915: Export intel_context_instance() Chris Wilson
@ 2019-04-10 12:06   ` Tvrtko Ursulin
  2019-04-10 19:32     ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 12:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> We want to pass in a intel_context into intel_context_pin() and that
> requires us to first be able to lookup the intel_context!
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c    | 37 +++++++++++-----------
>   drivers/gpu/drm/i915/gt/intel_context.h    | 19 +++++++----
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c  |  8 ++++-
>   drivers/gpu/drm/i915/gt/mock_engine.c      |  8 ++++-
>   drivers/gpu/drm/i915/gvt/scheduler.c       |  7 +++-
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++--
>   drivers/gpu/drm/i915/i915_perf.c           | 21 ++++++++----
>   drivers/gpu/drm/i915/i915_request.c        | 11 ++++++-
>   8 files changed, 83 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index 6ae6a3f58364..a1267739e369 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -104,7 +104,7 @@ void __intel_context_remove(struct intel_context *ce)
>   	spin_unlock(&ctx->hw_contexts_lock);
>   }
>   
> -static struct intel_context *
> +struct intel_context *
>   intel_context_instance(struct i915_gem_context *ctx,
>   		       struct intel_engine_cs *engine)
>   {
> @@ -112,7 +112,7 @@ intel_context_instance(struct i915_gem_context *ctx,
>   
>   	ce = intel_context_lookup(ctx, engine);
>   	if (likely(ce))
> -		return ce;
> +		return intel_context_get(ce);
>   
>   	ce = intel_context_alloc();
>   	if (!ce)
> @@ -125,7 +125,7 @@ intel_context_instance(struct i915_gem_context *ctx,
>   		intel_context_free(ce);
>   
>   	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
> -	return pos;
> +	return intel_context_get(pos);
>   }
>   
>   struct intel_context *
> @@ -139,30 +139,30 @@ intel_context_pin_lock(struct i915_gem_context *ctx,
>   	if (IS_ERR(ce))
>   		return ce;
>   
> -	if (mutex_lock_interruptible(&ce->pin_mutex))
> +	if (mutex_lock_interruptible(&ce->pin_mutex)) {
> +		intel_context_put(ce);
>   		return ERR_PTR(-EINTR);
> +	}
>   
>   	return ce;
>   }
>   
> -struct intel_context *
> -intel_context_pin(struct i915_gem_context *ctx,
> -		  struct intel_engine_cs *engine)
> +void intel_context_pin_unlock(struct intel_context *ce)
> +	__releases(ce->pin_mutex)
>   {
> -	struct intel_context *ce;
> -	int err;
> -
> -	ce = intel_context_instance(ctx, engine);
> -	if (IS_ERR(ce))
> -		return ce;
> +	mutex_unlock(&ce->pin_mutex);
> +	intel_context_put(ce);
> +}
>   
> -	if (likely(atomic_inc_not_zero(&ce->pin_count)))
> -		return ce;
> +int __intel_context_do_pin(struct intel_context *ce)
> +{
> +	int err;
>   
>   	if (mutex_lock_interruptible(&ce->pin_mutex))
> -		return ERR_PTR(-EINTR);
> +		return -EINTR;
>   
>   	if (likely(!atomic_read(&ce->pin_count))) {
> +		struct i915_gem_context *ctx = ce->gem_context;
>   		intel_wakeref_t wakeref;
>   
>   		err = 0;
> @@ -172,7 +172,6 @@ intel_context_pin(struct i915_gem_context *ctx,
>   			goto err;
>   
>   		i915_gem_context_get(ctx);
> -		GEM_BUG_ON(ce->gem_context != ctx);
>   
>   		mutex_lock(&ctx->mutex);
>   		list_add(&ce->active_link, &ctx->active_engines);
> @@ -186,11 +185,11 @@ intel_context_pin(struct i915_gem_context *ctx,
>   	GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
>   
>   	mutex_unlock(&ce->pin_mutex);
> -	return ce;
> +	return 0;
>   
>   err:
>   	mutex_unlock(&ce->pin_mutex);
> -	return ERR_PTR(err);
> +	return err;
>   }
>   
>   void intel_context_unpin(struct intel_context *ce)
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 9aeef88176b9..da342e9a8c2e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -49,11 +49,7 @@ intel_context_is_pinned(struct intel_context *ce)
>   	return atomic_read(&ce->pin_count);
>   }
>   
> -static inline void intel_context_pin_unlock(struct intel_context *ce)
> -__releases(ce->pin_mutex)
> -{
> -	mutex_unlock(&ce->pin_mutex);
> -}

Could you leave this as static inline, since the only addition is the 
kref_put, so the compiler could decide what to do? I don't mind either way.
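
i.e. something like this sketch, folding the kref_put into the old inline:

static inline void intel_context_pin_unlock(struct intel_context *ce)
	__releases(ce->pin_mutex)
{
	mutex_unlock(&ce->pin_mutex);
	intel_context_put(ce);
}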

> +void intel_context_pin_unlock(struct intel_context *ce);
>   
>   struct intel_context *
>   __intel_context_insert(struct i915_gem_context *ctx,
> @@ -63,7 +59,18 @@ void
>   __intel_context_remove(struct intel_context *ce);
>   
>   struct intel_context *
> -intel_context_pin(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
> +intel_context_instance(struct i915_gem_context *ctx,
> +		       struct intel_engine_cs *engine);
> +
> +int __intel_context_do_pin(struct intel_context *ce);
> +
> +static inline int intel_context_pin(struct intel_context *ce)
> +{
> +	if (likely(atomic_inc_not_zero(&ce->pin_count)))
> +		return 0;
> +
> +	return __intel_context_do_pin(ce);
> +}
>   
>   static inline void __intel_context_pin(struct intel_context *ce)
>   {
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index d378b276e476..f6828c0276eb 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -697,11 +697,17 @@ static int pin_context(struct i915_gem_context *ctx,
>   		       struct intel_context **out)
>   {
>   	struct intel_context *ce;
> +	int err;
>   
> -	ce = intel_context_pin(ctx, engine);
> +	ce = intel_context_instance(ctx, engine);
>   	if (IS_ERR(ce))
>   		return PTR_ERR(ce);
>   
> +	err = intel_context_pin(ce);
> +	intel_context_put(ce);
> +	if (err)
> +		return err;
> +
>   	*out = ce;
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
> index d9f68c89dff4..a79d9909d171 100644
> --- a/drivers/gpu/drm/i915/gt/mock_engine.c
> +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
> @@ -239,6 +239,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>   				    int id)
>   {
>   	struct mock_engine *engine;
> +	int err;
>   
>   	GEM_BUG_ON(id >= I915_NUM_ENGINES);
>   
> @@ -278,10 +279,15 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>   	INIT_LIST_HEAD(&engine->hw_queue);
>   
>   	engine->base.kernel_context =
> -		intel_context_pin(i915->kernel_context, &engine->base);
> +		intel_context_instance(i915->kernel_context, &engine->base);
>   	if (IS_ERR(engine->base.kernel_context))
>   		goto err_breadcrumbs;
>   
> +	err = intel_context_pin(engine->base.kernel_context);
> +	intel_context_put(engine->base.kernel_context);
> +	if (err)
> +		goto err_breadcrumbs;
> +
>   	return &engine->base;
>   
>   err_breadcrumbs:
> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
> index 40d9f549a0cd..606fc2713240 100644
> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
> @@ -1188,12 +1188,17 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
>   		INIT_LIST_HEAD(&s->workload_q_head[i]);
>   		s->shadow[i] = ERR_PTR(-EINVAL);
>   
> -		ce = intel_context_pin(ctx, engine);
> +		ce = intel_context_instance(ctx, engine);
>   		if (IS_ERR(ce)) {
>   			ret = PTR_ERR(ce);
>   			goto out_shadow_ctx;
>   		}
>   
> +		ret = intel_context_pin(ce);
> +		intel_context_put(ce);
> +		if (ret)
> +			goto out_shadow_ctx;
> +
>   		s->shadow[i] = ce;
>   	}
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 35732c2ae17f..c700cbc2f594 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2101,14 +2101,19 @@ static int eb_pin_context(struct i915_execbuffer *eb,
>   	if (err)
>   		return err;
>   
> +	ce = intel_context_instance(eb->gem_context, engine);
> +	if (IS_ERR(ce))
> +		return PTR_ERR(ce);
> +
>   	/*
>   	 * Pinning the contexts may generate requests in order to acquire
>   	 * GGTT space, so do this first before we reserve a seqno for
>   	 * ourselves.
>   	 */
> -	ce = intel_context_pin(eb->gem_context, engine);
> -	if (IS_ERR(ce))
> -		return PTR_ERR(ce);
> +	err = intel_context_pin(ce);
> +	intel_context_put(ce);
> +	if (err)
> +		return err;
>   
>   	eb->engine = engine;
>   	eb->context = ce;
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 328a740e72cb..afaeabe5e531 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -1205,11 +1205,17 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
>   {
>   	struct intel_engine_cs *engine = i915->engine[RCS0];
>   	struct intel_context *ce;
> -	int ret;
> +	int err;
>   
> -	ret = i915_mutex_lock_interruptible(&i915->drm);
> -	if (ret)
> -		return ERR_PTR(ret);
> +	ce = intel_context_instance(ctx, engine);
> +	if (IS_ERR(ce))
> +		return ce;
> +
> +	err = i915_mutex_lock_interruptible(&i915->drm);
> +	if (err) {
> +		intel_context_put(ce);
> +		return ERR_PTR(err);
> +	}
>   
>   	/*
>   	 * As the ID is the gtt offset of the context's vma we
> @@ -1217,10 +1223,11 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
>   	 *
>   	 * NB: implied RCS engine...
>   	 */
> -	ce = intel_context_pin(ctx, engine);
> +	err = intel_context_pin(ce);
>   	mutex_unlock(&i915->drm.struct_mutex);
> -	if (IS_ERR(ce))
> -		return ce;
> +	intel_context_put(ce);
> +	if (err)
> +		return ERR_PTR(err);
>   
>   	i915->perf.oa.pinned_ctx = ce;
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7932d1209247..8abd891d9287 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -753,6 +753,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	struct drm_i915_private *i915 = engine->i915;
>   	struct intel_context *ce;
>   	struct i915_request *rq;
> +	int err;
>   
>   	/*
>   	 * Preempt contexts are reserved for exclusive use to inject a
> @@ -766,13 +767,21 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 * GGTT space, so do this first before we reserve a seqno for
>   	 * ourselves.
>   	 */
> -	ce = intel_context_pin(ctx, engine);
> +	ce = intel_context_instance(ctx, engine);
>   	if (IS_ERR(ce))
>   		return ERR_CAST(ce);
>   
> +	err = intel_context_pin(ce);
> +	if (err) {
> +		rq = ERR_PTR(err);
> +		goto err_put;
> +	}
> +
>   	rq = i915_request_create(ce);
>   	intel_context_unpin(ce);
>   
> +err_put:
> +	intel_context_put(ce);
>   	return rq;
>   }
>   
> 

The pattern of instance-pin-put is repeated so many times it begs to be 
promoted to something like intel_context_pin_instance.
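
For illustration, roughly this (name and exact shape hypothetical; it
leans on the pin keeping the context alive, which the callers above
already assume):

static inline struct intel_context *
intel_context_pin_instance(struct i915_gem_context *ctx,
			   struct intel_engine_cs *engine)
{
	struct intel_context *ce;
	int err;

	ce = intel_context_instance(ctx, engine);
	if (IS_ERR(ce))
		return ce;

	err = intel_context_pin(ce);
	intel_context_put(ce); /* the pin now keeps ce alive */
	if (err)
		return ERR_PTR(err);

	return ce;
}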

Side note, the relationship between the reference count and the pin count 
is a bit confusing. I won't try to make sense of it all now.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 18/29] drm/i915/selftests: Pass around intel_context for sseu
  2019-04-08  9:17 ` [PATCH 18/29] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
@ 2019-04-10 12:25   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 12:25 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Combine the (i915_gem_context, intel_engine) pair into a single
> parameter, the intel_context, for convenience and later simplification.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   .../gpu/drm/i915/selftests/i915_gem_context.c | 74 +++++++++++--------
>   1 file changed, 44 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 807644ae6877..8e2a94333559 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -755,8 +755,7 @@ static struct i915_vma *rpcs_query_batch(struct i915_vma *vma)
>   
>   static int
>   emit_rpcs_query(struct drm_i915_gem_object *obj,
> -		struct i915_gem_context *ctx,
> -		struct intel_engine_cs *engine,
> +		struct intel_context *ce,
>   		struct i915_request **rq_out)
>   {
>   	struct i915_request *rq;
> @@ -764,9 +763,9 @@ emit_rpcs_query(struct drm_i915_gem_object *obj,
>   	struct i915_vma *vma;
>   	int err;
>   
> -	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
> +	GEM_BUG_ON(!intel_engine_can_store_dword(ce->engine));
>   
> -	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
> +	vma = i915_vma_instance(obj, &ce->gem_context->ppgtt->vm, NULL);
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> @@ -784,13 +783,15 @@ emit_rpcs_query(struct drm_i915_gem_object *obj,
>   		goto err_vma;
>   	}
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = i915_request_create(ce);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_batch;
>   	}
>   
> -	err = engine->emit_bb_start(rq, batch->node.start, batch->node.size, 0);
> +	err = rq->engine->emit_bb_start(rq,
> +					batch->node.start, batch->node.size,
> +					0);
>   	if (err)
>   		goto err_request;
>   
> @@ -834,8 +835,7 @@ static int
>   __sseu_prepare(struct drm_i915_private *i915,
>   	       const char *name,
>   	       unsigned int flags,
> -	       struct i915_gem_context *ctx,
> -	       struct intel_engine_cs *engine,
> +	       struct intel_context *ce,
>   	       struct igt_spinner **spin)
>   {
>   	struct i915_request *rq;
> @@ -853,7 +853,10 @@ __sseu_prepare(struct drm_i915_private *i915,
>   	if (ret)
>   		goto err_free;
>   
> -	rq = igt_spinner_create_request(*spin, ctx, engine, MI_NOOP);
> +	rq = igt_spinner_create_request(*spin,
> +					ce->gem_context,
> +					ce->engine,
> +					MI_NOOP);
>   	if (IS_ERR(rq)) {
>   		ret = PTR_ERR(rq);
>   		goto err_fini;
> @@ -880,8 +883,7 @@ __sseu_prepare(struct drm_i915_private *i915,
>   
>   static int
>   __read_slice_count(struct drm_i915_private *i915,
> -		   struct i915_gem_context *ctx,
> -		   struct intel_engine_cs *engine,
> +		   struct intel_context *ce,
>   		   struct drm_i915_gem_object *obj,
>   		   struct igt_spinner *spin,
>   		   u32 *rpcs)
> @@ -892,7 +894,7 @@ __read_slice_count(struct drm_i915_private *i915,
>   	u32 *buf, val;
>   	long ret;
>   
> -	ret = emit_rpcs_query(obj, ctx, engine, &rq);
> +	ret = emit_rpcs_query(obj, ce, &rq);
>   	if (ret)
>   		return ret;
>   
> @@ -956,29 +958,28 @@ static int
>   __sseu_finish(struct drm_i915_private *i915,
>   	      const char *name,
>   	      unsigned int flags,
> -	      struct i915_gem_context *ctx,
> -	      struct intel_engine_cs *engine,
> +	      struct intel_context *ce,
>   	      struct drm_i915_gem_object *obj,
>   	      unsigned int expected,
>   	      struct igt_spinner *spin)
>   {
> -	unsigned int slices = hweight32(engine->sseu.slice_mask);
> +	unsigned int slices = hweight32(ce->engine->sseu.slice_mask);
>   	u32 rpcs = 0;
>   	int ret = 0;
>   
>   	if (flags & TEST_RESET) {
> -		ret = i915_reset_engine(engine, "sseu");
> +		ret = i915_reset_engine(ce->engine, "sseu");
>   		if (ret)
>   			goto out;
>   	}
>   
> -	ret = __read_slice_count(i915, ctx, engine, obj,
> +	ret = __read_slice_count(i915, ce, obj,
>   				 flags & TEST_RESET ? NULL : spin, &rpcs);
>   	ret = __check_rpcs(name, rpcs, ret, expected, "Context", "!");
>   	if (ret)
>   		goto out;
>   
> -	ret = __read_slice_count(i915, i915->kernel_context, engine, obj,
> +	ret = __read_slice_count(i915, ce->engine->kernel_context, obj,
>   				 NULL, &rpcs);
>   	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
>   
> @@ -993,7 +994,7 @@ __sseu_finish(struct drm_i915_private *i915,
>   		if (ret)
>   			return ret;
>   
> -		ret = __read_slice_count(i915, ctx, engine, obj, NULL, &rpcs);
> +		ret = __read_slice_count(i915, ce, obj, NULL, &rpcs);
>   		ret = __check_rpcs(name, rpcs, ret, expected,
>   				   "Context", " after idle!");
>   	}
> @@ -1005,23 +1006,22 @@ static int
>   __sseu_test(struct drm_i915_private *i915,
>   	    const char *name,
>   	    unsigned int flags,
> -	    struct i915_gem_context *ctx,
> -	    struct intel_engine_cs *engine,
> +	    struct intel_context *ce,
>   	    struct drm_i915_gem_object *obj,
>   	    struct intel_sseu sseu)
>   {
>   	struct igt_spinner *spin = NULL;
>   	int ret;
>   
> -	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
> +	ret = __sseu_prepare(i915, name, flags, ce, &spin);
>   	if (ret)
>   		return ret;
>   
> -	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
> +	ret = __i915_gem_context_reconfigure_sseu(ce->gem_context, ce->engine, sseu);
>   	if (ret)
>   		goto out_spin;
>   
> -	ret = __sseu_finish(i915, name, flags, ctx, engine, obj,
> +	ret = __sseu_finish(i915, name, flags, ce, obj,
>   			    hweight32(sseu.slice_mask), spin);
>   
>   out_spin:
> @@ -1042,6 +1042,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   	struct intel_sseu default_sseu = engine->sseu;
>   	struct drm_i915_gem_object *obj;
>   	struct i915_gem_context *ctx;
> +	struct intel_context *ce;
>   	struct intel_sseu pg_sseu;
>   	intel_wakeref_t wakeref;
>   	struct drm_file *file;
> @@ -1093,23 +1094,33 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   
>   	wakeref = intel_runtime_pm_get(i915);
>   
> +	ce = intel_context_instance(ctx, i915->engine[RCS0]);
> +	if (IS_ERR(ce)) {
> +		ret = PTR_ERR(ce);
> +		goto out_rpm;
> +	}
> +
> +	ret = intel_context_pin(ce);
> +	if (ret)
> +		goto out_context;
> +
>   	/* First set the default mask. */
> -	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
> +	ret = __sseu_test(i915, name, flags, ce, obj, default_sseu);
>   	if (ret)
>   		goto out_fail;
>   
>   	/* Then set a power-gated configuration. */
> -	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
> +	ret = __sseu_test(i915, name, flags, ce, obj, pg_sseu);
>   	if (ret)
>   		goto out_fail;
>   
>   	/* Back to defaults. */
> -	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
> +	ret = __sseu_test(i915, name, flags, ce, obj, default_sseu);
>   	if (ret)
>   		goto out_fail;
>   
>   	/* One last power-gated configuration for the road. */
> -	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
> +	ret = __sseu_test(i915, name, flags, ce, obj, pg_sseu);
>   	if (ret)
>   		goto out_fail;
>   
> @@ -1117,9 +1128,12 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   	if (igt_flush_test(i915, I915_WAIT_LOCKED))
>   		ret = -EIO;
>   
> -	i915_gem_object_put(obj);
> -
> +	intel_context_unpin(ce);
> +out_context:
> +	intel_context_put(ce);
> +out_rpm:
>   	intel_runtime_pm_put(i915, wakeref);
> +	i915_gem_object_put(obj);
>   
>   out_unlock:
>   	mutex_unlock(&i915->drm.struct_mutex);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock()
  2019-04-08  9:17 ` [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
@ 2019-04-10 12:45   ` Tvrtko Ursulin
  2019-04-10 12:49     ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 12:45 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Move intel_context_instance() to the caller so that we can decouple
> ourselves from one context instance per engine.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c       | 26 ------
>   drivers/gpu/drm/i915/gt/intel_context.h       | 14 ++-
>   drivers/gpu/drm/i915/i915_gem_context.c       | 88 +++++++++++--------
>   .../gpu/drm/i915/selftests/i915_gem_context.c |  2 +-
>   4 files changed, 64 insertions(+), 66 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index a1267739e369..46bf8d8fb7e4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -128,32 +128,6 @@ intel_context_instance(struct i915_gem_context *ctx,
>   	return intel_context_get(pos);
>   }
>   
> -struct intel_context *
> -intel_context_pin_lock(struct i915_gem_context *ctx,
> -		       struct intel_engine_cs *engine)
> -	__acquires(ce->pin_mutex)
> -{
> -	struct intel_context *ce;
> -
> -	ce = intel_context_instance(ctx, engine);
> -	if (IS_ERR(ce))
> -		return ce;
> -
> -	if (mutex_lock_interruptible(&ce->pin_mutex)) {
> -		intel_context_put(ce);
> -		return ERR_PTR(-EINTR);
> -	}
> -
> -	return ce;
> -}
> -
> -void intel_context_pin_unlock(struct intel_context *ce)
> -	__releases(ce->pin_mutex)
> -{
> -	mutex_unlock(&ce->pin_mutex);
> -	intel_context_put(ce);
> -}
> -
>   int __intel_context_do_pin(struct intel_context *ce)
>   {
>   	int err;
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index da342e9a8c2e..4f8ef45e6344 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -39,9 +39,17 @@ intel_context_lookup(struct i915_gem_context *ctx,
>    * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
>    * intel_context_is_pinned() remains stable.
>    */
> -struct intel_context *
> -intel_context_pin_lock(struct i915_gem_context *ctx,
> -		       struct intel_engine_cs *engine);
> +static inline int intel_context_pin_lock(struct intel_context *ce)
> +	__acquires(ce->pin_mutex)
> +{
> +	return mutex_lock_interruptible(&ce->pin_mutex);
> +}

So if this is not getting the kref any more why are the other callers 
okay? In previous patch(es) some certainly looked like they assume pin 
implies a reference.

> +
> +static inline void intel_context_pin_unlock(struct intel_context *ce)
> +	__releases(ce->pin_mutex)
> +{
> +	mutex_unlock(&ce->pin_mutex);
> +}
>   
>   static inline bool
>   intel_context_is_pinned(struct intel_context *ce)
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 51c427ac0c21..1e84fe0a4c55 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -141,6 +141,18 @@ static void lut_close(struct i915_gem_context *ctx)
>   	rcu_read_unlock();
>   }
>   
> +static struct intel_context *
> +lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
> +{
> +	struct intel_engine_cs *engine;
> +
> +	engine = intel_engine_lookup_user(ctx->i915, class, instance);
> +	if (!engine)
> +		return ERR_PTR(-EINVAL);
> +
> +	return intel_context_instance(ctx, engine);
> +}
> +
>   static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
>   {
>   	unsigned int max;
> @@ -1142,19 +1154,17 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu)
>   }
>   
>   static int
> -__i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
> -				    struct intel_engine_cs *engine,
> -				    struct intel_sseu sseu)
> +__intel_context_reconfigure_sseu(struct intel_context *ce,
> +				 struct intel_sseu sseu)
>   {
> -	struct intel_context *ce;
> -	int ret = 0;
> +	int ret;
>   
> -	GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8);
> -	GEM_BUG_ON(engine->id != RCS0);
> +	GEM_BUG_ON(INTEL_GEN(ce->gem_context->i915) < 8);
> +	GEM_BUG_ON(ce->engine->id != RCS0);
>   
> -	ce = intel_context_pin_lock(ctx, engine);
> -	if (IS_ERR(ce))
> -		return PTR_ERR(ce);
> +	ret = intel_context_pin_lock(ce);
> +	if (ret)
> +		return ret;
>   
>   	/* Nothing to do if unmodified. */
>   	if (!memcmp(&ce->sseu, &sseu, sizeof(sseu)))
> @@ -1170,19 +1180,18 @@ __i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
>   }
>   
>   static int
> -i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
> -				  struct intel_engine_cs *engine,
> -				  struct intel_sseu sseu)
> +intel_context_reconfigure_sseu(struct intel_context *ce, struct intel_sseu sseu)
>   {
> +	struct drm_i915_private *i915 = ce->gem_context->i915;
>   	int ret;
>   
> -	ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
> +	ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
>   	if (ret)
>   		return ret;
>   
> -	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
> +	ret = __intel_context_reconfigure_sseu(ce, sseu);
>   
> -	mutex_unlock(&ctx->i915->drm.struct_mutex);
> +	mutex_unlock(&i915->drm.struct_mutex);
>   
>   	return ret;
>   }
> @@ -1290,7 +1299,7 @@ static int set_sseu(struct i915_gem_context *ctx,
>   {
>   	struct drm_i915_private *i915 = ctx->i915;
>   	struct drm_i915_gem_context_param_sseu user_sseu;
> -	struct intel_engine_cs *engine;
> +	struct intel_context *ce;
>   	struct intel_sseu sseu;
>   	int ret;
>   
> @@ -1307,27 +1316,31 @@ static int set_sseu(struct i915_gem_context *ctx,
>   	if (user_sseu.flags || user_sseu.rsvd)
>   		return -EINVAL;
>   
> -	engine = intel_engine_lookup_user(i915,
> -					  user_sseu.engine_class,
> -					  user_sseu.engine_instance);
> -	if (!engine)
> -		return -EINVAL;
> +	ce = lookup_user_engine(ctx,
> +				user_sseu.engine_class,
> +				user_sseu.engine_instance);
> +	if (IS_ERR(ce))
> +		return PTR_ERR(ce);
>   
>   	/* Only render engine supports RPCS configuration. */
> -	if (engine->class != RENDER_CLASS)
> -		return -ENODEV;
> +	if (ce->engine->class != RENDER_CLASS) {
> +		ret = -ENODEV;
> +		goto out_ce;
> +	}
>   
>   	ret = user_to_context_sseu(i915, &user_sseu, &sseu);
>   	if (ret)
> -		return ret;
> +		goto out_ce;
>   
> -	ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
> +	ret = intel_context_reconfigure_sseu(ce, sseu);
>   	if (ret)
> -		return ret;
> +		goto out_ce;
>   
>   	args->size = sizeof(user_sseu);
>   
> -	return 0;
> +out_ce:
> +	intel_context_put(ce);
> +	return ret;
>   }
>   
>   static int ctx_setparam(struct drm_i915_file_private *fpriv,
> @@ -1532,8 +1545,8 @@ static int get_sseu(struct i915_gem_context *ctx,
>   		    struct drm_i915_gem_context_param *args)
>   {
>   	struct drm_i915_gem_context_param_sseu user_sseu;
> -	struct intel_engine_cs *engine;
>   	struct intel_context *ce;
> +	int err;
>   
>   	if (args->size == 0)
>   		goto out;
> @@ -1547,22 +1560,25 @@ static int get_sseu(struct i915_gem_context *ctx,
>   	if (user_sseu.flags || user_sseu.rsvd)
>   		return -EINVAL;
>   
> -	engine = intel_engine_lookup_user(ctx->i915,
> -					  user_sseu.engine_class,
> -					  user_sseu.engine_instance);
> -	if (!engine)
> -		return -EINVAL;
> -
> -	ce = intel_context_pin_lock(ctx, engine); /* serialises with set_sseu */
> +	ce = lookup_user_engine(ctx,
> +				user_sseu.engine_class,
> +				user_sseu.engine_instance);
>   	if (IS_ERR(ce))
>   		return PTR_ERR(ce);
>   
> +	err = intel_context_pin_lock(ce); /* serialises with set_sseu */
> +	if (err) {
> +		intel_context_put(ce);
> +		return err;
> +	}
> +
>   	user_sseu.slice_mask = ce->sseu.slice_mask;
>   	user_sseu.subslice_mask = ce->sseu.subslice_mask;
>   	user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice;
>   	user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
>   
>   	intel_context_pin_unlock(ce);
> +	intel_context_put(ce);
>   
>   	if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu,
>   			 sizeof(user_sseu)))
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 8e2a94333559..214d1fd2f4dc 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -1017,7 +1017,7 @@ __sseu_test(struct drm_i915_private *i915,
>   	if (ret)
>   		return ret;
>   
> -	ret = __i915_gem_context_reconfigure_sseu(ce->gem_context, ce->engine, sseu);
> +	ret = __intel_context_reconfigure_sseu(ce, sseu);
>   	if (ret)
>   		goto out_spin;
>   
> 

Rest looks good.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock()
  2019-04-10 12:45   ` Tvrtko Ursulin
@ 2019-04-10 12:49     ` Chris Wilson
  2019-04-10 13:04       ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 12:49 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-10 13:45:05)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > index da342e9a8c2e..4f8ef45e6344 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > @@ -39,9 +39,17 @@ intel_context_lookup(struct i915_gem_context *ctx,
> >    * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
> >    * intel_context_is_pinned() remains stable.
> >    */
> > -struct intel_context *
> > -intel_context_pin_lock(struct i915_gem_context *ctx,
> > -                    struct intel_engine_cs *engine);
> > +static inline int intel_context_pin_lock(struct intel_context *ce)
> > +     __acquires(ce->pin_mutex)
> > +{
> > +     return mutex_lock_interruptible(&ce->pin_mutex);
> > +}
> 
> So if this is not getting the kref any more why are the other callers 
> okay? In previous patch(es) some certainly looked like they assume pin 
> implies a reference.

This is not 'pin' but locking the pinned state.
Maybe lock_pin and unlock_pin? (I was going to say lock_pin_count, but
it strictly does not prevent updates to pin_count, merely the transition
to/from .pin_count == 0.)

The example use case is ctx_sseu, which wants to inspect the current status
of the context and operate on the pinned/not-pinned state, safe in the
knowledge that it can't change as it does so.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock()
  2019-04-10 12:49     ` Chris Wilson
@ 2019-04-10 13:04       ` Chris Wilson
  2019-04-10 14:53         ` Tvrtko Ursulin
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 13:04 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-04-10 13:49:32)
> Quoting Tvrtko Ursulin (2019-04-10 13:45:05)
> > 
> > On 08/04/2019 10:17, Chris Wilson wrote:
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > > index da342e9a8c2e..4f8ef45e6344 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > > @@ -39,9 +39,17 @@ intel_context_lookup(struct i915_gem_context *ctx,
> > >    * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
> > >    * intel_context_is_pinned() remains stable.
> > >    */
> > > -struct intel_context *
> > > -intel_context_pin_lock(struct i915_gem_context *ctx,
> > > -                    struct intel_engine_cs *engine);
> > > +static inline int intel_context_pin_lock(struct intel_context *ce)
> > > +     __acquires(ce->pin_mutex)
> > > +{
> > > +     return mutex_lock_interruptible(&ce->pin_mutex);
> > > +}
> > 
> > So if this is not getting the kref any more why are the other callers 
> > okay? In previous patch(es) some certainly looked like they assume pin 
> > implies a reference.
> 
> This is not 'pin' but locking the pinned state.
> Maybe lock_pin and unlock_pin? (I was going to say lock_pin_count, but
> it strictly does not prevent updates to pin_count, merely the transition
> to/from .pin_count == 0.)

Went with intel_context_lock_pinned() and intel_context_unlock_pinned().
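
So a caller that only wants to look at the pinned state reads roughly like
this (hypothetical helper, mirroring the get_sseu pattern):

static int inspect_pinned_sseu(struct intel_context *ce,
			       struct intel_sseu *out)
{
	int err;

	err = intel_context_lock_pinned(ce); /* interruptible */
	if (err)
		return err;

	/*
	 * pin_count may still be incremented/decremented underneath us,
	 * but the transition to/from zero cannot happen while we hold
	 * the lock, so intel_context_is_pinned() is stable.
	 */
	if (intel_context_is_pinned(ce))
		*out = ce->sseu;

	intel_context_unlock_pinned(ce);
	return 0;
}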
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 20/29] drm/i915: Split engine setup/init into two phases
  2019-04-08  9:17 ` [PATCH 20/29] drm/i915: Split engine setup/init into two phases Chris Wilson
@ 2019-04-10 13:30   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 13:30 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> In the next patch, we require the engine vfuncs setup prior to
> initialising the pinned kernel contexts, so split the vfunc setup from
> the engine initialisation and call it earlier.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine.h        |   8 +-
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  99 ++++----
>   drivers/gpu/drm/i915/gt/intel_lrc.c           |  74 ++----
>   drivers/gpu/drm/i915/gt/intel_lrc.h           |   5 +-
>   drivers/gpu/drm/i915/gt/intel_ringbuffer.c    | 232 +++++++++---------
>   drivers/gpu/drm/i915/gt/intel_workarounds.c   |   3 +-
>   drivers/gpu/drm/i915/gt/mock_engine.c         |  48 ++--
>   drivers/gpu/drm/i915/gt/mock_engine.h         |   2 +
>   drivers/gpu/drm/i915/i915_gem.c               |   6 +
>   .../gpu/drm/i915/selftests/mock_gem_device.c  |  12 +-
>   10 files changed, 245 insertions(+), 244 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> index a17152e96bf8..a8dc2740ba2f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -362,14 +362,12 @@ __intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
>   	return (head - tail - CACHELINE_BYTES) & (size - 1);
>   }
>   
> -int intel_engine_setup_common(struct intel_engine_cs *engine);
> +int intel_engines_setup(struct drm_i915_private *i915);
>   int intel_engine_init_common(struct intel_engine_cs *engine);
>   void intel_engine_cleanup_common(struct intel_engine_cs *engine);
>   
> -int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
> -int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
> -int intel_init_blt_ring_buffer(struct intel_engine_cs *engine);
> -int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine);
> +int intel_ring_submission_setup(struct intel_engine_cs *engine);
> +int intel_ring_submission_init(struct intel_engine_cs *engine);
>   
>   int intel_engine_stop_cs(struct intel_engine_cs *engine);
>   void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index f6828c0276eb..3f794bc71958 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -50,35 +50,24 @@
>   
>   struct engine_class_info {
>   	const char *name;
> -	int (*init_legacy)(struct intel_engine_cs *engine);
> -	int (*init_execlists)(struct intel_engine_cs *engine);
> -
>   	u8 uabi_class;
>   };
>   
>   static const struct engine_class_info intel_engine_classes[] = {
>   	[RENDER_CLASS] = {
>   		.name = "rcs",
> -		.init_execlists = logical_render_ring_init,
> -		.init_legacy = intel_init_render_ring_buffer,
>   		.uabi_class = I915_ENGINE_CLASS_RENDER,
>   	},
>   	[COPY_ENGINE_CLASS] = {
>   		.name = "bcs",
> -		.init_execlists = logical_xcs_ring_init,
> -		.init_legacy = intel_init_blt_ring_buffer,
>   		.uabi_class = I915_ENGINE_CLASS_COPY,
>   	},
>   	[VIDEO_DECODE_CLASS] = {
>   		.name = "vcs",
> -		.init_execlists = logical_xcs_ring_init,
> -		.init_legacy = intel_init_bsd_ring_buffer,
>   		.uabi_class = I915_ENGINE_CLASS_VIDEO,
>   	},
>   	[VIDEO_ENHANCEMENT_CLASS] = {
>   		.name = "vecs",
> -		.init_execlists = logical_xcs_ring_init,
> -		.init_legacy = intel_init_vebox_ring_buffer,
>   		.uabi_class = I915_ENGINE_CLASS_VIDEO_ENHANCE,
>   	},
>   };
> @@ -400,48 +389,39 @@ int intel_engines_init_mmio(struct drm_i915_private *dev_priv)
>   
>   /**
>    * intel_engines_init() - init the Engine Command Streamers
> - * @dev_priv: i915 device private
> + * @i915: i915 device private
>    *
>    * Return: non-zero if the initialization failed.
>    */
> -int intel_engines_init(struct drm_i915_private *dev_priv)
> +int intel_engines_init(struct drm_i915_private *i915)
>   {
> +	int (*init)(struct intel_engine_cs *engine);
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id, err_id;
>   	int err;
>   
> -	for_each_engine(engine, dev_priv, id) {
> -		const struct engine_class_info *class_info =
> -			&intel_engine_classes[engine->class];
> -		int (*init)(struct intel_engine_cs *engine);
> -
> -		if (HAS_EXECLISTS(dev_priv))
> -			init = class_info->init_execlists;
> -		else
> -			init = class_info->init_legacy;
> +	if (HAS_EXECLISTS(i915))
> +		init = intel_execlists_submission_init;
> +	else
> +		init = intel_ring_submission_init;
>   
> -		err = -EINVAL;
> +	for_each_engine(engine, i915, id) {
>   		err_id = id;
>   
> -		if (GEM_DEBUG_WARN_ON(!init))
> -			goto cleanup;
> -
>   		err = init(engine);
>   		if (err)
>   			goto cleanup;
> -
> -		GEM_BUG_ON(!engine->submit_request);
>   	}
>   
>   	return 0;
>   
>   cleanup:
> -	for_each_engine(engine, dev_priv, id) {
> +	for_each_engine(engine, i915, id) {
>   		if (id >= err_id) {
>   			kfree(engine);
> -			dev_priv->engine[id] = NULL;
> +			i915->engine[id] = NULL;
>   		} else {
> -			dev_priv->gt.cleanup_engine(engine);
> +			i915->gt.cleanup_engine(engine);
>   		}
>   	}
>   	return err;
> @@ -559,16 +539,7 @@ static int init_status_page(struct intel_engine_cs *engine)
>   	return ret;
>   }
>   
> -/**
> - * intel_engines_setup_common - setup engine state not requiring hw access
> - * @engine: Engine to setup.
> - *
> - * Initializes @engine@ structure members shared between legacy and execlists
> - * submission modes which do not require hardware access.
> - *
> - * Typically done early in the submission mode specific engine setup stage.
> - */
> -int intel_engine_setup_common(struct intel_engine_cs *engine)
> +static int intel_engine_setup_common(struct intel_engine_cs *engine)
>   {
>   	int err;
>   
> @@ -602,6 +573,52 @@ int intel_engine_setup_common(struct intel_engine_cs *engine)
>   	return err;
>   }
>   
> +/**
> + * intel_engines_setup - setup engine state not requiring hw access
> + * @i915: Device to setup.
> + *
> + * Initializes engine structure members shared between legacy and execlists
> + * submission modes which do not require hardware access.
> + *
> + * Typically done early in the submission mode specific engine setup stage.
> + */
> +int intel_engines_setup(struct drm_i915_private *i915)
> +{
> +	int (*setup)(struct intel_engine_cs *engine);
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	int err;
> +
> +	if (HAS_EXECLISTS(i915))
> +		setup = intel_execlists_submission_setup;
> +	else
> +		setup = intel_ring_submission_setup;
> +
> +	for_each_engine(engine, i915, id) {
> +		err = intel_engine_setup_common(engine);
> +		if (err)
> +			goto cleanup;
> +
> +		err = setup(engine);
> +		if (err)
> +			goto cleanup;
> +
> +		GEM_BUG_ON(!engine->cops);
> +	}
> +
> +	return 0;
> +
> +cleanup:
> +	for_each_engine(engine, i915, id) {
> +		if (engine->cops)
> +			i915->gt.cleanup_engine(engine);
> +		else
> +			kfree(engine);
> +		i915->engine[id] = NULL;
> +	}
> +	return err;
> +}
> +
>   void intel_engines_set_scheduler_caps(struct drm_i915_private *i915)
>   {
>   	static const struct {
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 7a5e6e962e61..d4e28fbb5dcd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1786,8 +1786,8 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
>   	unsigned int i;
>   	int ret;
>   
> -	if (GEM_DEBUG_WARN_ON(engine->id != RCS0))
> -		return -EINVAL;
> +	if (engine->class != RENDER_CLASS)
> +		return 0;
>   
>   	switch (INTEL_GEN(engine->i915)) {
>   	case 11:
> @@ -2423,15 +2423,8 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
>   	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
>   }
>   
> -static int
> -logical_ring_setup(struct intel_engine_cs *engine)
> +int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>   {
> -	int err;
> -
> -	err = intel_engine_setup_common(engine);
> -	if (err)
> -		return err;
> -
>   	/* Intentionally left blank. */
>   	engine->buffer = NULL;
>   
> @@ -2441,10 +2434,16 @@ logical_ring_setup(struct intel_engine_cs *engine)
>   	logical_ring_default_vfuncs(engine);
>   	logical_ring_default_irqs(engine);
>   
> +	if (engine->class == RENDER_CLASS) {
> +		engine->init_context = gen8_init_rcs_context;
> +		engine->emit_flush = gen8_emit_flush_render;
> +		engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
> +	}
> +
>   	return 0;
>   }
>   
> -static int logical_ring_init(struct intel_engine_cs *engine)
> +int intel_execlists_submission_init(struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *i915 = engine->i915;
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
> @@ -2456,6 +2455,15 @@ static int logical_ring_init(struct intel_engine_cs *engine)
>   		return ret;
>   
>   	intel_engine_init_workarounds(engine);
> +	intel_engine_init_whitelist(engine);
> +
> +	if (intel_init_workaround_bb(engine))
> +		/*
> +		 * We continue even if we fail to initialize WA batch
> +		 * because we only expect rare glitches but nothing
> +		 * critical to prevent us from using GPU
> +		 */
> +		DRM_ERROR("WA batch buffer initialization failed\n");
>   
>   	if (HAS_LOGICAL_RING_ELSQ(i915)) {
>   		execlists->submit_reg = i915->uncore.regs +
> @@ -2483,50 +2491,6 @@ static int logical_ring_init(struct intel_engine_cs *engine)
>   	return 0;
>   }
>   
> -int logical_render_ring_init(struct intel_engine_cs *engine)
> -{
> -	int ret;
> -
> -	ret = logical_ring_setup(engine);
> -	if (ret)
> -		return ret;
> -
> -	/* Override some for render ring. */
> -	engine->init_context = gen8_init_rcs_context;
> -	engine->emit_flush = gen8_emit_flush_render;
> -	engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
> -
> -	ret = logical_ring_init(engine);
> -	if (ret)
> -		return ret;
> -
> -	ret = intel_init_workaround_bb(engine);
> -	if (ret) {
> -		/*
> -		 * We continue even if we fail to initialize WA batch
> -		 * because we only expect rare glitches but nothing
> -		 * critical to prevent us from using GPU
> -		 */
> -		DRM_ERROR("WA batch buffer initialization failed: %d\n",
> -			  ret);
> -	}
> -
> -	intel_engine_init_whitelist(engine);
> -
> -	return 0;
> -}
> -
> -int logical_xcs_ring_init(struct intel_engine_cs *engine)
> -{
> -	int err;
> -
> -	err = logical_ring_setup(engine);
> -	if (err)
> -		return err;
> -
> -	return logical_ring_init(engine);
> -}
> -
>   static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
>   {
>   	u32 indirect_ctx_offset;
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
> index a598f2d56de3..a56ee45b9e3c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
> @@ -67,8 +67,9 @@ enum {
>   
>   /* Logical Rings */
>   void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
> -int logical_render_ring_init(struct intel_engine_cs *engine);
> -int logical_xcs_ring_init(struct intel_engine_cs *engine);
> +
> +int intel_execlists_submission_setup(struct intel_engine_cs *engine);
> +int intel_execlists_submission_init(struct intel_engine_cs *engine);
>   
>   /* Logical Ring Contexts */
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> index 748c133b83b4..3204dbb541f6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
> @@ -1523,54 +1523,6 @@ static const struct intel_context_ops ring_context_ops = {
>   	.destroy = ring_context_destroy,
>   };
>   
> -static int intel_init_ring_buffer(struct intel_engine_cs *engine)
> -{
> -	struct i915_timeline *timeline;
> -	struct intel_ring *ring;
> -	int err;
> -
> -	err = intel_engine_setup_common(engine);
> -	if (err)
> -		return err;
> -
> -	timeline = i915_timeline_create(engine->i915, engine->status_page.vma);
> -	if (IS_ERR(timeline)) {
> -		err = PTR_ERR(timeline);
> -		goto err;
> -	}
> -	GEM_BUG_ON(timeline->has_initial_breadcrumb);
> -
> -	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
> -	i915_timeline_put(timeline);
> -	if (IS_ERR(ring)) {
> -		err = PTR_ERR(ring);
> -		goto err;
> -	}
> -
> -	err = intel_ring_pin(ring);
> -	if (err)
> -		goto err_ring;
> -
> -	GEM_BUG_ON(engine->buffer);
> -	engine->buffer = ring;
> -
> -	err = intel_engine_init_common(engine);
> -	if (err)
> -		goto err_unpin;
> -
> -	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
> -
> -	return 0;
> -
> -err_unpin:
> -	intel_ring_unpin(ring);
> -err_ring:
> -	intel_ring_put(ring);
> -err:
> -	intel_engine_cleanup_common(engine);
> -	return err;
> -}
> -
>   void intel_engine_cleanup(struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *dev_priv = engine->i915;
> @@ -2166,24 +2118,6 @@ static int gen6_ring_flush(struct i915_request *rq, u32 mode)
>   	return gen6_flush_dw(rq, mode, MI_INVALIDATE_TLB);
>   }
>   
> -static void intel_ring_init_irq(struct drm_i915_private *dev_priv,
> -				struct intel_engine_cs *engine)
> -{
> -	if (INTEL_GEN(dev_priv) >= 6) {
> -		engine->irq_enable = gen6_irq_enable;
> -		engine->irq_disable = gen6_irq_disable;
> -	} else if (INTEL_GEN(dev_priv) >= 5) {
> -		engine->irq_enable = gen5_irq_enable;
> -		engine->irq_disable = gen5_irq_disable;
> -	} else if (INTEL_GEN(dev_priv) >= 3) {
> -		engine->irq_enable = i9xx_irq_enable;
> -		engine->irq_disable = i9xx_irq_disable;
> -	} else {
> -		engine->irq_enable = i8xx_irq_enable;
> -		engine->irq_disable = i8xx_irq_disable;
> -	}
> -}
> -
>   static void i9xx_set_default_submission(struct intel_engine_cs *engine)
>   {
>   	engine->submit_request = i9xx_submit_request;
> @@ -2199,13 +2133,33 @@ static void gen6_bsd_set_default_submission(struct intel_engine_cs *engine)
>   	engine->submit_request = gen6_bsd_submit_request;
>   }
>   
> -static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
> -				      struct intel_engine_cs *engine)
> +static void setup_irq(struct intel_engine_cs *engine)
> +{
> +	struct drm_i915_private *i915 = engine->i915;
> +
> +	if (INTEL_GEN(i915) >= 6) {
> +		engine->irq_enable = gen6_irq_enable;
> +		engine->irq_disable = gen6_irq_disable;
> +	} else if (INTEL_GEN(i915) >= 5) {
> +		engine->irq_enable = gen5_irq_enable;
> +		engine->irq_disable = gen5_irq_disable;
> +	} else if (INTEL_GEN(i915) >= 3) {
> +		engine->irq_enable = i9xx_irq_enable;
> +		engine->irq_disable = i9xx_irq_disable;
> +	} else {
> +		engine->irq_enable = i8xx_irq_enable;
> +		engine->irq_disable = i8xx_irq_disable;
> +	}
> +}
> +
> +static void setup_xcs(struct intel_engine_cs *engine)
>   {
> +	struct drm_i915_private *i915 = engine->i915;
> +
>   	/* gen8+ are only supported with execlists */
> -	GEM_BUG_ON(INTEL_GEN(dev_priv) >= 8);
> +	GEM_BUG_ON(INTEL_GEN(i915) >= 8);
>   
> -	intel_ring_init_irq(dev_priv, engine);
> +	setup_irq(engine);
>   
>   	engine->resume = xcs_resume;
>   	engine->reset.prepare = reset_prepare;
> @@ -2221,117 +2175,96 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
>   	 * engine->emit_init_breadcrumb().
>   	 */
>   	engine->emit_fini_breadcrumb = i9xx_emit_breadcrumb;
> -	if (IS_GEN(dev_priv, 5))
> +	if (IS_GEN(i915, 5))
>   		engine->emit_fini_breadcrumb = gen5_emit_breadcrumb;
>   
>   	engine->set_default_submission = i9xx_set_default_submission;
>   
> -	if (INTEL_GEN(dev_priv) >= 6)
> +	if (INTEL_GEN(i915) >= 6)
>   		engine->emit_bb_start = gen6_emit_bb_start;
> -	else if (INTEL_GEN(dev_priv) >= 4)
> +	else if (INTEL_GEN(i915) >= 4)
>   		engine->emit_bb_start = i965_emit_bb_start;
> -	else if (IS_I830(dev_priv) || IS_I845G(dev_priv))
> +	else if (IS_I830(i915) || IS_I845G(i915))
>   		engine->emit_bb_start = i830_emit_bb_start;
>   	else
>   		engine->emit_bb_start = i915_emit_bb_start;
>   }
>   
> -int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
> +static void setup_rcs(struct intel_engine_cs *engine)
>   {
> -	struct drm_i915_private *dev_priv = engine->i915;
> -	int ret;
> -
> -	intel_ring_default_vfuncs(dev_priv, engine);
> +	struct drm_i915_private *i915 = engine->i915;
>   
> -	if (HAS_L3_DPF(dev_priv))
> +	if (HAS_L3_DPF(i915))
>   		engine->irq_keep_mask = GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
>   
>   	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
>   
> -	if (INTEL_GEN(dev_priv) >= 7) {
> +	if (INTEL_GEN(i915) >= 7) {
>   		engine->init_context = intel_rcs_ctx_init;
>   		engine->emit_flush = gen7_render_ring_flush;
>   		engine->emit_fini_breadcrumb = gen7_rcs_emit_breadcrumb;
> -	} else if (IS_GEN(dev_priv, 6)) {
> +	} else if (IS_GEN(i915, 6)) {
>   		engine->init_context = intel_rcs_ctx_init;
>   		engine->emit_flush = gen6_render_ring_flush;
>   		engine->emit_fini_breadcrumb = gen6_rcs_emit_breadcrumb;
> -	} else if (IS_GEN(dev_priv, 5)) {
> +	} else if (IS_GEN(i915, 5)) {
>   		engine->emit_flush = gen4_render_ring_flush;
>   	} else {
> -		if (INTEL_GEN(dev_priv) < 4)
> +		if (INTEL_GEN(i915) < 4)
>   			engine->emit_flush = gen2_render_ring_flush;
>   		else
>   			engine->emit_flush = gen4_render_ring_flush;
>   		engine->irq_enable_mask = I915_USER_INTERRUPT;
>   	}
>   
> -	if (IS_HASWELL(dev_priv))
> +	if (IS_HASWELL(i915))
>   		engine->emit_bb_start = hsw_emit_bb_start;
>   
>   	engine->resume = rcs_resume;
> -
> -	ret = intel_init_ring_buffer(engine);
> -	if (ret)
> -		return ret;
> -
> -	return 0;
>   }
>   
> -int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine)
> +static void setup_vcs(struct intel_engine_cs *engine)
>   {
> -	struct drm_i915_private *dev_priv = engine->i915;
> -
> -	intel_ring_default_vfuncs(dev_priv, engine);
> +	struct drm_i915_private *i915 = engine->i915;
>   
> -	if (INTEL_GEN(dev_priv) >= 6) {
> +	if (INTEL_GEN(i915) >= 6) {
>   		/* gen6 bsd needs a special wa for tail updates */
> -		if (IS_GEN(dev_priv, 6))
> +		if (IS_GEN(i915, 6))
>   			engine->set_default_submission = gen6_bsd_set_default_submission;
>   		engine->emit_flush = gen6_bsd_ring_flush;
>   		engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
>   
> -		if (IS_GEN(dev_priv, 6))
> +		if (IS_GEN(i915, 6))
>   			engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
>   		else
>   			engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
>   	} else {
>   		engine->emit_flush = bsd_ring_flush;
> -		if (IS_GEN(dev_priv, 5))
> +		if (IS_GEN(i915, 5))
>   			engine->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
>   		else
>   			engine->irq_enable_mask = I915_BSD_USER_INTERRUPT;
>   	}
> -
> -	return intel_init_ring_buffer(engine);
>   }
>   
> -int intel_init_blt_ring_buffer(struct intel_engine_cs *engine)
> +static void setup_bcs(struct intel_engine_cs *engine)
>   {
> -	struct drm_i915_private *dev_priv = engine->i915;
> -
> -	GEM_BUG_ON(INTEL_GEN(dev_priv) < 6);
> -
> -	intel_ring_default_vfuncs(dev_priv, engine);
> +	struct drm_i915_private *i915 = engine->i915;
>   
>   	engine->emit_flush = gen6_ring_flush;
>   	engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
>   
> -	if (IS_GEN(dev_priv, 6))
> +	if (IS_GEN(i915, 6))
>   		engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
>   	else
>   		engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
> -
> -	return intel_init_ring_buffer(engine);
>   }
>   
> -int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
> +static void setup_vecs(struct intel_engine_cs *engine)
>   {
> -	struct drm_i915_private *dev_priv = engine->i915;
> -
> -	GEM_BUG_ON(INTEL_GEN(dev_priv) < 7);
> +	struct drm_i915_private *i915 = engine->i915;
>   
> -	intel_ring_default_vfuncs(dev_priv, engine);
> +	GEM_BUG_ON(INTEL_GEN(i915) < 7);
>   
>   	engine->emit_flush = gen6_ring_flush;
>   	engine->irq_enable_mask = PM_VEBOX_USER_INTERRUPT;
> @@ -2339,6 +2272,73 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
>   	engine->irq_disable = hsw_vebox_irq_disable;
>   
>   	engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
> +}
> +
> +int intel_ring_submission_setup(struct intel_engine_cs *engine)
> +{
> +	setup_xcs(engine);

It is actually setup_common, no? I think that name would be clearer, since 
we used to have xcs mean !rcs.
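
i.e. just the rename, so the entry point reads (sketch):

int intel_ring_submission_setup(struct intel_engine_cs *engine)
{
	setup_common(engine); /* was setup_xcs(); shared irqs and vfuncs */

	switch (engine->class) {
	case RENDER_CLASS:
		setup_rcs(engine);
		break;
	case VIDEO_DECODE_CLASS:
		setup_vcs(engine);
		break;
	case COPY_ENGINE_CLASS:
		setup_bcs(engine);
		break;
	case VIDEO_ENHANCEMENT_CLASS:
		setup_vecs(engine);
		break;
	default:
		MISSING_CASE(engine->class);
		return -ENODEV;
	}

	return 0;
}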

> +
> +	switch (engine->class) {
> +	case RENDER_CLASS:
> +		setup_rcs(engine);
> +		break;
> +	case VIDEO_DECODE_CLASS:
> +		setup_vcs(engine);
> +		break;
> +	case COPY_ENGINE_CLASS:
> +		setup_bcs(engine);
> +		break;
> +	case VIDEO_ENHANCEMENT_CLASS:
> +		setup_vecs(engine);
> +		break;
> +	default:
> +		MISSING_CASE(engine->class);
> +		return -ENODEV;
> +	}
> +
> +	return 0;
> +}
> +
> +int intel_ring_submission_init(struct intel_engine_cs *engine)
> +{
> +	struct i915_timeline *timeline;
> +	struct intel_ring *ring;
> +	int err;
> +
> +	timeline = i915_timeline_create(engine->i915, engine->status_page.vma);
> +	if (IS_ERR(timeline)) {
> +		err = PTR_ERR(timeline);
> +		goto err;
> +	}
> +	GEM_BUG_ON(timeline->has_initial_breadcrumb);
> +
> +	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
> +	i915_timeline_put(timeline);
> +	if (IS_ERR(ring)) {
> +		err = PTR_ERR(ring);
> +		goto err;
> +	}
> +
> +	err = intel_ring_pin(ring);
> +	if (err)
> +		goto err_ring;
>   
> -	return intel_init_ring_buffer(engine);
> +	GEM_BUG_ON(engine->buffer);
> +	engine->buffer = ring;
> +
> +	err = intel_engine_init_common(engine);
> +	if (err)
> +		goto err_unpin;
> +
> +	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
> +
> +	return 0;
> +
> +err_unpin:
> +	intel_ring_unpin(ring);
> +err_ring:
> +	intel_ring_put(ring);
> +err:
> +	intel_engine_cleanup_common(engine);
> +	return err;
>   }
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 2d6d17ee3601..e9d8174b24e8 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1060,7 +1060,8 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
>   	struct drm_i915_private *i915 = engine->i915;
>   	struct i915_wa_list *w = &engine->whitelist;
>   
> -	GEM_BUG_ON(engine->id != RCS0);
> +	if (engine->class != RENDER_CLASS)
> +		return;
>   
>   	wa_init_start(w, "whitelist");
>   
> diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
> index a79d9909d171..3b672e011cf0 100644
> --- a/drivers/gpu/drm/i915/gt/mock_engine.c
> +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
> @@ -239,7 +239,6 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>   				    int id)
>   {
>   	struct mock_engine *engine;
> -	int err;
>   
>   	GEM_BUG_ON(id >= I915_NUM_ENGINES);
>   
> @@ -265,37 +264,44 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>   	engine->base.reset.finish = mock_reset_finish;
>   	engine->base.cancel_requests = mock_cancel_requests;
>   
> -	if (i915_timeline_init(i915, &engine->base.timeline, NULL))
> -		goto err_free;
> -	i915_timeline_set_subclass(&engine->base.timeline, TIMELINE_ENGINE);
> -
> -	intel_engine_init_breadcrumbs(&engine->base);
> -	intel_engine_init_execlists(&engine->base);
> -	intel_engine_init__pm(&engine->base);
> -
>   	/* fake hw queue */
>   	spin_lock_init(&engine->hw_lock);
>   	timer_setup(&engine->hw_delay, hw_delay_complete, 0);
>   	INIT_LIST_HEAD(&engine->hw_queue);
>   
> -	engine->base.kernel_context =
> -		intel_context_instance(i915->kernel_context, &engine->base);
> -	if (IS_ERR(engine->base.kernel_context))
> +	return &engine->base;
> +}
> +
> +int mock_engine_init(struct intel_engine_cs *engine)
> +{
> +	struct drm_i915_private *i915 = engine->i915;
> +	int err;
> +
> +	intel_engine_init_breadcrumbs(engine);
> +	intel_engine_init_execlists(engine);
> +	intel_engine_init__pm(engine);
> +
> +	if (i915_timeline_init(i915, &engine->timeline, NULL))
>   		goto err_breadcrumbs;
> +	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
> +
> +	engine->kernel_context =
> +		intel_context_instance(i915->kernel_context, engine);
> +	if (IS_ERR(engine->kernel_context))
> +		goto err_timeline;
>   
> -	err = intel_context_pin(engine->base.kernel_context);
> -	intel_context_put(engine->base.kernel_context);
> +	err = intel_context_pin(engine->kernel_context);
> +	intel_context_put(engine->kernel_context);
>   	if (err)
> -		goto err_breadcrumbs;
> +		goto err_timeline;
>   
> -	return &engine->base;
> +	return 0;
>   
> +err_timeline:
> +	i915_timeline_fini(&engine->timeline);
>   err_breadcrumbs:
> -	intel_engine_fini_breadcrumbs(&engine->base);
> -	i915_timeline_fini(&engine->base.timeline);
> -err_free:
> -	kfree(engine);
> -	return NULL;
> +	intel_engine_fini_breadcrumbs(engine);
> +	return -ENOMEM;
>   }
>   
>   void mock_engine_flush(struct intel_engine_cs *engine)
> diff --git a/drivers/gpu/drm/i915/gt/mock_engine.h b/drivers/gpu/drm/i915/gt/mock_engine.h
> index 44b35a85e9d1..3f9b698c49d2 100644
> --- a/drivers/gpu/drm/i915/gt/mock_engine.h
> +++ b/drivers/gpu/drm/i915/gt/mock_engine.h
> @@ -42,6 +42,8 @@ struct mock_engine {
>   struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>   				    const char *name,
>   				    int id);
> +int mock_engine_init(struct intel_engine_cs *engine);
> +
>   void mock_engine_flush(struct intel_engine_cs *engine);
>   void mock_engine_reset(struct intel_engine_cs *engine);
>   void mock_engine_free(struct intel_engine_cs *engine);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a6436c77109a..50266e87c225 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4523,6 +4523,12 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>   		goto err_ggtt;
>   	}
>   
> +	ret = intel_engines_setup(dev_priv);
> +	if (ret) {
> +		GEM_BUG_ON(ret == -EIO);
> +		goto err_unlock;
> +	}
> +
>   	ret = i915_gem_contexts_init(dev_priv);
>   	if (ret) {
>   		GEM_BUG_ON(ret == -EIO);
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> index c072424c6b7c..e4033d0576c4 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> @@ -209,12 +209,16 @@ struct drm_i915_private *mock_gem_device(void)
>   	mock_init_ggtt(i915, &i915->ggtt);
>   
>   	mkwrite_device_info(i915)->engine_mask = BIT(0);
> -	i915->kernel_context = mock_context(i915, NULL);
> -	if (!i915->kernel_context)
> -		goto err_unlock;
>   
>   	i915->engine[RCS0] = mock_engine(i915, "mock", RCS0);
>   	if (!i915->engine[RCS0])
> +		goto err_unlock;
> +
> +	i915->kernel_context = mock_context(i915, NULL);
> +	if (!i915->kernel_context)
> +		goto err_engine;
> +
> +	if (mock_engine_init(i915->engine[RCS0]))
>   		goto err_context;
>   
>   	mutex_unlock(&i915->drm.struct_mutex);
> @@ -225,6 +229,8 @@ struct drm_i915_private *mock_gem_device(void)
>   
>   err_context:
>   	i915_gem_contexts_fini(i915);
> +err_engine:
> +	mock_engine_free(i915->engine[RCS0]);
>   err_unlock:
>   	mutex_unlock(&i915->drm.struct_mutex);
>   	i915_timelines_fini(i915);
> 

After a couple of passes back and forth to triple-check things, I did 
not spot any mistakes in the code split and movement, so:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock()
  2019-04-10 13:04       ` Chris Wilson
@ 2019-04-10 14:53         ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 14:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 10/04/2019 14:04, Chris Wilson wrote:
> Quoting Chris Wilson (2019-04-10 13:49:32)
>> Quoting Tvrtko Ursulin (2019-04-10 13:45:05)
>>>
>>> On 08/04/2019 10:17, Chris Wilson wrote:
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
>>>> index da342e9a8c2e..4f8ef45e6344 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_context.h
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
>>>> @@ -39,9 +39,17 @@ intel_context_lookup(struct i915_gem_context *ctx,
>>>>     * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
>>>>     * intel_context_is_pinned() remains stable.
>>>>     */
>>>> -struct intel_context *
>>>> -intel_context_pin_lock(struct i915_gem_context *ctx,
>>>> -                    struct intel_engine_cs *engine);
>>>> +static inline int intel_context_pin_lock(struct intel_context *ce)
>>>> +     __acquires(ce->pin_mutex)
>>>> +{
>>>> +     return mutex_lock_interruptible(&ce->pin_mutex);
>>>> +}
>>>
>>> So if this is not getting the kref any more, why are the other callers
>>> okay? In previous patch(es) some certainly looked like they assumed pin
>>> implies a reference.
>>
>> This is not 'pin' but locking the pinned state.
>> Maybe lock_pin and unlock_pin? (I was going to say lock_pin_count, but
>> it strictly does not prevent updates to pin_count, merely the transition
>> to/from .pin_count == 0.)

I honestly cannot figure out now what I thought the problem was. Perhaps 
I confused it with intel_context_pin.

There is so much churn in intel_context.c, both recent and in this 
series, that I struggle to keep up with it...

> Went with intel_context_lock_pinned() and intel_context_unlock_pinned().

...so this is also okay. Perhaps add an assert that pin count is above 
zero as well.
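
Something along these lines, say (illustrative only, and assuming 
callers take this lock only on contexts they expect to be pinned):

static inline int intel_context_lock_pinned(struct intel_context *ce)
	__acquires(ce->pin_mutex)
{
	int err;

	err = mutex_lock_interruptible(&ce->pin_mutex);
	if (err)
		return err;

	/* the suggested assert: the pinned status is now stable */
	GEM_BUG_ON(!intel_context_is_pinned(ce));
	return 0;
}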

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-04-08  9:17 ` [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
@ 2019-04-10 15:32   ` Tvrtko Ursulin
  2019-04-10 16:18     ` Chris Wilson
  2019-04-11 13:33   ` [PATCH] " Chris Wilson
  1 sibling, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-10 15:32 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> We switched to a tree of per-engine HW contexts to accommodate the
> introduction of virtual engines. However, we plan to also support
> multiple instances of the same engine within the GEM context, defeating
> our use of the engine as a key for looking up the HW context. Just
> allocate a logical per-engine instance and always use an index into
> ctx->engines[]. Later on, this ctx->engines[] may be replaced by a
> user-specified map.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c       | 97 ++-----------------
>   drivers/gpu/drm/i915/gt/intel_context.h       | 25 +----
>   drivers/gpu/drm/i915/gt/intel_context_types.h |  2 -
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  2 +-
>   drivers/gpu/drm/i915/gt/mock_engine.c         |  3 +-
>   drivers/gpu/drm/i915/gvt/scheduler.c          |  2 +-
>   drivers/gpu/drm/i915/i915_gem.c               | 29 +++---
>   drivers/gpu/drm/i915/i915_gem_context.c       | 92 +++++++++++++++---
>   drivers/gpu/drm/i915/i915_gem_context.h       | 42 ++++++++
>   drivers/gpu/drm/i915/i915_gem_context_types.h | 26 ++++-
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c    | 70 ++++++-------
>   drivers/gpu/drm/i915/i915_perf.c              | 81 +++++++++-------
>   drivers/gpu/drm/i915/i915_request.c           |  2 +-
>   drivers/gpu/drm/i915/intel_guc_submission.c   | 24 +++--
>   .../gpu/drm/i915/selftests/i915_gem_context.c |  2 +-
>   drivers/gpu/drm/i915/selftests/mock_context.c | 14 ++-
>   16 files changed, 283 insertions(+), 230 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index 46bf8d8fb7e4..803ac9a1dd15 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -17,7 +17,7 @@ static struct i915_global_context {
>   	struct kmem_cache *slab_ce;
>   } global;
>   
> -struct intel_context *intel_context_alloc(void)
> +static struct intel_context *intel_context_alloc(void)
>   {
>   	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
>   }
> @@ -28,104 +28,17 @@ void intel_context_free(struct intel_context *ce)
>   }
>   
>   struct intel_context *
> -intel_context_lookup(struct i915_gem_context *ctx,
> +intel_context_create(struct i915_gem_context *ctx,
>   		     struct intel_engine_cs *engine)
>   {
> -	struct intel_context *ce = NULL;
> -	struct rb_node *p;
> -
> -	spin_lock(&ctx->hw_contexts_lock);
> -	p = ctx->hw_contexts.rb_node;
> -	while (p) {
> -		struct intel_context *this =
> -			rb_entry(p, struct intel_context, node);
> -
> -		if (this->engine == engine) {
> -			GEM_BUG_ON(this->gem_context != ctx);
> -			ce = this;
> -			break;
> -		}
> -
> -		if (this->engine < engine)
> -			p = p->rb_right;
> -		else
> -			p = p->rb_left;
> -	}
> -	spin_unlock(&ctx->hw_contexts_lock);
> -
> -	return ce;
> -}
> -
> -struct intel_context *
> -__intel_context_insert(struct i915_gem_context *ctx,
> -		       struct intel_engine_cs *engine,
> -		       struct intel_context *ce)
> -{
> -	struct rb_node **p, *parent;
> -	int err = 0;
> -
> -	spin_lock(&ctx->hw_contexts_lock);
> -
> -	parent = NULL;
> -	p = &ctx->hw_contexts.rb_node;
> -	while (*p) {
> -		struct intel_context *this;
> -
> -		parent = *p;
> -		this = rb_entry(parent, struct intel_context, node);
> -
> -		if (this->engine == engine) {
> -			err = -EEXIST;
> -			ce = this;
> -			break;
> -		}
> -
> -		if (this->engine < engine)
> -			p = &parent->rb_right;
> -		else
> -			p = &parent->rb_left;
> -	}
> -	if (!err) {
> -		rb_link_node(&ce->node, parent, p);
> -		rb_insert_color(&ce->node, &ctx->hw_contexts);
> -	}
> -
> -	spin_unlock(&ctx->hw_contexts_lock);
> -
> -	return ce;
> -}
> -
> -void __intel_context_remove(struct intel_context *ce)
> -{
> -	struct i915_gem_context *ctx = ce->gem_context;
> -
> -	spin_lock(&ctx->hw_contexts_lock);
> -	rb_erase(&ce->node, &ctx->hw_contexts);
> -	spin_unlock(&ctx->hw_contexts_lock);
> -}
> -
> -struct intel_context *
> -intel_context_instance(struct i915_gem_context *ctx,
> -		       struct intel_engine_cs *engine)
> -{
> -	struct intel_context *ce, *pos;
> -
> -	ce = intel_context_lookup(ctx, engine);
> -	if (likely(ce))
> -		return intel_context_get(ce);
> +	struct intel_context *ce;
>   
>   	ce = intel_context_alloc();
>   	if (!ce)
>   		return ERR_PTR(-ENOMEM);
>   
>   	intel_context_init(ce, ctx, engine);
> -
> -	pos = __intel_context_insert(ctx, engine, ce);
> -	if (unlikely(pos != ce)) /* Beaten! Use their HW context instead */
> -		intel_context_free(ce);
> -
> -	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
> -	return intel_context_get(pos);
> +	return ce;
>   }
>   
>   int __intel_context_do_pin(struct intel_context *ce)
> @@ -204,6 +117,8 @@ intel_context_init(struct intel_context *ce,
>   		   struct i915_gem_context *ctx,
>   		   struct intel_engine_cs *engine)
>   {
> +	GEM_BUG_ON(!engine->cops);
> +
>   	kref_init(&ce->ref);
>   
>   	ce->gem_context = ctx;
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 4f8ef45e6344..567a415be513 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -12,24 +12,16 @@
>   #include "intel_context_types.h"
>   #include "intel_engine_types.h"
>   
> -struct intel_context *intel_context_alloc(void);
> -void intel_context_free(struct intel_context *ce);
> -
>   void intel_context_init(struct intel_context *ce,
>   			struct i915_gem_context *ctx,
>   			struct intel_engine_cs *engine);
>   
> -/**
> - * intel_context_lookup - Find the matching HW context for this (ctx, engine)
> - * @ctx - the parent GEM context
> - * @engine - the target HW engine
> - *
> - * May return NULL if the HW context hasn't been instantiated (i.e. unused).
> - */
>   struct intel_context *
> -intel_context_lookup(struct i915_gem_context *ctx,
> +intel_context_create(struct i915_gem_context *ctx,
>   		     struct intel_engine_cs *engine);
>   
> +void intel_context_free(struct intel_context *ce);
> +
>   /**
>    * intel_context_pin_lock - Stablises the 'pinned' status of the HW context
>    * @ctx - the parent GEM context
> @@ -59,17 +51,6 @@ intel_context_is_pinned(struct intel_context *ce)
>   
>   void intel_context_pin_unlock(struct intel_context *ce);
>   
> -struct intel_context *
> -__intel_context_insert(struct i915_gem_context *ctx,
> -		       struct intel_engine_cs *engine,
> -		       struct intel_context *ce);
> -void
> -__intel_context_remove(struct intel_context *ce);
> -
> -struct intel_context *
> -intel_context_instance(struct i915_gem_context *ctx,
> -		       struct intel_engine_cs *engine);
> -
>   int __intel_context_do_pin(struct intel_context *ce);
>   
>   static inline int intel_context_pin(struct intel_context *ce)
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index f02d27734e3b..3579c2708321 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -10,7 +10,6 @@
>   #include <linux/kref.h>
>   #include <linux/list.h>
>   #include <linux/mutex.h>
> -#include <linux/rbtree.h>
>   #include <linux/types.h>
>   
>   #include "i915_active_types.h"
> @@ -61,7 +60,6 @@ struct intel_context {
>   	struct i915_active_request active_tracker;
>   
>   	const struct intel_context_ops *ops;
> -	struct rb_node node;
>   
>   	/** sseu: Control eu/slice partitioning */
>   	struct intel_sseu sseu;
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 3f794bc71958..0df3c5238c04 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -716,7 +716,7 @@ static int pin_context(struct i915_gem_context *ctx,
>   	struct intel_context *ce;
>   	int err;
>   
> -	ce = intel_context_instance(ctx, engine);
> +	ce = i915_gem_context_get_engine(ctx, engine->id);

Wouldn't staying with intel_context_instance help to reduce the churn?

>   	if (IS_ERR(ce))
>   		return PTR_ERR(ce);
>   
> diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
> index 3b672e011cf0..87e538825fee 100644
> --- a/drivers/gpu/drm/i915/gt/mock_engine.c
> +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
> @@ -23,6 +23,7 @@
>    */
>   
>   #include "i915_drv.h"
> +#include "i915_gem_context.h"
>   #include "intel_context.h"
>   #include "intel_engine_pm.h"
>   
> @@ -286,7 +287,7 @@ int mock_engine_init(struct intel_engine_cs *engine)
>   	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
>   
>   	engine->kernel_context =
> -		intel_context_instance(i915->kernel_context, engine);
> +		i915_gem_context_get_engine(i915->kernel_context, engine->id);
>   	if (IS_ERR(engine->kernel_context))
>   		goto err_timeline;
>   
> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
> index 606fc2713240..8b6574e1b495 100644
> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
> @@ -1188,7 +1188,7 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
>   		INIT_LIST_HEAD(&s->workload_q_head[i]);
>   		s->shadow[i] = ERR_PTR(-EINVAL);
>   
> -		ce = intel_context_instance(ctx, engine);
> +		ce = i915_gem_context_get_engine(ctx, i);
>   		if (IS_ERR(ce)) {
>   			ret = PTR_ERR(ce);
>   			goto out_shadow_ctx;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 50266e87c225..21b4a04c424b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4312,8 +4312,9 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
>   
>   static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   {
> -	struct i915_gem_context *ctx;
>   	struct intel_engine_cs *engine;
> +	struct i915_gem_context *ctx;
> +	struct i915_gem_engines *e;
>   	enum intel_engine_id id;
>   	int err = 0;
>   
> @@ -4330,18 +4331,26 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	if (IS_ERR(ctx))
>   		return PTR_ERR(ctx);
>   
> +	e = i915_gem_context_engine_list_lock(ctx);
> +
>   	for_each_engine(engine, i915, id) {

Do we need an i915 version of for_each_context_engine? If all call 
sites will be doing this "lock ctx" -> "walk physical engines" -> 
"lookup in context" dance, then it seems a bit disconnected.

> +		struct intel_context *ce = e->engines[id];

How will indexing by engine->id work for the engine map?

>   		struct i915_request *rq;
>   
> -		rq = i915_request_alloc(engine, ctx);
> +		err = intel_context_pin(ce);
> +		if (err)
> +			goto err_active;
> +
> +		rq = i915_request_create(ce);
> +		intel_context_unpin(ce);

Kind of verbose, no? Do you want to add some 
middle-ground-between-request-alloc-and-create helper?

>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
> -			goto out_ctx;
> +			goto err_active;
>   		}
>   
>   		err = 0;
> -		if (engine->init_context)
> -			err = engine->init_context(rq);
> +		if (rq->engine->init_context)
> +			err = rq->engine->init_context(rq);
>   
>   		i915_request_add(rq);
>   		if (err)
> @@ -4355,15 +4364,10 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	}
>   
>   	for_each_engine(engine, i915, id) {
> -		struct intel_context *ce;
> -		struct i915_vma *state;
> +		struct intel_context *ce = e->engines[id];
> +		struct i915_vma *state = ce->state;
>   		void *vaddr;
>   
> -		ce = intel_context_lookup(ctx, engine);
> -		if (!ce)
> -			continue;
> -
> -		state = ce->state;
>   		if (!state)
>   			continue;
>   
> @@ -4419,6 +4423,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	}
>   
>   out_ctx:
> +	i915_gem_context_engine_list_unlock(ctx);
>   	i915_gem_context_set_closed(ctx);
>   	i915_gem_context_put(ctx);
>   	return err;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 1e84fe0a4c55..517816558c85 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -150,7 +150,7 @@ lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
>   	if (!engine)
>   		return ERR_PTR(-EINVAL);
>   
> -	return intel_context_instance(ctx, engine);
> +	return i915_gem_context_get_engine(ctx, engine->id);
>   }
>   
>   static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
> @@ -242,10 +242,54 @@ static void release_hw_id(struct i915_gem_context *ctx)
>   	mutex_unlock(&i915->contexts.mutex);
>   }
>   
> -static void i915_gem_context_free(struct i915_gem_context *ctx)
> +static void free_engines(struct i915_gem_engines *e)
> +{
> +	unsigned int n;
> +
> +	for (n = 0; n < e->num_engines; n++) {
> +		if (!e->engines[n])
> +			continue;
> +
> +		intel_context_put(e->engines[n]);
> +	}
> +	kfree(e);
> +}
> +
> +static void free_engines_n(struct i915_gem_engines *e, unsigned long n)
>   {
> -	struct intel_context *it, *n;
> +	e->num_engines = n;
> +	free_engines(e);
> +}
> +
> +static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
> +{
> +	struct intel_engine_cs *engine;
> +	struct i915_gem_engines *e;
> +	enum intel_engine_id id;
> +
> +	e = kzalloc(struct_size(e, engines, I915_NUM_ENGINES), GFP_KERNEL);
> +	if (!e)
> +		return ERR_PTR(-ENOMEM);
> +
> +	e->i915 = ctx->i915;
> +	for_each_engine(engine, ctx->i915, id) {
> +		struct intel_context *ce;
>   
> +		ce = intel_context_create(ctx, engine);
> +		if (IS_ERR(ce)) {
> +			free_engines_n(e, id);

I dislike piggy-backing n into e. How about:

static void __free_engines(struct i915_gem_engines *e, unsigned int n)
{
	while (n--) {
		if (e->engines[n])
			intel_context_put(e->engines[n]);
	}
	kfree(e);
}

static void free_engines(struct i915_gem_engines *e)
{
	__free_engines(e, e->num_engines);
}

?

Or you could even do e->num_engines++ in the above loop and just have 
one free_engines.

> +			return ERR_CAST(ce);
> +		}
> +
> +		e->engines[id] = ce;

Each context would have a sparse array of engines on most platforms. 
Would it be workable to instead create a compact array per context, and 
just have a per-device translation table of idx to engine->id? Or vice 
versa; I can't figure out right off the bat which one you would need.

I guess in this scheme you would need some translation as well to 
support customised engine maps... I will see if I spot that later in 
the patch.

> +	}
> +	e->num_engines = id;
> +
> +	return e;
> +}
> +
> +static void i915_gem_context_free(struct i915_gem_context *ctx)
> +{
>   	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
>   	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
>   	GEM_BUG_ON(!list_empty(&ctx->active_engines));
> @@ -253,8 +297,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
>   	release_hw_id(ctx);
>   	i915_ppgtt_put(ctx->ppgtt);
>   
> -	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
> -		intel_context_put(it);
> +	free_engines(rcu_access_pointer(ctx->engines));
> +	mutex_destroy(&ctx->engines_mutex);
>   
>   	if (ctx->timeline)
>   		i915_timeline_put(ctx->timeline);
> @@ -363,6 +407,8 @@ static struct i915_gem_context *
>   __create_context(struct drm_i915_private *dev_priv)
>   {
>   	struct i915_gem_context *ctx;
> +	struct i915_gem_engines *e;
> +	int err;
>   	int i;
>   
>   	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> @@ -376,8 +422,13 @@ __create_context(struct drm_i915_private *dev_priv)
>   	INIT_LIST_HEAD(&ctx->active_engines);
>   	mutex_init(&ctx->mutex);
>   
> -	ctx->hw_contexts = RB_ROOT;
> -	spin_lock_init(&ctx->hw_contexts_lock);
> +	mutex_init(&ctx->engines_mutex);
> +	e = default_engines(ctx);
> +	if (IS_ERR(e)) {
> +		err = PTR_ERR(e);
> +		goto err_free;
> +	}
> +	RCU_INIT_POINTER(ctx->engines, e);
>   
>   	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
>   	INIT_LIST_HEAD(&ctx->handles_list);
> @@ -399,6 +450,10 @@ __create_context(struct drm_i915_private *dev_priv)
>   		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
>   
>   	return ctx;
> +
> +err_free:
> +	kfree(ctx);
> +	return ERR_PTR(err);
>   }
>   
>   static struct i915_hw_ppgtt *
> @@ -862,7 +917,8 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   {
>   	struct drm_i915_private *i915 = ctx->i915;
>   	struct context_barrier_task *cb;
> -	struct intel_context *ce, *next;
> +	struct i915_gem_engines *e;
> +	unsigned int i;
>   	int err = 0;
>   
>   	lockdep_assert_held(&i915->drm.struct_mutex);
> @@ -875,20 +931,29 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   	i915_active_init(i915, &cb->base, cb_retire);
>   	i915_active_acquire(&cb->base);
>   
> -	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
> -		struct intel_engine_cs *engine = ce->engine;
> +	e = i915_gem_context_engine_list_lock(ctx);
> +	for (i = 0; i < e->num_engines; i++) {
> +		struct intel_context *ce = e->engines[i];
>   		struct i915_request *rq;
>   
> -		if (!(engine->mask & engines))
> +		if (!ce || !(ce->engine->mask & engines))

Definitely looks like a case for for_each_context_engine.

> +			continue;
> +
> +		if (!intel_context_is_pinned(ce))
>   			continue;
>   
>   		if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
> -				       engine->mask)) {
> +				       ce->engine->mask)) {
>   			err = -ENXIO;
>   			break;
>   		}
>   
> -		rq = i915_request_alloc(engine, ctx);
> +		err = intel_context_pin(ce);
> +		if (err)
> +			break;
> +
> +		rq = i915_request_create(ce);
> +		intel_context_unpin(ce);

Yep, as mentioned somewhere above. I definitely think another helper 
would help codebase readability, even if called unimaginatively 
__i915_request_add.

>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			break;
> @@ -904,6 +969,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>   		if (err)
>   			break;
>   	}
> +	i915_gem_context_engine_list_unlock(ctx);
>   
>   	cb->task = err ? NULL : task; /* caller needs to unwind instead */
>   	cb->data = data;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 5a8e080499fb..d94fc4c1907c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -176,6 +176,48 @@ static inline void i915_gem_context_put(struct i915_gem_context *ctx)
>   	kref_put(&ctx->ref, i915_gem_context_release);
>   }
>   
> +static inline struct i915_gem_engines *
> +i915_gem_context_engine_list(struct i915_gem_context *ctx)
> +{
> +	return rcu_dereference_protected(ctx->engines,
> +					 lockdep_is_held(&ctx->engines_mutex));
> +}
> +
> +static inline struct i915_gem_engines *
> +i915_gem_context_engine_list_lock(struct i915_gem_context *ctx)
> +	__acquires(&ctx->engines_mutex)
> +{
> +	mutex_lock(&ctx->engines_mutex);
> +	return i915_gem_context_engine_list(ctx);
> +}
> +
> +static inline void
> +i915_gem_context_engine_list_unlock(struct i915_gem_context *ctx)
> +	__releases(&ctx->engines_mutex)
> +{
> +	mutex_unlock(&ctx->engines_mutex);
> +}
> +
> +static inline struct intel_context *
> +i915_gem_context_lookup_engine(struct i915_gem_context *ctx, unsigned long idx)
> +{
> +	return i915_gem_context_engine_list(ctx)->engines[idx];
> +}
> +
> +static inline struct intel_context *
> +i915_gem_context_get_engine(struct i915_gem_context *ctx, unsigned long idx)
> +{
> +	struct intel_context *ce = ERR_PTR(-EINVAL);
> +
> +	rcu_read_lock(); {
> +		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
> +		if (likely(idx < e->num_engines && e->engines[idx]))
> +			ce = intel_context_get(e->engines[idx]);
> +	} rcu_read_unlock();
> +
> +	return ce;
> +}
> +
>   struct i915_lut_handle *i915_lut_handle_alloc(void);
>   void i915_lut_handle_free(struct i915_lut_handle *lut);
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
> index d282a6ab3b9f..1610a74b0283 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
> @@ -29,6 +29,13 @@ struct i915_hw_ppgtt;
>   struct i915_timeline;
>   struct intel_ring;
>   
> +struct i915_gem_engines {
> +	struct rcu_work rcu;
> +	struct drm_i915_private *i915;
> +	unsigned long num_engines;

unsigned int?

> +	struct intel_context *engines[];
> +};
> +
>   /**
>    * struct i915_gem_context - client state
>    *
> @@ -42,6 +49,21 @@ struct i915_gem_context {
>   	/** file_priv: owning file descriptor */
>   	struct drm_i915_file_private *file_priv;
>   
> +	/**
> +	 * @engines: User defined engines for this context
> +	 *
> +	 * NULL means to use legacy definitions (including random meaning of
> +	 * I915_EXEC_BSD with I915_EXEC_BSD_SELECTOR overrides).
> +	 *
> +	 * If defined, execbuf uses the I915_EXEC_MASK as an index into the
> +	 * array, and various other uAPIs gain the ability to look up an
> +	 * index from this array to select an engine to operate on.
> +	 *
> +	 * User defined by I915_CONTEXT_PARAM_ENGINE.
> +	 */
> +	struct i915_gem_engines __rcu *engines;
> +	struct mutex engines_mutex; /* guards writes to engines */
> +
>   	struct i915_timeline *timeline;
>   
>   	/**
> @@ -134,10 +156,6 @@ struct i915_gem_context {
>   
>   	struct i915_sched_attr sched;
>   
> -	/** hw_contexts: per-engine logical HW state */
> -	struct rb_root hw_contexts;
> -	spinlock_t hw_contexts_lock;
> -
>   	/** ring_size: size for allocating the per-engine ring buffer */
>   	u32 ring_size;
>   	/** desc_template: invariant fields for the HW context descriptor */
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index c700cbc2f594..941192c60832 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2077,9 +2077,7 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
>   	return file_priv->bsd_engine;
>   }
>   
> -#define I915_USER_RINGS (4)
> -
> -static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
> +static const enum intel_engine_id user_ring_map[] = {
>   	[I915_EXEC_DEFAULT]	= RCS0,
>   	[I915_EXEC_RENDER]	= RCS0,
>   	[I915_EXEC_BLT]		= BCS0,
> @@ -2087,10 +2085,8 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
>   	[I915_EXEC_VEBOX]	= VECS0
>   };
>   
> -static int eb_pin_context(struct i915_execbuffer *eb,
> -			  struct intel_engine_cs *engine)
> +static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce)
>   {
> -	struct intel_context *ce;
>   	int err;
>   
>   	/*
> @@ -2101,21 +2097,16 @@ static int eb_pin_context(struct i915_execbuffer *eb,
>   	if (err)
>   		return err;
>   
> -	ce = intel_context_instance(eb->gem_context, engine);
> -	if (IS_ERR(ce))
> -		return PTR_ERR(ce);
> -
>   	/*
>   	 * Pinning the contexts may generate requests in order to acquire
>   	 * GGTT space, so do this first before we reserve a seqno for
>   	 * ourselves.
>   	 */
>   	err = intel_context_pin(ce);
> -	intel_context_put(ce);
>   	if (err)
>   		return err;
>   
> -	eb->engine = engine;
> +	eb->engine = ce->engine;
>   	eb->context = ce;
>   	return 0;
>   }
> @@ -2125,24 +2116,18 @@ static void eb_unpin_context(struct i915_execbuffer *eb)
>   	intel_context_unpin(eb->context);
>   }
>   
> -static int
> -eb_select_engine(struct i915_execbuffer *eb,
> -		 struct drm_file *file,
> -		 struct drm_i915_gem_execbuffer2 *args)
> +static unsigned int
> +eb_select_legacy_ring(struct i915_execbuffer *eb,
> +		      struct drm_file *file,
> +		      struct drm_i915_gem_execbuffer2 *args)
>   {
>   	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
> -	struct intel_engine_cs *engine;
> -
> -	if (user_ring_id > I915_USER_RINGS) {
> -		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
> -		return -EINVAL;
> -	}
>   
> -	if ((user_ring_id != I915_EXEC_BSD) &&
> -	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
> +	if (user_ring_id != I915_EXEC_BSD &&
> +	    (args->flags & I915_EXEC_BSD_MASK)) {
>   		DRM_DEBUG("execbuf with non bsd ring but with invalid "
>   			  "bsd dispatch flags: %d\n", (int)(args->flags));
> -		return -EINVAL;
> +		return -1;
>   	}
>   
>   	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
> @@ -2157,20 +2142,39 @@ eb_select_engine(struct i915_execbuffer *eb,
>   		} else {
>   			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
>   				  bsd_idx);
> -			return -EINVAL;
> +			return -1;
>   		}
>   
> -		engine = eb->i915->engine[_VCS(bsd_idx)];
> -	} else {
> -		engine = eb->i915->engine[user_ring_map[user_ring_id]];
> +		return _VCS(bsd_idx);
>   	}
>   
> -	if (!engine) {
> -		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
> -		return -EINVAL;
> +	if (user_ring_id >= ARRAY_SIZE(user_ring_map)) {
> +		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
> +		return -1;
>   	}
>   
> -	return eb_pin_context(eb, engine);
> +	return user_ring_map[user_ring_id];
> +}
> +
> +static int
> +eb_select_engine(struct i915_execbuffer *eb,
> +		 struct drm_file *file,
> +		 struct drm_i915_gem_execbuffer2 *args)
> +{
> +	struct intel_context *ce;
> +	unsigned int idx;
> +	int err;
> +
> +	idx = eb_select_legacy_ring(eb, file, args);
> +
> +	ce = i915_gem_context_get_engine(eb->gem_context, idx);
> +	if (IS_ERR(ce))
> +		return PTR_ERR(ce);
> +
> +	err = eb_pin_context(eb, ce);
> +	intel_context_put(ce);
> +
> +	return err;
>   }
>   
>   static void
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index afaeabe5e531..304e597e19e6 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c

I leave the rest for a second pass.

Regards,

Tvrtko

> @@ -1203,35 +1203,37 @@ static int i915_oa_read(struct i915_perf_stream *stream,
>   static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
>   					    struct i915_gem_context *ctx)
>   {
> -	struct intel_engine_cs *engine = i915->engine[RCS0];
> -	struct intel_context *ce;
> +	struct i915_gem_engines *e;
> +	unsigned int i;
>   	int err;
>   
> -	ce = intel_context_instance(ctx, engine);
> -	if (IS_ERR(ce))
> -		return ce;
> -
>   	err = i915_mutex_lock_interruptible(&i915->drm);
> -	if (err) {
> -		intel_context_put(ce);
> +	if (err)
>   		return ERR_PTR(err);
> -	}
>   
> -	/*
> -	 * As the ID is the gtt offset of the context's vma we
> -	 * pin the vma to ensure the ID remains fixed.
> -	 *
> -	 * NB: implied RCS engine...
> -	 */
> -	err = intel_context_pin(ce);
> +	e = i915_gem_context_engine_list_lock(ctx);
> +	for (i = 0; i < e->num_engines; i++) {
> +		struct intel_context *ce = e->engines[i];
> +
> +		if (ce->engine->class != RENDER_CLASS)
> +			continue;
> +
> +		/*
> +		 * As the ID is the gtt offset of the context's vma we
> +		 * pin the vma to ensure the ID remains fixed.
> +		 */
> +		err = intel_context_pin(ce);
> +		if (err == 0) {
> +			i915->perf.oa.pinned_ctx = ce;
> +			break;
> +		}
> +	}
> +	i915_gem_context_engine_list_unlock(ctx);
>   	mutex_unlock(&i915->drm.struct_mutex);
> -	intel_context_put(ce);
>   	if (err)
>   		return ERR_PTR(err);
>   
> -	i915->perf.oa.pinned_ctx = ce;
> -
> -	return ce;
> +	return i915->perf.oa.pinned_ctx;
>   }
>   
>   /**
> @@ -1717,10 +1719,10 @@ gen8_update_reg_state_unlocked(struct intel_context *ce,
>   static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   				       const struct i915_oa_config *oa_config)
>   {
> -	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
>   	unsigned int map_type = i915_coherent_map_type(dev_priv);
>   	struct i915_gem_context *ctx;
>   	struct i915_request *rq;
> +	unsigned int i;
>   	int ret;
>   
>   	lockdep_assert_held(&dev_priv->drm.struct_mutex);
> @@ -1746,30 +1748,41 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   
>   	/* Update all contexts now that we've stalled the submission. */
>   	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> -		struct intel_context *ce = intel_context_lookup(ctx, engine);
> -		u32 *regs;
> +		struct i915_gem_engines *e =
> +			i915_gem_context_engine_list_lock(ctx);
>   
> -		/* OA settings will be set upon first use */
> -		if (!ce || !ce->state)
> -			continue;
> +		for (i = 0; i < e->num_engines; i++) {
> +			struct intel_context *ce = e->engines[i];
> +			u32 *regs;
> +
> +			if (!ce || ce->engine->class != RENDER_CLASS)
> +				continue;
> +
> +			/* OA settings will be set upon first use */
> +			if (!ce->state)
> +				continue;
>   
> -		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
> -		if (IS_ERR(regs))
> -			return PTR_ERR(regs);
> +			regs = i915_gem_object_pin_map(ce->state->obj,
> +						       map_type);
> +			if (IS_ERR(regs))
> +				return PTR_ERR(regs);
>   
> -		ce->state->obj->mm.dirty = true;
> -		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
> +			ce->state->obj->mm.dirty = true;
> +			regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
>   
> -		gen8_update_reg_state_unlocked(ce, regs, oa_config);
> +			gen8_update_reg_state_unlocked(ce, regs, oa_config);
> +
> +			i915_gem_object_unpin_map(ce->state->obj);
> +		}
>   
> -		i915_gem_object_unpin_map(ce->state->obj);
> +		i915_gem_context_engine_list_unlock(ctx);
>   	}
>   
>   	/*
>   	 * Apply the configuration by doing one context restore of the edited
>   	 * context image.
>   	 */
> -	rq = i915_request_create(engine->kernel_context);
> +	rq = i915_request_create(dev_priv->engine[RCS0]->kernel_context);
>   	if (IS_ERR(rq))
>   		return PTR_ERR(rq);
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 8abd891d9287..5e52c1623c17 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -767,7 +767,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 * GGTT space, so do this first before we reserve a seqno for
>   	 * ourselves.
>   	 */
> -	ce = intel_context_instance(ctx, engine);
> +	ce = i915_gem_context_get_engine(ctx, engine->id);
>   	if (IS_ERR(ce))
>   		return ERR_CAST(ce);
>   
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 7fbfcb3d63e0..ae230b38138a 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -364,11 +364,10 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
>   static void guc_stage_desc_init(struct intel_guc_client *client)
>   {
>   	struct intel_guc *guc = client->guc;
> -	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> -	struct intel_engine_cs *engine;
>   	struct i915_gem_context *ctx = client->owner;
> +	struct i915_gem_engines *e;
>   	struct guc_stage_desc *desc;
> -	unsigned int tmp;
> +	unsigned int i;
>   	u32 gfx_addr;
>   
>   	desc = __get_stage_desc(client);
> @@ -382,10 +381,13 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
>   	desc->priority = client->priority;
>   	desc->db_id = client->doorbell_id;
>   
> -	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
> -		struct intel_context *ce = intel_context_lookup(ctx, engine);
> -		u32 guc_engine_id = engine->guc_id;
> -		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
> +	e = i915_gem_context_engine_list_lock(ctx);
> +	for (i = 0; i < e->num_engines; i++) {
> +		struct intel_context *ce = e->engines[i];
> +		struct guc_execlist_context *lrc;
> +
> +		if (!ce || !(ce->engine->mask & client->engines))
> +			continue;
>   
>   		/* TODO: We have a design issue to be solved here. Only when we
>   		 * receive the first batch, we know which engine is used by the
> @@ -394,7 +396,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
>   		 * for now who owns a GuC client. But for future owner of GuC
>   		 * client, need to make sure lrc is pinned prior to enter here.
>   		 */
> -		if (!ce || !ce->state)
> +		if (!ce->state)
>   			break;	/* XXX: continue? */
>   
>   		/*
> @@ -404,6 +406,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
>   		 * Instead, the GuC uses the LRCA of the user mode context (see
>   		 * guc_add_request below).
>   		 */
> +		lrc = &desc->lrc[ce->engine->guc_id];
>   		lrc->context_desc = lower_32_bits(ce->lrc_desc);
>   
>   		/* The state page is after PPHWSP */
> @@ -414,15 +417,16 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
>   		 * here. In proxy submission, it wants the stage id
>   		 */
>   		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
> -				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
> +				(ce->engine->guc_id << GUC_ELC_ENGINE_OFFSET);
>   
>   		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
>   		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
>   		lrc->ring_next_free_location = lrc->ring_begin;
>   		lrc->ring_current_tail_pointer_value = 0;
>   
> -		desc->engines_used |= (1 << guc_engine_id);
> +		desc->engines_used |= BIT(ce->engine->guc_id);
>   	}
> +	i915_gem_context_engine_list_unlock(ctx);
>   
>   	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
>   			 client->engines, desc->engines_used);
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 214d1fd2f4dc..7fd224a4ca4c 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -1094,7 +1094,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   
>   	wakeref = intel_runtime_pm_get(i915);
>   
> -	ce = intel_context_instance(ctx, i915->engine[RCS0]);
> +	ce = i915_gem_context_get_engine(ctx, RCS0);
>   	if (IS_ERR(ce)) {
>   		ret = PTR_ERR(ce);
>   		goto out_rpm;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
> index 0426093bf1d9..71c750693585 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_context.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_context.c
> @@ -30,6 +30,7 @@ mock_context(struct drm_i915_private *i915,
>   	     const char *name)
>   {
>   	struct i915_gem_context *ctx;
> +	struct i915_gem_engines *e;
>   	int ret;
>   
>   	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> @@ -40,8 +41,11 @@ mock_context(struct drm_i915_private *i915,
>   	INIT_LIST_HEAD(&ctx->link);
>   	ctx->i915 = i915;
>   
> -	ctx->hw_contexts = RB_ROOT;
> -	spin_lock_init(&ctx->hw_contexts_lock);
> +	mutex_init(&ctx->engines_mutex);
> +	e = default_engines(ctx);
> +	if (IS_ERR(e))
> +		goto err_free;
> +	RCU_INIT_POINTER(ctx->engines, e);
>   
>   	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
>   	INIT_LIST_HEAD(&ctx->handles_list);
> @@ -51,7 +55,7 @@ mock_context(struct drm_i915_private *i915,
>   
>   	ret = i915_gem_context_pin_hw_id(ctx);
>   	if (ret < 0)
> -		goto err_handles;
> +		goto err_engines;
>   
>   	if (name) {
>   		struct i915_hw_ppgtt *ppgtt;
> @@ -69,7 +73,9 @@ mock_context(struct drm_i915_private *i915,
>   
>   	return ctx;
>   
> -err_handles:
> +err_engines:
> +	free_engines(rcu_access_pointer(ctx->engines));
> +err_free:
>   	kfree(ctx);
>   	return NULL;
>   
> 

* Re: [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-04-10 15:32   ` Tvrtko Ursulin
@ 2019-04-10 16:18     ` Chris Wilson
  2019-04-11 13:05       ` Tvrtko Ursulin
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 16:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-10 16:32:18)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 3f794bc71958..0df3c5238c04 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -716,7 +716,7 @@ static int pin_context(struct i915_gem_context *ctx,
> >       struct intel_context *ce;
> >       int err;
> >   
> > -     ce = intel_context_instance(ctx, engine);
> > +     ce = i915_gem_context_get_engine(ctx, engine->id);
> 
> Wouldn't staying with intel_context_instance help to reduce the churn?

But it takes the GEM context :|

intel_context_lookup()? But it won't be part of gt/intel_context.h. 
And I'd like to have 'get' in there for consistency (although other 
object lookup functions return a new reference, so that may not be a 
solid argument).

It has annoyed me that this does require the GEM context, but that's the
nature of the beast.

> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 50266e87c225..21b4a04c424b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4312,8 +4312,9 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> >   
> >   static int __intel_engines_record_defaults(struct drm_i915_private *i915)
> >   {
> > -     struct i915_gem_context *ctx;
> >       struct intel_engine_cs *engine;
> > +     struct i915_gem_context *ctx;
> > +     struct i915_gem_engines *e;
> >       enum intel_engine_id id;
> >       int err = 0;
> >   
> > @@ -4330,18 +4331,26 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
> >       if (IS_ERR(ctx))
> >               return PTR_ERR(ctx);
> >   
> > +     e = i915_gem_context_engine_list_lock(ctx);
> > +
> >       for_each_engine(engine, i915, id) {
> 
> Do we need an i915 version of for_each_context_engine? If all call 
> sites will be doing this "lock ctx" -> "walk physical engines" -> 
> "lookup in context" dance, then it seems a bit disconnected.

It's rare, a couple of odd cases imo.

> > +             struct intel_context *ce = e->engines[id];
> 
> How will indexing by engine->id work for the engine map?

It doesn't, that's the point of these being the odd cases. :)

> >               struct i915_request *rq;
> >   
> > -             rq = i915_request_alloc(engine, ctx);
> > +             err = intel_context_pin(ce);
> > +             if (err)
> > +                     goto err_active;
> > +
> > +             rq = i915_request_create(ce);
> > +             intel_context_unpin(ce);
> 
> Kind of verbose, no? Do you want to add some 
> middle-ground-between-request-alloc-and-create helper?

There's 2 callers of i915_request_create() in this style.
(Execbuf has a style all of its own.)

At the moment I don't have a good name for the happy middle ground, so
I'm willing to pay the price for forcing control over the pin to the
user.

...

Does it really make any difference for the perma-pinned kernel contexts?
Actually no... 

Hmm. The fundamental objective was being able to pass ce and avoid
struct_mutex -- but we already avoid struct_mutex for pinning today as
the context is already pinned.

The downside is that it adds redundant steps to execbuf, and
__i915_request_create() is already taken... And I hope you would say no
if I suggest ____i915_request_create :)

> > +static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
> > +{
> > +     struct intel_engine_cs *engine;
> > +     struct i915_gem_engines *e;
> > +     enum intel_engine_id id;
> > +
> > +     e = kzalloc(struct_size(e, engines, I915_NUM_ENGINES), GFP_KERNEL);
> > +     if (!e)
> > +             return ERR_PTR(-ENOMEM);
> > +
> > +     e->i915 = ctx->i915;
> > +     for_each_engine(engine, ctx->i915, id) {
> > +             struct intel_context *ce;
> >   
> > +             ce = intel_context_create(ctx, engine);
> > +             if (IS_ERR(ce)) {
> > +                     free_engines_n(e, id);
> 
> I dislike piggy-backing n into e. How about:
> 
> static void __free_engines(struct i915_gem_engines *e, unsigned int n)
> {
> 	while (n--) {
> 		if (e->engines[n])
> 			intel_context_put(e->engines[n]);
> 	}
> 	kfree(e);
> }
> 
> static void free_engines(struct i915_gem_engines *e)
> {
> 	__free_engines(e, e->num_engines);
> }
> 
> ?

Ok.
 
> Or you could even do e->num_engines++ in the above loop and just have 
> one free_engines.

I thought it was cleaner to avoid having multiple counters for the same
loop. free_engines_n() ended up with 5 users.

> > +                     return ERR_CAST(ce);
> > +             }
> > +
> > +             e->engines[id] = ce;
> 
> Each context would have a sparse array of engines on most platforms. 
> Would it be workable to instead create a compact array per context, and 
> just have a per-device translation table of idx to engine->id? Or vice 
> versa; I can't figure out right off the bat which one you would need.

As sparse as we do today. I'm working under the assumption that, going 
forwards, the default map would be the oddity. But this is better than 
the sparse fixed array[] we previously embedded into the GEM context, 
so overall I think it is still an improvement for old platforms.

We can trim the num_engines, i.e. we need only allocate [rcs0] on gen3
as any id>rcs0 will become -EINVAL.

One plan I did have was to remove the sparse engine->id (since that
should be an internal id), but then we end up not being able to use
static constants for our VCSx etc.

> I guess in this scheme you would need some translation as well to 
> support customised engine maps... I will see if I spot that later in 
> the patch.

In the scheme in this patch, we just replace ctx->engines[] for the
custom map with separate intel_contexts for each instance of a specific
engine.

> > -     rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
> > -             struct intel_engine_cs *engine = ce->engine;
> > +     e = i915_gem_context_engine_list_lock(ctx);
> > +     for (i = 0; i < e->num_engines; i++) {
> > +             struct intel_context *ce = e->engines[i];
> >               struct i915_request *rq;
> >   
> > -             if (!(engine->mask & engines))
> > +             if (!ce || !(ce->engine->mask & engines))
> 
> Definitely looks like a case for for_each_context_engine.

for_each_context_engine(ce, e, i)
for_each_context_masked(ce, e, mask, i)

Hmm. I do regret the for_each...() macros that require temporaries.

We can't stuff the lock in there without a bit of work...

gem_for_each_context_engine(iter, ctx) {
	struct intel_context *ce = iter.context;

with
for (init_iter(&iter, ctx);; next_iter(&iter))

Just to hide the locking. Probably worth it.
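
A rough sketch of the shape (hypothetical names; note it assumes the 
walk always runs to completion, otherwise the unlock in the iterator 
never fires):

struct i915_gem_engines_iter {
	unsigned int idx;
	struct i915_gem_engines *engines;
	struct intel_context *context;
};

static inline void init_iter(struct i915_gem_engines_iter *it,
			     struct i915_gem_context *ctx)
{
	/* takes the engines lock; dropped when the walk completes */
	it->engines = i915_gem_context_engine_list_lock(ctx);
	it->idx = 0;
	it->context = NULL;
}

static inline bool next_iter(struct i915_gem_engines_iter *it,
			     struct i915_gem_context *ctx)
{
	while (it->idx < it->engines->num_engines) {
		it->context = it->engines->engines[it->idx++];
		if (it->context) /* skip the unused slots */
			return true;
	}

	i915_gem_context_engine_list_unlock(ctx);
	return false;
}

#define gem_for_each_context_engine(it, ctx) \
	for (init_iter(&(it), (ctx)); next_iter(&(it), (ctx));)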

> > +                     continue;
> > +
> > +             if (!intel_context_is_pinned(ce))
> >                       continue;
> >   
> >               if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
> > -                                    engine->mask)) {
> > +                                    ce->engine->mask)) {
> >                       err = -ENXIO;
> >                       break;
> >               }
> >   
> > -             rq = i915_request_alloc(engine, ctx);
> > +             err = intel_context_pin(ce);
> > +             if (err)
> > +                     break;
> > +
> > +             rq = i915_request_create(ce);
> > +             intel_context_unpin(ce);
> 
> Yep, as mentioned somewhere above. I definitely think another helper 
> would help codebase readability, even if called unimaginatively 
> __i915_request_add.

It is just these two (based on a quick grep).

Maybe intel_context_create_request()?

Hmm, but isn't that what i915_request_create() is? Quiet!
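
If we did add it, it would be nothing more than wrapping the pin around 
request construction (a sketch, mirroring the two callers above):

static struct i915_request *
intel_context_create_request(struct intel_context *ce)
{
	struct i915_request *rq;
	int err;

	/* pin just for the duration of constructing the request */
	err = intel_context_pin(ce);
	if (err)
		return ERR_PTR(err);

	rq = i915_request_create(ce);
	intel_context_unpin(ce);

	return rq;
}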

> > +struct i915_gem_engines {
> > +     struct rcu_work rcu;
> > +     struct drm_i915_private *i915;
> > +     unsigned long num_engines;
> 
> unsigned int?

A pointer before and an array of pointers after left a hole the size of 
an unsigned long.

I was just filling the space.
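
To put it pictorially (assuming the usual 64-bit layout):

struct i915_gem_engines {
	struct rcu_work rcu;		/* ends pointer-aligned */
	struct drm_i915_private *i915;	/* 8 bytes */
	unsigned long num_engines;	/* 8 bytes; an unsigned int here
					 * would leave a 4-byte hole before
					 * the pointer-aligned array below
					 */
	struct intel_context *engines[];
};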
-Chris

* Re: [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit()
  2019-04-10 11:06       ` Tvrtko Ursulin
@ 2019-04-10 19:19         ` Chris Wilson
  0 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 19:19 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-10 12:06:49)
> 
> On 10/04/2019 11:13, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-04-10 11:05:13)
> >> And some lockdep_assert_held in all three?
> > 
> > Read on :(
> > 
> > The plan is for intel_context_enter/_exit to be under the
> > timeline->mutex, but that isn't realised for about another 30 patches.
> > 
> > mark_active has special protection because it gets used from the
> > serialised portion of intel_engine_park.
> 
> Ok, but a BUG_ON for zero active_count makes sense, right? Maybe even 
> intel_context_pin_active to signify that it can only act on an already 
> active context?

It gets called on the kernel_context with ce->active_count==0, and not
even holding the mutex, from inside the wakeref handling code. (Oh no,
I've invented a new BKL --- well, a parking brake.)
-Chris

* Re: [PATCH 16/29] drm/i915: Export intel_context_instance()
  2019-04-10 12:06   ` Tvrtko Ursulin
@ 2019-04-10 19:32     ` Chris Wilson
  2019-04-11 12:57       ` Tvrtko Ursulin
  0 siblings, 1 reply; 71+ messages in thread
From: Chris Wilson @ 2019-04-10 19:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-10 13:06:04)
> 
> On 08/04/2019 10:17, Chris Wilson wrote:
> > We want to pass in a intel_context into intel_context_pin() and that
> > requires us to first be able to lookup the intel_context!
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context.c    | 37 +++++++++++-----------
> >   drivers/gpu/drm/i915/gt/intel_context.h    | 19 +++++++----
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c  |  8 ++++-
> >   drivers/gpu/drm/i915/gt/mock_engine.c      |  8 ++++-
> >   drivers/gpu/drm/i915/gvt/scheduler.c       |  7 +++-
> >   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++--
> >   drivers/gpu/drm/i915/i915_perf.c           | 21 ++++++++----
> >   drivers/gpu/drm/i915/i915_request.c        | 11 ++++++-
> >   8 files changed, 83 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 6ae6a3f58364..a1267739e369 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -104,7 +104,7 @@ void __intel_context_remove(struct intel_context *ce)
> >       spin_unlock(&ctx->hw_contexts_lock);
> >   }
> >   
> > -static struct intel_context *
> > +struct intel_context *
> >   intel_context_instance(struct i915_gem_context *ctx,
> >                      struct intel_engine_cs *engine)
> >   {
> > @@ -112,7 +112,7 @@ intel_context_instance(struct i915_gem_context *ctx,
> >   
> >       ce = intel_context_lookup(ctx, engine);
> >       if (likely(ce))
> > -             return ce;
> > +             return intel_context_get(ce);
> >   
> >       ce = intel_context_alloc();
> >       if (!ce)
> > @@ -125,7 +125,7 @@ intel_context_instance(struct i915_gem_context *ctx,
> >               intel_context_free(ce);
> >   
> >       GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
> > -     return pos;
> > +     return intel_context_get(pos);
> >   }
> >   
> >   struct intel_context *
> > @@ -139,30 +139,30 @@ intel_context_pin_lock(struct i915_gem_context *ctx,
> >       if (IS_ERR(ce))
> >               return ce;
> >   
> > -     if (mutex_lock_interruptible(&ce->pin_mutex))
> > +     if (mutex_lock_interruptible(&ce->pin_mutex)) {
> > +             intel_context_put(ce);
> >               return ERR_PTR(-EINTR);
> > +     }
> >   
> >       return ce;
> >   }
> >   
> > -struct intel_context *
> > -intel_context_pin(struct i915_gem_context *ctx,
> > -               struct intel_engine_cs *engine)
> > +void intel_context_pin_unlock(struct intel_context *ce)
> > +     __releases(ce->pin_mutex)
> >   {
> > -     struct intel_context *ce;
> > -     int err;
> > -
> > -     ce = intel_context_instance(ctx, engine);
> > -     if (IS_ERR(ce))
> > -             return ce;
> > +     mutex_unlock(&ce->pin_mutex);
> > +     intel_context_put(ce);
> > +}
> >   
> > -     if (likely(atomic_inc_not_zero(&ce->pin_count)))
> > -             return ce;
> > +int __intel_context_do_pin(struct intel_context *ce)
> > +{
> > +     int err;
> >   
> >       if (mutex_lock_interruptible(&ce->pin_mutex))
> > -             return ERR_PTR(-EINTR);
> > +             return -EINTR;
> >   
> >       if (likely(!atomic_read(&ce->pin_count))) {
> > +             struct i915_gem_context *ctx = ce->gem_context;
> >               intel_wakeref_t wakeref;
> >   
> >               err = 0;
> > @@ -172,7 +172,6 @@ intel_context_pin(struct i915_gem_context *ctx,
> >                       goto err;
> >   
> >               i915_gem_context_get(ctx);
> > -             GEM_BUG_ON(ce->gem_context != ctx);
> >   
> >               mutex_lock(&ctx->mutex);
> >               list_add(&ce->active_link, &ctx->active_engines);
> > @@ -186,11 +185,11 @@ intel_context_pin(struct i915_gem_context *ctx,
> >       GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
> >   
> >       mutex_unlock(&ce->pin_mutex);
> > -     return ce;
> > +     return 0;
> >   
> >   err:
> >       mutex_unlock(&ce->pin_mutex);
> > -     return ERR_PTR(err);
> > +     return err;
> >   }
> >   
> >   void intel_context_unpin(struct intel_context *ce)
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > index 9aeef88176b9..da342e9a8c2e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > @@ -49,11 +49,7 @@ intel_context_is_pinned(struct intel_context *ce)
> >       return atomic_read(&ce->pin_count);
> >   }
> >   
> > -static inline void intel_context_pin_unlock(struct intel_context *ce)
> > -__releases(ce->pin_mutex)
> > -{
> > -     mutex_unlock(&ce->pin_mutex);
> > -}
> 
> Could you leave this as static inline, since the only addition is 
> kref_put, so the compiler could decide what to do? Don't mind either way.

In the next patch (or two).

> > +void intel_context_pin_unlock(struct intel_context *ce);
> >   
> >   struct intel_context *
> >   __intel_context_insert(struct i915_gem_context *ctx,
> > @@ -63,7 +59,18 @@ void
> >   __intel_context_remove(struct intel_context *ce);
> >   
> >   struct intel_context *
> > -intel_context_pin(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
> > +intel_context_instance(struct i915_gem_context *ctx,
> > +                    struct intel_engine_cs *engine);
> > +
> > +int __intel_context_do_pin(struct intel_context *ce);
> > +
> > +static inline int intel_context_pin(struct intel_context *ce)
> > +{
> > +     if (likely(atomic_inc_not_zero(&ce->pin_count)))
> > +             return 0;
> > +
> > +     return __intel_context_do_pin(ce);
> > +}
> >   
> >   static inline void __intel_context_pin(struct intel_context *ce)
> >   {
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index d378b276e476..f6828c0276eb 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -697,11 +697,17 @@ static int pin_context(struct i915_gem_context *ctx,
> >                      struct intel_context **out)
> >   {
> >       struct intel_context *ce;
> > +     int err;
> >   
> > -     ce = intel_context_pin(ctx, engine);
> > +     ce = intel_context_instance(ctx, engine);
> >       if (IS_ERR(ce))
> >               return PTR_ERR(ce);
> >   
> > +     err = intel_context_pin(ce);
> > +     intel_context_put(ce);
> > +     if (err)
> > +             return err;
> > +
> >       *out = ce;
> >       return 0;
> >   }
> > diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
> > index d9f68c89dff4..a79d9909d171 100644
> > --- a/drivers/gpu/drm/i915/gt/mock_engine.c
> > +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
> > @@ -239,6 +239,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
> >                                   int id)
> >   {
> >       struct mock_engine *engine;
> > +     int err;
> >   
> >       GEM_BUG_ON(id >= I915_NUM_ENGINES);
> >   
> > @@ -278,10 +279,15 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
> >       INIT_LIST_HEAD(&engine->hw_queue);
> >   
> >       engine->base.kernel_context =
> > -             intel_context_pin(i915->kernel_context, &engine->base);
> > +             intel_context_instance(i915->kernel_context, &engine->base);
> >       if (IS_ERR(engine->base.kernel_context))
> >               goto err_breadcrumbs;
> >   
> > +     err = intel_context_pin(engine->base.kernel_context);
> > +     intel_context_put(engine->base.kernel_context);
> > +     if (err)
> > +             goto err_breadcrumbs;
> > +
> >       return &engine->base;
> >   
> >   err_breadcrumbs:
> > diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
> > index 40d9f549a0cd..606fc2713240 100644
> > --- a/drivers/gpu/drm/i915/gvt/scheduler.c
> > +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
> > @@ -1188,12 +1188,17 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
> >               INIT_LIST_HEAD(&s->workload_q_head[i]);
> >               s->shadow[i] = ERR_PTR(-EINVAL);
> >   
> > -             ce = intel_context_pin(ctx, engine);
> > +             ce = intel_context_instance(ctx, engine);
> >               if (IS_ERR(ce)) {
> >                       ret = PTR_ERR(ce);
> >                       goto out_shadow_ctx;
> >               }
> >   
> > +             ret = intel_context_pin(ce);
> > +             intel_context_put(ce);
> > +             if (ret)
> > +                     goto out_shadow_ctx;
> > +
> >               s->shadow[i] = ce;
> >       }
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 35732c2ae17f..c700cbc2f594 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -2101,14 +2101,19 @@ static int eb_pin_context(struct i915_execbuffer *eb,
> >       if (err)
> >               return err;
> >   
> > +     ce = intel_context_instance(eb->gem_context, engine);
> > +     if (IS_ERR(ce))
> > +             return PTR_ERR(ce);
> > +
> >       /*
> >        * Pinning the contexts may generate requests in order to acquire
> >        * GGTT space, so do this first before we reserve a seqno for
> >        * ourselves.
> >        */
> > -     ce = intel_context_pin(eb->gem_context, engine);
> > -     if (IS_ERR(ce))
> > -             return PTR_ERR(ce);
> > +     err = intel_context_pin(ce);
> > +     intel_context_put(ce);
> > +     if (err)
> > +             return err;
> >   
> >       eb->engine = engine;
> >       eb->context = ce;
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > index 328a740e72cb..afaeabe5e531 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -1205,11 +1205,17 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
> >   {
> >       struct intel_engine_cs *engine = i915->engine[RCS0];
> >       struct intel_context *ce;
> > -     int ret;
> > +     int err;
> >   
> > -     ret = i915_mutex_lock_interruptible(&i915->drm);
> > -     if (ret)
> > -             return ERR_PTR(ret);
> > +     ce = intel_context_instance(ctx, engine);
> > +     if (IS_ERR(ce))
> > +             return ce;
> > +
> > +     err = i915_mutex_lock_interruptible(&i915->drm);
> > +     if (err) {
> > +             intel_context_put(ce);
> > +             return ERR_PTR(err);
> > +     }
> >   
> >       /*
> >        * As the ID is the gtt offset of the context's vma we
> > @@ -1217,10 +1223,11 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
> >        *
> >        * NB: implied RCS engine...
> >        */
> > -     ce = intel_context_pin(ctx, engine);
> > +     err = intel_context_pin(ce);
> >       mutex_unlock(&i915->drm.struct_mutex);
> > -     if (IS_ERR(ce))
> > -             return ce;
> > +     intel_context_put(ce);
> > +     if (err)
> > +             return ERR_PTR(err);
> >   
> >       i915->perf.oa.pinned_ctx = ce;
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 7932d1209247..8abd891d9287 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -753,6 +753,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> >       struct drm_i915_private *i915 = engine->i915;
> >       struct intel_context *ce;
> >       struct i915_request *rq;
> > +     int err;
> >   
> >       /*
> >        * Preempt contexts are reserved for exclusive use to inject a
> > @@ -766,13 +767,21 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> >        * GGTT space, so do this first before we reserve a seqno for
> >        * ourselves.
> >        */
> > -     ce = intel_context_pin(ctx, engine);
> > +     ce = intel_context_instance(ctx, engine);
> >       if (IS_ERR(ce))
> >               return ERR_CAST(ce);
> >   
> > +     err = intel_context_pin(ce);
> > +     if (err) {
> > +             rq = ERR_PTR(err);
> > +             goto err_put;
> > +     }
> > +
> >       rq = i915_request_create(ce);
> >       intel_context_unpin(ce);
> >   
> > +err_put:
> > +     intel_context_put(ce);
> >       return rq;
> >   }
> >   
> > 
> 
> The pattern of instance-pin-put is repeated so many times it begs to be 
> promoted to something like intel_context_pin_instance.

It's just a transition patch. We'll get rid of intel_context_instance()
momentarily.
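
For reference, that wrapper would be little more than this sketch
(using Tvrtko's hypothetical name; it disappears along with
intel_context_instance()):

static inline struct intel_context *
intel_context_pin_instance(struct i915_gem_context *ctx,
			   struct intel_engine_cs *engine)
{
	struct intel_context *ce;
	int err;

	ce = intel_context_instance(ctx, engine);
	if (IS_ERR(ce))
		return ce;

	/* The pin keeps ce alive once we drop the lookup reference. */
	err = intel_context_pin(ce);
	intel_context_put(ce);
	if (err)
		return ERR_PTR(err);

	return ce;
}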
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 16/29] drm/i915: Export intel_context_instance()
  2019-04-10 19:32     ` Chris Wilson
@ 2019-04-11 12:57       ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-11 12:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 10/04/2019 20:32, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-10 13:06:04)
>>
>> On 08/04/2019 10:17, Chris Wilson wrote:
>>> We want to pass an intel_context into intel_context_pin() and that
>>> requires us to first be able to look up the intel_context!
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_context.c    | 37 +++++++++++-----------
>>>    drivers/gpu/drm/i915/gt/intel_context.h    | 19 +++++++----
>>>    drivers/gpu/drm/i915/gt/intel_engine_cs.c  |  8 ++++-
>>>    drivers/gpu/drm/i915/gt/mock_engine.c      |  8 ++++-
>>>    drivers/gpu/drm/i915/gvt/scheduler.c       |  7 +++-
>>>    drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++--
>>>    drivers/gpu/drm/i915/i915_perf.c           | 21 ++++++++----
>>>    drivers/gpu/drm/i915/i915_request.c        | 11 ++++++-
>>>    8 files changed, 83 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
>>> index 6ae6a3f58364..a1267739e369 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_context.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
>>> @@ -104,7 +104,7 @@ void __intel_context_remove(struct intel_context *ce)
>>>        spin_unlock(&ctx->hw_contexts_lock);
>>>    }
>>>    
>>> -static struct intel_context *
>>> +struct intel_context *
>>>    intel_context_instance(struct i915_gem_context *ctx,
>>>                       struct intel_engine_cs *engine)
>>>    {
>>> @@ -112,7 +112,7 @@ intel_context_instance(struct i915_gem_context *ctx,
>>>    
>>>        ce = intel_context_lookup(ctx, engine);
>>>        if (likely(ce))
>>> -             return ce;
>>> +             return intel_context_get(ce);
>>>    
>>>        ce = intel_context_alloc();
>>>        if (!ce)
>>> @@ -125,7 +125,7 @@ intel_context_instance(struct i915_gem_context *ctx,
>>>                intel_context_free(ce);
>>>    
>>>        GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
>>> -     return pos;
>>> +     return intel_context_get(pos);
>>>    }
>>>    
>>>    struct intel_context *
>>> @@ -139,30 +139,30 @@ intel_context_pin_lock(struct i915_gem_context *ctx,
>>>        if (IS_ERR(ce))
>>>                return ce;
>>>    
>>> -     if (mutex_lock_interruptible(&ce->pin_mutex))
>>> +     if (mutex_lock_interruptible(&ce->pin_mutex)) {
>>> +             intel_context_put(ce);
>>>                return ERR_PTR(-EINTR);
>>> +     }
>>>    
>>>        return ce;
>>>    }
>>>    
>>> -struct intel_context *
>>> -intel_context_pin(struct i915_gem_context *ctx,
>>> -               struct intel_engine_cs *engine)
>>> +void intel_context_pin_unlock(struct intel_context *ce)
>>> +     __releases(ce->pin_mutex)
>>>    {
>>> -     struct intel_context *ce;
>>> -     int err;
>>> -
>>> -     ce = intel_context_instance(ctx, engine);
>>> -     if (IS_ERR(ce))
>>> -             return ce;
>>> +     mutex_unlock(&ce->pin_mutex);
>>> +     intel_context_put(ce);
>>> +}
>>>    
>>> -     if (likely(atomic_inc_not_zero(&ce->pin_count)))
>>> -             return ce;
>>> +int __intel_context_do_pin(struct intel_context *ce)
>>> +{
>>> +     int err;
>>>    
>>>        if (mutex_lock_interruptible(&ce->pin_mutex))
>>> -             return ERR_PTR(-EINTR);
>>> +             return -EINTR;
>>>    
>>>        if (likely(!atomic_read(&ce->pin_count))) {
>>> +             struct i915_gem_context *ctx = ce->gem_context;
>>>                intel_wakeref_t wakeref;
>>>    
>>>                err = 0;
>>> @@ -172,7 +172,6 @@ intel_context_pin(struct i915_gem_context *ctx,
>>>                        goto err;
>>>    
>>>                i915_gem_context_get(ctx);
>>> -             GEM_BUG_ON(ce->gem_context != ctx);
>>>    
>>>                mutex_lock(&ctx->mutex);
>>>                list_add(&ce->active_link, &ctx->active_engines);
>>> @@ -186,11 +185,11 @@ intel_context_pin(struct i915_gem_context *ctx,
>>>        GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
>>>    
>>>        mutex_unlock(&ce->pin_mutex);
>>> -     return ce;
>>> +     return 0;
>>>    
>>>    err:
>>>        mutex_unlock(&ce->pin_mutex);
>>> -     return ERR_PTR(err);
>>> +     return err;
>>>    }
>>>    
>>>    void intel_context_unpin(struct intel_context *ce)
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
>>> index 9aeef88176b9..da342e9a8c2e 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_context.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
>>> @@ -49,11 +49,7 @@ intel_context_is_pinned(struct intel_context *ce)
>>>        return atomic_read(&ce->pin_count);
>>>    }
>>>    
>>> -static inline void intel_context_pin_unlock(struct intel_context *ce)
>>> -__releases(ce->pin_mutex)
>>> -{
>>> -     mutex_unlock(&ce->pin_mutex);
>>> -}
>>
>> Could leave this as static inline since the only addition is kref_put, so
>> the compiler could decide what to do? Don't mind either way.
> 
> In the next patch (or two).
> 
>>> +void intel_context_pin_unlock(struct intel_context *ce);
>>>    
>>>    struct intel_context *
>>>    __intel_context_insert(struct i915_gem_context *ctx,
>>> @@ -63,7 +59,18 @@ void
>>>    __intel_context_remove(struct intel_context *ce);
>>>    
>>>    struct intel_context *
>>> -intel_context_pin(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
>>> +intel_context_instance(struct i915_gem_context *ctx,
>>> +                    struct intel_engine_cs *engine);
>>> +
>>> +int __intel_context_do_pin(struct intel_context *ce);
>>> +
>>> +static inline int intel_context_pin(struct intel_context *ce)
>>> +{
>>> +     if (likely(atomic_inc_not_zero(&ce->pin_count)))
>>> +             return 0;
>>> +
>>> +     return __intel_context_do_pin(ce);
>>> +}
>>>    
>>>    static inline void __intel_context_pin(struct intel_context *ce)
>>>    {
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index d378b276e476..f6828c0276eb 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -697,11 +697,17 @@ static int pin_context(struct i915_gem_context *ctx,
>>>                       struct intel_context **out)
>>>    {
>>>        struct intel_context *ce;
>>> +     int err;
>>>    
>>> -     ce = intel_context_pin(ctx, engine);
>>> +     ce = intel_context_instance(ctx, engine);
>>>        if (IS_ERR(ce))
>>>                return PTR_ERR(ce);
>>>    
>>> +     err = intel_context_pin(ce);
>>> +     intel_context_put(ce);
>>> +     if (err)
>>> +             return err;
>>> +
>>>        *out = ce;
>>>        return 0;
>>>    }
>>> diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
>>> index d9f68c89dff4..a79d9909d171 100644
>>> --- a/drivers/gpu/drm/i915/gt/mock_engine.c
>>> +++ b/drivers/gpu/drm/i915/gt/mock_engine.c
>>> @@ -239,6 +239,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>>>                                    int id)
>>>    {
>>>        struct mock_engine *engine;
>>> +     int err;
>>>    
>>>        GEM_BUG_ON(id >= I915_NUM_ENGINES);
>>>    
>>> @@ -278,10 +279,15 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
>>>        INIT_LIST_HEAD(&engine->hw_queue);
>>>    
>>>        engine->base.kernel_context =
>>> -             intel_context_pin(i915->kernel_context, &engine->base);
>>> +             intel_context_instance(i915->kernel_context, &engine->base);
>>>        if (IS_ERR(engine->base.kernel_context))
>>>                goto err_breadcrumbs;
>>>    
>>> +     err = intel_context_pin(engine->base.kernel_context);
>>> +     intel_context_put(engine->base.kernel_context);
>>> +     if (err)
>>> +             goto err_breadcrumbs;
>>> +
>>>        return &engine->base;
>>>    
>>>    err_breadcrumbs:
>>> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
>>> index 40d9f549a0cd..606fc2713240 100644
>>> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
>>> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
>>> @@ -1188,12 +1188,17 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
>>>                INIT_LIST_HEAD(&s->workload_q_head[i]);
>>>                s->shadow[i] = ERR_PTR(-EINVAL);
>>>    
>>> -             ce = intel_context_pin(ctx, engine);
>>> +             ce = intel_context_instance(ctx, engine);
>>>                if (IS_ERR(ce)) {
>>>                        ret = PTR_ERR(ce);
>>>                        goto out_shadow_ctx;
>>>                }
>>>    
>>> +             ret = intel_context_pin(ce);
>>> +             intel_context_put(ce);
>>> +             if (ret)
>>> +                     goto out_shadow_ctx;
>>> +
>>>                s->shadow[i] = ce;
>>>        }
>>>    
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index 35732c2ae17f..c700cbc2f594 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -2101,14 +2101,19 @@ static int eb_pin_context(struct i915_execbuffer *eb,
>>>        if (err)
>>>                return err;
>>>    
>>> +     ce = intel_context_instance(eb->gem_context, engine);
>>> +     if (IS_ERR(ce))
>>> +             return PTR_ERR(ce);
>>> +
>>>        /*
>>>         * Pinning the contexts may generate requests in order to acquire
>>>         * GGTT space, so do this first before we reserve a seqno for
>>>         * ourselves.
>>>         */
>>> -     ce = intel_context_pin(eb->gem_context, engine);
>>> -     if (IS_ERR(ce))
>>> -             return PTR_ERR(ce);
>>> +     err = intel_context_pin(ce);
>>> +     intel_context_put(ce);
>>> +     if (err)
>>> +             return err;
>>>    
>>>        eb->engine = engine;
>>>        eb->context = ce;
>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>>> index 328a740e72cb..afaeabe5e531 100644
>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>> @@ -1205,11 +1205,17 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
>>>    {
>>>        struct intel_engine_cs *engine = i915->engine[RCS0];
>>>        struct intel_context *ce;
>>> -     int ret;
>>> +     int err;
>>>    
>>> -     ret = i915_mutex_lock_interruptible(&i915->drm);
>>> -     if (ret)
>>> -             return ERR_PTR(ret);
>>> +     ce = intel_context_instance(ctx, engine);
>>> +     if (IS_ERR(ce))
>>> +             return ce;
>>> +
>>> +     err = i915_mutex_lock_interruptible(&i915->drm);
>>> +     if (err) {
>>> +             intel_context_put(ce);
>>> +             return ERR_PTR(err);
>>> +     }
>>>    
>>>        /*
>>>         * As the ID is the gtt offset of the context's vma we
>>> @@ -1217,10 +1223,11 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
>>>         *
>>>         * NB: implied RCS engine...
>>>         */
>>> -     ce = intel_context_pin(ctx, engine);
>>> +     err = intel_context_pin(ce);
>>>        mutex_unlock(&i915->drm.struct_mutex);
>>> -     if (IS_ERR(ce))
>>> -             return ce;
>>> +     intel_context_put(ce);
>>> +     if (err)
>>> +             return ERR_PTR(err);
>>>    
>>>        i915->perf.oa.pinned_ctx = ce;
>>>    
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 7932d1209247..8abd891d9287 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -753,6 +753,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>>>        struct drm_i915_private *i915 = engine->i915;
>>>        struct intel_context *ce;
>>>        struct i915_request *rq;
>>> +     int err;
>>>    
>>>        /*
>>>         * Preempt contexts are reserved for exclusive use to inject a
>>> @@ -766,13 +767,21 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>>>         * GGTT space, so do this first before we reserve a seqno for
>>>         * ourselves.
>>>         */
>>> -     ce = intel_context_pin(ctx, engine);
>>> +     ce = intel_context_instance(ctx, engine);
>>>        if (IS_ERR(ce))
>>>                return ERR_CAST(ce);
>>>    
>>> +     err = intel_context_pin(ce);
>>> +     if (err) {
>>> +             rq = ERR_PTR(err);
>>> +             goto err_put;
>>> +     }
>>> +
>>>        rq = i915_request_create(ce);
>>>        intel_context_unpin(ce);
>>>    
>>> +err_put:
>>> +     intel_context_put(ce);
>>>        return rq;
>>>    }
>>>    
>>>
>>
>> The pattern of instance-pin-put is repeated so many times it begs to be
>> promoted to something like intel_context_pin_instance.
> 
> It's just a transition patch. We'll get rid of intel_context_instance()
> momentarily.

And indeed I spotted this later myself!

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-04-10 16:18     ` Chris Wilson
@ 2019-04-11 13:05       ` Tvrtko Ursulin
  2019-04-11 13:25         ` Chris Wilson
  0 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-11 13:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 10/04/2019 17:18, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-10 16:32:18)
>>
>> On 08/04/2019 10:17, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index 3f794bc71958..0df3c5238c04 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -716,7 +716,7 @@ static int pin_context(struct i915_gem_context *ctx,
>>>        struct intel_context *ce;
>>>        int err;
>>>    
>>> -     ce = intel_context_instance(ctx, engine);
>>> +     ce = i915_gem_context_get_engine(ctx, engine->id);
>>
>> Staying with intel_context_instance wouldn't help to reduce the churn?
> 
> But it takes the GEM context :|
> 
> intel_context_lookup() ? But it won't be part of gt/intel_context.h
> And I'd like to have 'get' in there for consistency (although other
> object lookup functions return a new reference, so that may not be a
> solid argument).
> 
> It has annoyed me that this does require the GEM context, but that's the
> nature of the beast.
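> 
> The lookup itself is tiny though; under rcu it is just (a sketch of
> what the respin below does):
> 
> 	rcu_read_lock(); {
> 		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
> 		if (likely(idx < e->num_engines && e->engines[idx]))
> 			ce = intel_context_get(e->engines[idx]);
> 	} rcu_read_unlock();
> 
> so the GEM context is only needed to reach the engines[] table.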

Leave it as is then.

>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 50266e87c225..21b4a04c424b 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -4312,8 +4312,9 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
>>>    
>>>    static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>>>    {
>>> -     struct i915_gem_context *ctx;
>>>        struct intel_engine_cs *engine;
>>> +     struct i915_gem_context *ctx;
>>> +     struct i915_gem_engines *e;
>>>        enum intel_engine_id id;
>>>        int err = 0;
>>>    
>>> @@ -4330,18 +4331,26 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>>>        if (IS_ERR(ctx))
>>>                return PTR_ERR(ctx);
>>>    
>>> +     e = i915_gem_context_engine_list_lock(ctx);
>>> +
>>>        for_each_engine(engine, i915, id) {
>>
>> Do we need an i915 version of for_each_context_engine? If all call sites
>> will be doing this "lock ctx" -> "walk physical engines" -> "lookup in
>> context" then it seems a bit disconnected.
> 
> It's rare, a couple of odd cases imo.

Ok, can re-evaluate at the end.

>>> +             struct intel_context *ce = e->engines[id];
>>
>> How will index by engine->id work for engine map?
> 
> It doesn't, that's the point of these being the odd cases. :)
> 
>>>                struct i915_request *rq;
>>>    
>>> -             rq = i915_request_alloc(engine, ctx);
>>> +             err = intel_context_pin(ce);
>>> +             if (err)
>>> +                     goto err_active;
>>> +
>>> +             rq = i915_request_create(ce);
>>> +             intel_context_unpin(ce);
>>
>> Kind of verbose, no? Do you want to add some
>> middle-ground-between-request-alloc-and-create helper?
> 
> There's 2 callers of i915_request_create() in this style.
> (Execbuf has a style all of its own.)
> 
> At the moment I don't have a good name for the happy middle ground, so
> I'm willing to pay the price for forcing control over the pin to the
> user.
> 
> ...
> 
> Does it really make any difference for the perma-pinned kernel contexts?
> Actually no...
> 
> Hmm. The fundamental objective was being able to pass ce and avoid
> struct_mutex -- but we already avoid struct_mutex for pinning today as
> the context is already pinned.
> 
> The downside is that it adds redundant steps to execbuf, and
> __i915_request_create() is already taken... And I hope you would say no
> if I suggest ____i915_request_create :)
> 
>>> +static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
>>> +{
>>> +     struct intel_engine_cs *engine;
>>> +     struct i915_gem_engines *e;
>>> +     enum intel_engine_id id;
>>> +
>>> +     e = kzalloc(struct_size(e, engines, I915_NUM_ENGINES), GFP_KERNEL);
>>> +     if (!e)
>>> +             return ERR_PTR(-ENOMEM);
>>> +
>>> +     e->i915 = ctx->i915;
>>> +     for_each_engine(engine, ctx->i915, id) {
>>> +             struct intel_context *ce;
>>>    
>>> +             ce = intel_context_create(ctx, engine);
>>> +             if (IS_ERR(ce)) {
>>> +                     free_engines_n(e, id);
>>
>> I dislike piggy-back of n into e. How about:
>>
>> __free_engines(e, n)
>> {
>>          ...
>> }
>>
>> free_engines(e)
>> {
>>          __free_engines(e, e->num_engines);
>> }
>>
>> ?
> 
> Ok.
>   
>> Or even you could e->num_engines++ in the above loop and just have one
>> free_engines.
> 
> I thought it was cleaner to avoid having multiple counters for the same
> loop. free_engines_n() ended up with 5 users.
> 
>>> +                     return ERR_CAST(ce);
>>> +             }
>>> +
>>> +             e->engines[id] = ce;
>>
>> Each context would have a sparse array of engines, on most platforms.
>> Would it be workable to instead create a compact array per context, and
>> just have a per-device translation table of idx to engine->id? Or
>> vice-versa, I can't figure out straight off the bat which one you would
>> need.
> 
> As sparse as we do today. I working under the assumption that going
> forwards, the default map would be the oddity. But that is better than
> the sparse fixed array[] we previously embedded into the GEM context, so
> overall I think still an improvement for old platforms.

True.

> We can trim the num_engines, i.e. we need only allocate [rcs0] on gen3
> as any id>rcs0 will become -EINVAL.
> 
> One plan I did have was to remove the sparse engine->id (since that
> should be an internal id), but then we end up not being able to use
> static constants for our VCSx etc.

Best leave that for now.

>> I guess in this scheme you would need some translation as well to
>> support customised engine maps.. will see if I will spot that later in
>> the patch.
> 
> In the scheme in this patch, we just replace ctx->engines[] for the
> custom map with separate intel_contexts for each instance of a specific
> engine.
>
>>> -     rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
>>> -             struct intel_engine_cs *engine = ce->engine;
>>> +     e = i915_gem_context_engine_list_lock(ctx);
>>> +     for (i = 0; i < e->num_engines; i++) {
>>> +             struct intel_context *ce = e->engines[i];
>>>                struct i915_request *rq;
>>>    
>>> -             if (!(engine->mask & engines))
>>> +             if (!ce || !(ce->engine->mask & engines))
>>
>> Definitely looks like a case for for_each_context_engine.
> 
> for_each_context_engine(ce, e, i)
> for_each_context_masked(ce, e, mask, i)
> 
> Hmm. I do regret the for_each...() macros that require temporaries.
> 
> We can't stuff the lock in there, without a bit of work...
> 
> gem_for_each_context_engine(iter, ctx) {
> 	struct intel_context *ce = iter.context;
> 
> with
> for (init_iter(&iter, ctx);; next_iter(&iter))
> 
> Just to hide the locking. Probably worth it.
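> 
> As a sketch (names provisional, locking hidden in the iter functions):
> 
> 	struct i915_gem_engines_iter it;
> 
> 	for_each_gem_engine(it, ctx) {
> 		struct intel_context *ce = it.context;
> 
> 		/* ctx->engines_mutex held from iter_init to loop exit */
> 	}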

Don't go too wild! :) We got used to temporaries, so we could also clean
all of them up later if too bored.

>>> +                     continue;
>>> +
>>> +             if (!intel_context_is_pinned(ce))
>>>                        continue;
>>>    
>>>                if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
>>> -                                    engine->mask)) {
>>> +                                    ce->engine->mask)) {
>>>                        err = -ENXIO;
>>>                        break;
>>>                }
>>>    
>>> -             rq = i915_request_alloc(engine, ctx);
>>> +             err = intel_context_pin(ce);
>>> +             if (err)
>>> +                     break;
>>> +
>>> +             rq = i915_request_create(ce);
>>> +             intel_context_unpin(ce);
>>
>> Yep, as mentioned somewhere above. I definitely think another helper
>> would help code base readability. Even if called unimaginatively as
>> __i915_request_add.
> 
> It is just these two (based on a quick grep).
> 
> Maybe intel_context_create_request() ?
> 
> Hmm, but isn't that what i915_request_create() is. Quiet!

Why not __i915_request_add? i915_request_add would then be just a wedged
check calling __i915_request_add, no? (Going from memory... might not be
reliable.)

>>> +struct i915_gem_engines {
>>> +     struct rcu_work rcu;
>>> +     struct drm_i915_private *i915;
>>> +     unsigned long num_engines;
>>
>> unsigned int?
> 
> Pointer before, array of pointers after, left an unsigned long hole.
> 
> I was just filling the space.

Well, long makes me think there's a reason int is not enough. Which in
this case there isn't. So I would still go for an int regardless of the
hole. There is nothing to be gained by filling space. Could even
be worse if some instructions expand to longer opcodes. :)

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-04-11 13:05       ` Tvrtko Ursulin
@ 2019-04-11 13:25         ` Chris Wilson
  0 siblings, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-11 13:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-04-11 14:05:19)
> 
> On 10/04/2019 17:18, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-04-10 16:32:18)
> >> Yep, as mentioned somewhere above. I definitely think another helper
> >> would help code base readability. Even if called unimaginatively as
> >> __i915_request_add.
> > 
> > It is just these two (based on a quick grep).
> > 
> > Maybe intel_context_create_request() ?
> > 
> > Hmm, but isn't that what i915_request_create() is. Quiet!
> 
> Why not __i915_request_add? i915_request_add would then be just a wedged
> check calling __i915_request_add, no? (Going from memory... might not be
> reliable.)

At the moment, we have:

__i915_request_create(): no timeline locking, no runtime-pm
i915_request_create(): no pinning, does timeline + runtime-pm
intel_context_create_request(): pin + call i915_request_create
igt_request_alloc(): lookup intel_context + call intel_context_create_request

I suppose there is some room in bringing i915_request_alloc() back from
the dead, but it's not the simple allocator one would expect from the
name. (igt_request_alloc is just for ease of recognition.)
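
For the record, intel_context_create_request() is just the thin
pin/unpin wrapper from the respin below:

struct i915_request *intel_context_create_request(struct intel_context *ce)
{
	struct i915_request *rq;
	int err;

	err = intel_context_pin(ce);
	if (unlikely(err))
		return ERR_PTR(err);

	rq = i915_request_create(ce);
	intel_context_unpin(ce);

	return rq;
}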

Outside of igt_request_alloc(), there are two users for
intel_context_create_request() in my tree. One in
__intel_engine_record_defaults() and the other in
context_barrier_task().

For the in-kernel blitter tasks, we'll either be using a pinned kernel
context, or be shoehorning it into a pinned user context. Both should be
using i915_request_create() (as currently planned). And things like the
heartbeat and idle barriers, use the pinned kernel context.

Just to say intel_context_create_request() is the odd one out, and
deserves the longest name :)

> >>> +struct i915_gem_engines {
> >>> +     struct rcu_work rcu;
> >>> +     struct drm_i915_private *i915;
> >>> +     unsigned long num_engines;
> >>
> >> unsigned int?
> > 
> > Pointer before, array of pointers after, left an unsigned long hole.
> > 
> > I was just filling the space.
> 
> Well, long makes me think there's a reason int is not enough. Which in
> this case there isn't.

And I thought you were planning for a busy future, full of little
engines :)

> So I would still go for an int regardless of the 
> hole or not. There is nothing to be gained by filling space. Could even 
> be worse if some instructions expand to longer opcodes. :)

Hmm, in the near future this becomes

struct i915_gem_engines {
        struct rcu_head rcu;
        unsigned long num_engines;
        struct intel_context *engines[];
};

struct rcu_head is a pair of pointers, so that's still a pointer sized
hole. I give in. We'll only support 4 billion engines.
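
For completeness, a pahole-style view on 64-bit (assuming the usual
two-pointer rcu_head), showing the layout is the same either way:

	struct i915_gem_engines {
		struct rcu_head rcu;             /* offset  0, size 16 */
		unsigned int num_engines;        /* offset 16, size  4 */
		/* 4 byte hole */
		struct intel_context *engines[]; /* offset 24 */
	};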
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH] drm/i915: Switch back to an array of logical per-engine HW contexts
  2019-04-08  9:17 ` [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
  2019-04-10 15:32   ` Tvrtko Ursulin
@ 2019-04-11 13:33   ` Chris Wilson
  1 sibling, 0 replies; 71+ messages in thread
From: Chris Wilson @ 2019-04-11 13:33 UTC (permalink / raw)
  To: intel-gfx

We switched to a tree of per-engine HW contexts to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_context.c       | 112 +++-------------
 drivers/gpu/drm/i915/gt/intel_context.h       |  27 +---
 drivers/gpu/drm/i915/gt/intel_context_types.h |   2 -
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |   3 +-
 drivers/gpu/drm/i915/gvt/scheduler.c          |   2 +-
 drivers/gpu/drm/i915/i915_gem.c               |  24 ++--
 drivers/gpu/drm/i915/i915_gem_context.c       | 124 ++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_context.h       |  51 +++++++
 drivers/gpu/drm/i915/i915_gem_context_types.h |  33 ++++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    |  70 +++++-----
 drivers/gpu/drm/i915/i915_perf.c              |  76 ++++++-----
 drivers/gpu/drm/i915/i915_request.c           |  15 +--
 drivers/gpu/drm/i915/intel_guc_submission.c   |  23 ++--
 .../gpu/drm/i915/selftests/i915_gem_context.c |   2 +-
 drivers/gpu/drm/i915/selftests/mock_context.c |  14 +-
 16 files changed, 338 insertions(+), 242 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 15ac99c5dd4a..5e506e648454 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -17,7 +17,7 @@ static struct i915_global_context {
 	struct kmem_cache *slab_ce;
 } global;
 
-struct intel_context *intel_context_alloc(void)
+static struct intel_context *intel_context_alloc(void)
 {
 	return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
 }
@@ -28,104 +28,17 @@ void intel_context_free(struct intel_context *ce)
 }
 
 struct intel_context *
-intel_context_lookup(struct i915_gem_context *ctx,
+intel_context_create(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine)
 {
-	struct intel_context *ce = NULL;
-	struct rb_node *p;
-
-	spin_lock(&ctx->hw_contexts_lock);
-	p = ctx->hw_contexts.rb_node;
-	while (p) {
-		struct intel_context *this =
-			rb_entry(p, struct intel_context, node);
-
-		if (this->engine == engine) {
-			GEM_BUG_ON(this->gem_context != ctx);
-			ce = this;
-			break;
-		}
-
-		if (this->engine < engine)
-			p = p->rb_right;
-		else
-			p = p->rb_left;
-	}
-	spin_unlock(&ctx->hw_contexts_lock);
-
-	return ce;
-}
-
-struct intel_context *
-__intel_context_insert(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       struct intel_context *ce)
-{
-	struct rb_node **p, *parent;
-	int err = 0;
-
-	spin_lock(&ctx->hw_contexts_lock);
-
-	parent = NULL;
-	p = &ctx->hw_contexts.rb_node;
-	while (*p) {
-		struct intel_context *this;
-
-		parent = *p;
-		this = rb_entry(parent, struct intel_context, node);
-
-		if (this->engine == engine) {
-			err = -EEXIST;
-			ce = this;
-			break;
-		}
-
-		if (this->engine < engine)
-			p = &parent->rb_right;
-		else
-			p = &parent->rb_left;
-	}
-	if (!err) {
-		rb_link_node(&ce->node, parent, p);
-		rb_insert_color(&ce->node, &ctx->hw_contexts);
-	}
-
-	spin_unlock(&ctx->hw_contexts_lock);
-
-	return ce;
-}
-
-void __intel_context_remove(struct intel_context *ce)
-{
-	struct i915_gem_context *ctx = ce->gem_context;
-
-	spin_lock(&ctx->hw_contexts_lock);
-	rb_erase(&ce->node, &ctx->hw_contexts);
-	spin_unlock(&ctx->hw_contexts_lock);
-}
-
-struct intel_context *
-intel_context_instance(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine)
-{
-	struct intel_context *ce, *pos;
-
-	ce = intel_context_lookup(ctx, engine);
-	if (likely(ce))
-		return intel_context_get(ce);
+	struct intel_context *ce;
 
 	ce = intel_context_alloc();
 	if (!ce)
 		return ERR_PTR(-ENOMEM);
 
 	intel_context_init(ce, ctx, engine);
-
-	pos = __intel_context_insert(ctx, engine, ce);
-	if (unlikely(pos != ce)) /* Beaten! Use their HW context instead */
-		intel_context_free(ce);
-
-	GEM_BUG_ON(intel_context_lookup(ctx, engine) != pos);
-	return intel_context_get(pos);
+	return ce;
 }
 
 int __intel_context_do_pin(struct intel_context *ce)
@@ -204,6 +117,8 @@ intel_context_init(struct intel_context *ce,
 		   struct i915_gem_context *ctx,
 		   struct intel_engine_cs *engine)
 {
+	GEM_BUG_ON(!engine->cops);
+
 	kref_init(&ce->ref);
 
 	ce->gem_context = ctx;
@@ -254,3 +169,18 @@ void intel_context_exit_engine(struct intel_context *ce)
 {
 	intel_engine_pm_put(ce->engine);
 }
+
+struct i915_request *intel_context_create_request(struct intel_context *ce)
+{
+	struct i915_request *rq;
+	int err;
+
+	err = intel_context_pin(ce);
+	if (unlikely(err))
+		return ERR_PTR(err);
+
+	rq = i915_request_create(ce);
+	intel_context_unpin(ce);
+
+	return rq;
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index b746add6b71d..63392c88cd98 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -12,24 +12,16 @@
 #include "intel_context_types.h"
 #include "intel_engine_types.h"
 
-struct intel_context *intel_context_alloc(void);
-void intel_context_free(struct intel_context *ce);
-
 void intel_context_init(struct intel_context *ce,
 			struct i915_gem_context *ctx,
 			struct intel_engine_cs *engine);
 
-/**
- * intel_context_lookup - Find the matching HW context for this (ctx, engine)
- * @ctx - the parent GEM context
- * @engine - the target HW engine
- *
- * May return NULL if the HW context hasn't been instantiated (i.e. unused).
- */
 struct intel_context *
-intel_context_lookup(struct i915_gem_context *ctx,
+intel_context_create(struct i915_gem_context *ctx,
 		     struct intel_engine_cs *engine);
 
+void intel_context_free(struct intel_context *ce);
+
 /**
  * intel_context_lock_pinned - Stablises the 'pinned' status of the HW context
  * @ce - the context
@@ -71,17 +63,6 @@ static inline void intel_context_unlock_pinned(struct intel_context *ce)
 	mutex_unlock(&ce->pin_mutex);
 }
 
-struct intel_context *
-__intel_context_insert(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine,
-		       struct intel_context *ce);
-void
-__intel_context_remove(struct intel_context *ce);
-
-struct intel_context *
-intel_context_instance(struct i915_gem_context *ctx,
-		       struct intel_engine_cs *engine);
-
 int __intel_context_do_pin(struct intel_context *ce);
 
 static inline int intel_context_pin(struct intel_context *ce)
@@ -144,4 +125,6 @@ static inline void intel_context_timeline_unlock(struct intel_context *ce)
 	mutex_unlock(&ce->ring->timeline->mutex);
 }
 
+struct i915_request *intel_context_create_request(struct intel_context *ce);
+
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index f02d27734e3b..3579c2708321 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -10,7 +10,6 @@
 #include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
-#include <linux/rbtree.h>
 #include <linux/types.h>
 
 #include "i915_active_types.h"
@@ -61,7 +60,6 @@ struct intel_context {
 	struct i915_active_request active_tracker;
 
 	const struct intel_context_ops *ops;
-	struct rb_node node;
 
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 6cb90137b2fa..2a1c438bf0ad 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -716,7 +716,7 @@ static int pin_context(struct i915_gem_context *ctx,
 	struct intel_context *ce;
 	int err;
 
-	ce = intel_context_instance(ctx, engine);
+	ce = i915_gem_context_get_engine(ctx, engine->id);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 85cdbfe1d989..2941916b37bf 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -23,6 +23,7 @@
  */
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
 
@@ -286,7 +287,7 @@ int mock_engine_init(struct intel_engine_cs *engine)
 	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
 
 	engine->kernel_context =
-		intel_context_instance(i915->kernel_context, engine);
+		i915_gem_context_get_engine(i915->kernel_context, engine->id);
 	if (IS_ERR(engine->kernel_context))
 		goto err_timeline;
 
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 606fc2713240..8b6574e1b495 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -1188,7 +1188,7 @@ int intel_vgpu_setup_submission(struct intel_vgpu *vgpu)
 		INIT_LIST_HEAD(&s->workload_q_head[i]);
 		s->shadow[i] = ERR_PTR(-EINVAL);
 
-		ce = intel_context_instance(ctx, engine);
+		ce = i915_gem_context_get_engine(ctx, i);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
 			goto out_shadow_ctx;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 50266e87c225..8a0278618279 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4312,8 +4312,9 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 
 static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 {
-	struct i915_gem_context *ctx;
 	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
 	enum intel_engine_id id;
 	int err = 0;
 
@@ -4330,18 +4331,21 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
+	e = i915_gem_context_engine_list_lock(ctx);
+
 	for_each_engine(engine, i915, id) {
+		struct intel_context *ce = e->engines[id];
 		struct i915_request *rq;
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = intel_context_create_request(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto out_ctx;
+			goto err_active;
 		}
 
 		err = 0;
-		if (engine->init_context)
-			err = engine->init_context(rq);
+		if (rq->engine->init_context)
+			err = rq->engine->init_context(rq);
 
 		i915_request_add(rq);
 		if (err)
@@ -4355,15 +4359,10 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 	for_each_engine(engine, i915, id) {
-		struct intel_context *ce;
-		struct i915_vma *state;
+		struct intel_context *ce = e->engines[id];
+		struct i915_vma *state = ce->state;
 		void *vaddr;
 
-		ce = intel_context_lookup(ctx, engine);
-		if (!ce)
-			continue;
-
-		state = ce->state;
 		if (!state)
 			continue;
 
@@ -4419,6 +4418,7 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	}
 
 out_ctx:
+	i915_gem_context_engine_list_unlock(ctx);
 	i915_gem_context_set_closed(ctx);
 	i915_gem_context_put(ctx);
 	return err;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5d6342734821..66869fedd364 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -150,7 +150,7 @@ lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
 	if (!engine)
 		return ERR_PTR(-EINVAL);
 
-	return intel_context_instance(ctx, engine);
+	return i915_gem_context_get_engine(ctx, engine->id);
 }
 
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
@@ -242,10 +242,51 @@ static void release_hw_id(struct i915_gem_context *ctx)
 	mutex_unlock(&i915->contexts.mutex);
 }
 
-static void i915_gem_context_free(struct i915_gem_context *ctx)
+static void __free_engines(struct i915_gem_engines *e, unsigned int count)
+{
+	while (count--) {
+		if (!e->engines[count])
+			continue;
+
+		intel_context_put(e->engines[count]);
+	}
+	kfree(e);
+}
+
+static void free_engines(struct i915_gem_engines *e)
+{
+	__free_engines(e, e->num_engines);
+}
+
+static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 {
-	struct intel_context *it, *n;
+	struct intel_engine_cs *engine;
+	struct i915_gem_engines *e;
+	enum intel_engine_id id;
 
+	e = kzalloc(struct_size(e, engines, I915_NUM_ENGINES), GFP_KERNEL);
+	if (!e)
+		return ERR_PTR(-ENOMEM);
+
+	e->i915 = ctx->i915;
+	for_each_engine(engine, ctx->i915, id) {
+		struct intel_context *ce;
+
+		ce = intel_context_create(ctx, engine);
+		if (IS_ERR(ce)) {
+			__free_engines(e, id);
+			return ERR_CAST(ce);
+		}
+
+		e->engines[id] = ce;
+	}
+	e->num_engines = id;
+
+	return e;
+}
+
+static void i915_gem_context_free(struct i915_gem_context *ctx)
+{
 	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 	GEM_BUG_ON(!list_empty(&ctx->active_engines));
@@ -253,8 +294,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	release_hw_id(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
 
-	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
-		intel_context_put(it);
+	free_engines(rcu_access_pointer(ctx->engines));
+	mutex_destroy(&ctx->engines_mutex);
 
 	if (ctx->timeline)
 		i915_timeline_put(ctx->timeline);
@@ -363,6 +404,8 @@ static struct i915_gem_context *
 __create_context(struct drm_i915_private *dev_priv)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
+	int err;
 	int i;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -376,8 +419,13 @@ __create_context(struct drm_i915_private *dev_priv)
 	INIT_LIST_HEAD(&ctx->active_engines);
 	mutex_init(&ctx->mutex);
 
-	ctx->hw_contexts = RB_ROOT;
-	spin_lock_init(&ctx->hw_contexts_lock);
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e)) {
+		err = PTR_ERR(e);
+		goto err_free;
+	}
+	RCU_INIT_POINTER(ctx->engines, e);
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -399,6 +447,10 @@ __create_context(struct drm_i915_private *dev_priv)
 		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
 
 	return ctx;
+
+err_free:
+	kfree(ctx);
+	return ERR_PTR(err);
 }
 
 static struct i915_hw_ppgtt *
@@ -862,7 +914,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 {
 	struct drm_i915_private *i915 = ctx->i915;
 	struct context_barrier_task *cb;
-	struct intel_context *ce, *next;
+	struct i915_gem_engines_iter it;
 	int err = 0;
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -875,22 +927,27 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	i915_active_init(i915, &cb->base, cb_retire);
 	i915_active_acquire(&cb->base);
 
-	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
-		struct intel_engine_cs *engine = ce->engine;
+	for_each_gem_engine(it, ctx) {
+		struct intel_context *ce = it.context;
 		struct i915_request *rq;
 
-		if (!(engine->mask & engines))
+		if (!(ce->engine->mask & engines))
+			continue;
+
+		if (!intel_context_is_pinned(ce))
 			continue;
 
 		if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-				       engine->mask)) {
+				       ce->engine->mask)) {
 			err = -ENXIO;
+			i915_gem_engines_iter_fini(&it);
 			break;
 		}
 
-		rq = i915_request_alloc(engine, ctx);
+		rq = intel_context_create_request(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
+			i915_gem_engines_iter_fini(&it);
 			break;
 		}
 
@@ -901,8 +958,10 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 			err = i915_active_ref(&cb->base, rq->fence.context, rq);
 
 		i915_request_add(rq);
-		if (err)
+		if (err) {
+			i915_gem_engines_iter_fini(&it);
 			break;
+		}
 	}
 
 	cb->task = err ? NULL : task; /* caller needs to unwind instead */
@@ -1739,6 +1798,43 @@ int __i915_gem_context_pin_hw_id(struct i915_gem_context *ctx)
 	return err;
 }
 
+/* GEM context-engines iterator: for_each_gem_engine() */
+bool i915_gem_engines_iter_next(struct i915_gem_engines_iter *it)
+{
+	if (!it->engines)
+		goto end;
+
+	do {
+		if (it->idx >= it->engines->num_engines)
+			goto end;
+
+		it->context = it->engines->engines[it->idx++];
+	} while (!it->context);
+
+	return true;
+
+end:
+	i915_gem_engines_iter_fini(it);
+	return false;
+}
+
+void i915_gem_engines_iter_init(struct i915_gem_engines_iter *it,
+				struct i915_gem_context *ctx)
+{
+	it->gem_context = ctx;
+	it->engines = i915_gem_context_engine_list_lock(ctx);
+	it->idx = 0;
+}
+
+void i915_gem_engines_iter_fini(struct i915_gem_engines_iter *it)
+{
+	struct i915_gem_context *ctx;
+
+	ctx = fetch_and_zero(&it->gem_context);
+	if (ctx)
+		i915_gem_context_engine_list_unlock(ctx);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_context.c"
 #include "selftests/i915_gem_context.c"
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 5a8e080499fb..25238f750286 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -176,6 +176,57 @@ static inline void i915_gem_context_put(struct i915_gem_context *ctx)
 	kref_put(&ctx->ref, i915_gem_context_release);
 }
 
+static inline struct i915_gem_engines *
+i915_gem_context_engine_list(struct i915_gem_context *ctx)
+{
+	return rcu_dereference_protected(ctx->engines,
+					 lockdep_is_held(&ctx->engines_mutex));
+}
+
+static inline struct i915_gem_engines *
+i915_gem_context_engine_list_lock(struct i915_gem_context *ctx)
+	__acquires(&ctx->engines_mutex)
+{
+	mutex_lock(&ctx->engines_mutex);
+	return i915_gem_context_engine_list(ctx);
+}
+
+static inline void
+i915_gem_context_engine_list_unlock(struct i915_gem_context *ctx)
+	__releases(&ctx->engines_mutex)
+{
+	mutex_unlock(&ctx->engines_mutex);
+}
+
+static inline struct intel_context *
+i915_gem_context_lookup_engine(struct i915_gem_context *ctx, unsigned int idx)
+{
+	return i915_gem_context_engine_list(ctx)->engines[idx];
+}
+
+static inline struct intel_context *
+i915_gem_context_get_engine(struct i915_gem_context *ctx, unsigned int idx)
+{
+	struct intel_context *ce = ERR_PTR(-EINVAL);
+
+	rcu_read_lock(); {
+		struct i915_gem_engines *e = rcu_dereference(ctx->engines);
+		if (likely(idx < e->num_engines && e->engines[idx]))
+			ce = intel_context_get(e->engines[idx]);
+	} rcu_read_unlock();
+
+	return ce;
+}
+
+#define for_each_gem_engine(it, ctx) \
+	for (i915_gem_engines_iter_init(&(it), (ctx)); \
+	     i915_gem_engines_iter_next(&(it));)
+
+void i915_gem_engines_iter_init(struct i915_gem_engines_iter *it,
+				struct i915_gem_context *ctx);
+bool i915_gem_engines_iter_next(struct i915_gem_engines_iter *it);
+void i915_gem_engines_iter_fini(struct i915_gem_engines_iter *it);
+
 struct i915_lut_handle *i915_lut_handle_alloc(void);
 void i915_lut_handle_free(struct i915_lut_handle *lut);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index d282a6ab3b9f..5d146b7b84ec 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -29,6 +29,20 @@ struct i915_hw_ppgtt;
 struct i915_timeline;
 struct intel_ring;
 
+struct i915_gem_engines {
+	struct rcu_work rcu;
+	struct drm_i915_private *i915;
+	unsigned int num_engines;
+	struct intel_context *engines[];
+};
+
+struct i915_gem_engines_iter {
+	struct intel_context *context;
+	struct i915_gem_context *gem_context;
+	struct i915_gem_engines *engines;
+	unsigned int idx;
+};
+
 /**
  * struct i915_gem_context - client state
  *
@@ -42,6 +56,21 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	/**
+	 * @engines: User defined engines for this context
+	 *
+	 * NULL means to use legacy definitions (including random meaning of
+	 * I915_EXEC_BSD with I915_EXEC_BSD_SELECTOR overrides).
+	 *
+	 * If defined, execbuf uses I915_EXEC_RING_MASK as an index into the
+	 * array, and various other uAPIs use an index into this array to
+	 * select an engine to operate on.
+	 *
+	 * User defined by I915_CONTEXT_PARAM_ENGINE.
+	 */
+	struct i915_gem_engines __rcu *engines;
+	struct mutex engines_mutex; /* guards writes to engines */
+
 	struct i915_timeline *timeline;
 
 	/**
@@ -134,10 +163,6 @@ struct i915_gem_context {
 
 	struct i915_sched_attr sched;
 
-	/** hw_contexts: per-engine logical HW state */
-	struct rb_root hw_contexts;
-	spinlock_t hw_contexts_lock;
-
 	/** ring_size: size for allocating the per-engine ring buffer */
 	u32 ring_size;
 	/** desc_template: invariant fields for the HW context descriptor */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 166a33c0d3ed..679f7c1561ba 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2076,9 +2076,7 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
 	return file_priv->bsd_engine;
 }
 
-#define I915_USER_RINGS (4)
-
-static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
+static const enum intel_engine_id user_ring_map[] = {
 	[I915_EXEC_DEFAULT]	= RCS0,
 	[I915_EXEC_RENDER]	= RCS0,
 	[I915_EXEC_BLT]		= BCS0,
@@ -2086,10 +2084,8 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static int eb_pin_context(struct i915_execbuffer *eb,
-			  struct intel_engine_cs *engine)
+static int eb_pin_context(struct i915_execbuffer *eb, struct intel_context *ce)
 {
-	struct intel_context *ce;
 	int err;
 
 	/*
@@ -2100,21 +2096,16 @@ static int eb_pin_context(struct i915_execbuffer *eb,
 	if (err)
 		return err;
 
-	ce = intel_context_instance(eb->gem_context, engine);
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
-
 	/*
 	 * Pinning the contexts may generate requests in order to acquire
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
 	err = intel_context_pin(ce);
-	intel_context_put(ce);
 	if (err)
 		return err;
 
-	eb->engine = engine;
+	eb->engine = ce->engine;
 	eb->context = ce;
 	return 0;
 }
@@ -2124,25 +2115,19 @@ static void eb_unpin_context(struct i915_execbuffer *eb)
 	intel_context_unpin(eb->context);
 }
 
-static int
-eb_select_engine(struct i915_execbuffer *eb,
-		 struct drm_file *file,
-		 struct drm_i915_gem_execbuffer2 *args)
+static unsigned int
+eb_select_legacy_ring(struct i915_execbuffer *eb,
+		      struct drm_file *file,
+		      struct drm_i915_gem_execbuffer2 *args)
 {
 	struct drm_i915_private *i915 = eb->i915;
 	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
-	struct intel_engine_cs *engine;
-
-	if (user_ring_id > I915_USER_RINGS) {
-		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
-		return -EINVAL;
-	}
 
-	if ((user_ring_id != I915_EXEC_BSD) &&
-	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
+	if (user_ring_id != I915_EXEC_BSD &&
+	    (args->flags & I915_EXEC_BSD_MASK)) {
 		DRM_DEBUG("execbuf with non bsd ring but with invalid "
 			  "bsd dispatch flags: %d\n", (int)(args->flags));
-		return -EINVAL;
+		return -1;
 	}
 
 	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(i915, VCS1)) {
@@ -2157,20 +2142,39 @@ eb_select_engine(struct i915_execbuffer *eb,
 		} else {
 			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
 				  bsd_idx);
-			return -EINVAL;
+			return -1;
 		}
 
-		engine = i915->engine[_VCS(bsd_idx)];
-	} else {
-		engine = i915->engine[user_ring_map[user_ring_id]];
+		return _VCS(bsd_idx);
 	}
 
-	if (!engine) {
-		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
-		return -EINVAL;
+	if (user_ring_id >= ARRAY_SIZE(user_ring_map)) {
+		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
+		return -1;
 	}
 
-	return eb_pin_context(eb, engine);
+	return user_ring_map[user_ring_id];
+}
+
+static int
+eb_select_engine(struct i915_execbuffer *eb,
+		 struct drm_file *file,
+		 struct drm_i915_gem_execbuffer2 *args)
+{
+	struct intel_context *ce;
+	unsigned int idx;
+	int err;
+
+	idx = eb_select_legacy_ring(eb, file, args);
+
+	ce = i915_gem_context_get_engine(eb->gem_context, idx);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	err = eb_pin_context(eb, ce);
+	intel_context_put(ce);
+
+	return err;
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index afaeabe5e531..6cec24f1ade6 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1203,35 +1203,36 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 					    struct i915_gem_context *ctx)
 {
-	struct intel_engine_cs *engine = i915->engine[RCS0];
-	struct intel_context *ce;
+	struct i915_gem_engines_iter it;
 	int err;
 
-	ce = intel_context_instance(ctx, engine);
-	if (IS_ERR(ce))
-		return ce;
-
 	err = i915_mutex_lock_interruptible(&i915->drm);
-	if (err) {
-		intel_context_put(ce);
+	if (err)
 		return ERR_PTR(err);
+
+	for_each_gem_engine(it, ctx) {
+		struct intel_context *ce = it.context;
+
+		if (ce->engine->class != RENDER_CLASS)
+			continue;
+
+		/*
+		 * As the ID is the gtt offset of the context's vma we
+		 * pin the vma to ensure the ID remains fixed.
+		 */
+		err = intel_context_pin(ce);
+		if (err == 0) {
+			i915->perf.oa.pinned_ctx = ce;
+			i915_gem_engines_iter_fini(&it);
+			break;
+		}
 	}
 
-	/*
-	 * As the ID is the gtt offset of the context's vma we
-	 * pin the vma to ensure the ID remains fixed.
-	 *
-	 * NB: implied RCS engine...
-	 */
-	err = intel_context_pin(ce);
 	mutex_unlock(&i915->drm.struct_mutex);
-	intel_context_put(ce);
 	if (err)
 		return ERR_PTR(err);
 
-	i915->perf.oa.pinned_ctx = ce;
-
-	return ce;
+	return i915->perf.oa.pinned_ctx;
 }
 
 /**
@@ -1717,7 +1718,6 @@ gen8_update_reg_state_unlocked(struct intel_context *ce,
 static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 				       const struct i915_oa_config *oa_config)
 {
-	struct intel_engine_cs *engine = dev_priv->engine[RCS0];
 	unsigned int map_type = i915_coherent_map_type(dev_priv);
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
@@ -1746,30 +1746,40 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
-		struct intel_context *ce = intel_context_lookup(ctx, engine);
-		u32 *regs;
+		struct i915_gem_engines_iter it;
 
-		/* OA settings will be set upon first use */
-		if (!ce || !ce->state)
-			continue;
+		for_each_gem_engine(it, ctx) {
+			struct intel_context *ce = it.context;
+			u32 *regs;
+
+			if (ce->engine->class != RENDER_CLASS)
+				continue;
 
-		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
-		if (IS_ERR(regs))
-			return PTR_ERR(regs);
+			/* OA settings will be set upon first use */
+			if (!ce->state)
+				continue;
 
-		ce->state->obj->mm.dirty = true;
-		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
+			regs = i915_gem_object_pin_map(ce->state->obj,
+						       map_type);
+			if (IS_ERR(regs)) {
+				i915_gem_engines_iter_fini(&it);
+				return PTR_ERR(regs);
+			}
 
-		gen8_update_reg_state_unlocked(ce, regs, oa_config);
+			ce->state->obj->mm.dirty = true;
+			regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
 
-		i915_gem_object_unpin_map(ce->state->obj);
+			gen8_update_reg_state_unlocked(ce, regs, oa_config);
+
+			i915_gem_object_unpin_map(ce->state->obj);
+		}
 	}
 
 	/*
 	 * Apply the configuration by doing one context restore of the edited
 	 * context image.
 	 */
-	rq = i915_request_create(engine->kernel_context);
+	rq = i915_request_create(dev_priv->engine[RCS0]->kernel_context);
 	if (IS_ERR(rq))
 		return PTR_ERR(rq);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index eaf8851dd3bc..65a949b3659a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -785,7 +785,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	struct drm_i915_private *i915 = engine->i915;
 	struct intel_context *ce;
 	struct i915_request *rq;
-	int err;
 
 	/*
 	 * Preempt contexts are reserved for exclusive use to inject a
@@ -799,21 +798,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * GGTT space, so do this first before we reserve a seqno for
 	 * ourselves.
 	 */
-	ce = intel_context_instance(ctx, engine);
+	ce = i915_gem_context_get_engine(ctx, engine->id);
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
-	err = intel_context_pin(ce);
-	if (err) {
-		rq = ERR_PTR(err);
-		goto err_put;
-	}
-
-	rq = i915_request_create(ce);
-	intel_context_unpin(ce);
-
-err_put:
+	rq = intel_context_create_request(ce);
 	intel_context_put(ce);
+
 	return rq;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 1b6d6403ee92..9dbb036fd7f1 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -364,11 +364,9 @@ static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
 static void guc_stage_desc_init(struct intel_guc_client *client)
 {
 	struct intel_guc *guc = client->guc;
-	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx = client->owner;
+	struct i915_gem_engines_iter it;
 	struct guc_stage_desc *desc;
-	unsigned int tmp;
 	u32 gfx_addr;
 
 	desc = __get_stage_desc(client);
@@ -382,10 +380,12 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 	desc->priority = client->priority;
 	desc->db_id = client->doorbell_id;
 
-	for_each_engine_masked(engine, dev_priv, client->engines, tmp) {
-		struct intel_context *ce = intel_context_lookup(ctx, engine);
-		u32 guc_engine_id = engine->guc_id;
-		struct guc_execlist_context *lrc = &desc->lrc[guc_engine_id];
+	for_each_gem_engine(it, ctx) {
+		struct intel_context *ce = it.context;
+		struct guc_execlist_context *lrc;
+
+		if (!(ce->engine->mask & client->engines))
+			continue;
 
 		/* TODO: We have a design issue to be solved here. Only when we
 		 * receive the first batch, we know which engine is used by the
@@ -394,8 +394,8 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * for now who owns a GuC client. But for future owner of GuC
 		 * client, need to make sure lrc is pinned prior to enter here.
 		 */
-		if (!ce || !ce->state)
-			break;	/* XXX: continue? */
+		if (!ce->state)
+			continue;
 
 		/*
 		 * XXX: When this is a GUC_STAGE_DESC_ATTR_KERNEL client (proxy
@@ -404,6 +404,7 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * Instead, the GuC uses the LRCA of the user mode context (see
 		 * guc_add_request below).
 		 */
+		lrc = &desc->lrc[ce->engine->guc_id];
 		lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
 		/* The state page is after PPHWSP */
@@ -414,14 +415,14 @@ static void guc_stage_desc_init(struct intel_guc_client *client)
 		 * here. In proxy submission, it wants the stage id
 		 */
 		lrc->context_id = (client->stage_id << GUC_ELC_CTXID_OFFSET) |
-				(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
+				(ce->engine->guc_id << GUC_ELC_ENGINE_OFFSET);
 
 		lrc->ring_begin = intel_guc_ggtt_offset(guc, ce->ring->vma);
 		lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
 		lrc->ring_next_free_location = lrc->ring_begin;
 		lrc->ring_current_tail_pointer_value = 0;
 
-		desc->engines_used |= (1 << guc_engine_id);
+		desc->engines_used |= BIT(ce->engine->guc_id);
 	}
 
 	DRM_DEBUG_DRIVER("Host engines 0x%x => GuC engines used 0x%x\n",
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 214d1fd2f4dc..7fd224a4ca4c 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1094,7 +1094,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	wakeref = intel_runtime_pm_get(i915);
 
-	ce = intel_context_instance(ctx, i915->engine[RCS0]);
+	ce = i915_gem_context_get_engine(ctx, RCS0);
 	if (IS_ERR(ce)) {
 		ret = PTR_ERR(ce);
 		goto out_rpm;
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index 0426093bf1d9..71c750693585 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -30,6 +30,7 @@ mock_context(struct drm_i915_private *i915,
 	     const char *name)
 {
 	struct i915_gem_context *ctx;
+	struct i915_gem_engines *e;
 	int ret;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -40,8 +41,11 @@ mock_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->link);
 	ctx->i915 = i915;
 
-	ctx->hw_contexts = RB_ROOT;
-	spin_lock_init(&ctx->hw_contexts_lock);
+	mutex_init(&ctx->engines_mutex);
+	e = default_engines(ctx);
+	if (IS_ERR(e))
+		goto err_free;
+	RCU_INIT_POINTER(ctx->engines, e);
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -51,7 +55,7 @@ mock_context(struct drm_i915_private *i915,
 
 	ret = i915_gem_context_pin_hw_id(ctx);
 	if (ret < 0)
-		goto err_handles;
+		goto err_engines;
 
 	if (name) {
 		struct i915_hw_ppgtt *ppgtt;
@@ -69,7 +73,9 @@ mock_context(struct drm_i915_private *i915,
 
 	return ctx;
 
-err_handles:
+err_engines:
+	free_engines(rcu_access_pointer(ctx->engines));
+err_free:
 	kfree(ctx);
 	return NULL;
 
-- 
2.20.1
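
The heart of the patch above is the replacement of the per-context rb-tree
of HW contexts with an RCU-protected engines[] array, indexed by execbuf
and iterated by the perf and GuC setup paths. A minimal sketch of the two
access patterns the diff relies on follows; it reuses the names visible in
the diff (i915_gem_engines, for_each_gem_engine, i915_gem_engines_iter_fini),
but the helper bodies are illustrative assumptions, not the exact upstream
implementations.

/*
 * Sketch only: indexed lookup into the RCU-protected engines array, as
 * eb_select_engine() above does via i915_gem_context_get_engine(). A
 * reference must be taken before leaving the RCU read section.
 */
static struct intel_context *
sketch_get_engine(struct i915_gem_context *ctx, unsigned int idx)
{
        struct i915_gem_engines *e;
        struct intel_context *ce = ERR_PTR(-EINVAL);

        rcu_read_lock();
        e = rcu_dereference(ctx->engines);
        if (e && idx < e->num_engines && e->engines[idx]) {
                ce = e->engines[idx];
                intel_context_get(ce);
        }
        rcu_read_unlock();

        return ce;
}

/*
 * Sketch only: engine iteration with the early-exit cleanup that the
 * i915_perf.c hunks above depend on. Breaking out of
 * for_each_gem_engine() without i915_gem_engines_iter_fini() would leak
 * whatever lock or reference the iterator holds.
 */
static int sketch_pin_first_render(struct i915_gem_context *ctx)
{
        struct i915_gem_engines_iter it;
        int err = -ENODEV;

        for_each_gem_engine(it, ctx) {
                struct intel_context *ce = it.context;

                if (ce->engine->class != RENDER_CLASS)
                        continue;

                err = intel_context_pin(ce);
                i915_gem_engines_iter_fini(&it);
                break;
        }

        return err;
}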


* ✗ Fi.CI.BAT: failure for series starting with [01/29] drm/i915: Mark up ips for RCU protection (rev2)
  2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
                   ` (30 preceding siblings ...)
  2019-04-08 10:28 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2019-04-11 22:20 ` Patchwork
  31 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2019-04-11 22:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/29] drm/i915: Mark up ips for RCU protection (rev2)
URL   : https://patchwork.freedesktop.org/series/59155/
State : failure

== Summary ==

Applying: drm/i915: Mark up ips for RCU protection
Applying: drm/i915/guc: Replace WARN with a DRM_ERROR
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/intel_guc_submission.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_guc_submission.c
No changes -- Patch already applied.
Applying: drm/i915: Use static allocation for i915_globals_park()
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_globals.c
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.
Applying: drm/i915: Consolidate the timeline->barrier
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_gem_context.c
M	drivers/gpu/drm/i915/i915_request.c
M	drivers/gpu/drm/i915/i915_timeline.c
M	drivers/gpu/drm/i915/i915_timeline.h
M	drivers/gpu/drm/i915/i915_timeline_types.h
M	drivers/gpu/drm/i915/selftests/mock_timeline.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/i915_request.c
No changes -- Patch already applied.
Applying: drm/i915: Store the default sseu setup on the engine
Applying: drm/i915: Move GraphicsTechnology files under gt/
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_debugfs.c
M	drivers/gpu/drm/i915/i915_drv.h
M	drivers/gpu/drm/i915/i915_gem.c
M	drivers/gpu/drm/i915/i915_gem_gtt.c
M	drivers/gpu/drm/i915/i915_request.c
M	drivers/gpu/drm/i915/intel_display.c
M	drivers/gpu/drm/i915/intel_engine_cs.c
M	drivers/gpu/drm/i915/intel_engine_types.h
M	drivers/gpu/drm/i915/intel_guc_submission.c
M	drivers/gpu/drm/i915/intel_lrc.c
M	drivers/gpu/drm/i915/intel_lrc.h
M	drivers/gpu/drm/i915/intel_ringbuffer.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_guc_submission.c
Auto-merging drivers/gpu/drm/i915/intel_display.c
Auto-merging drivers/gpu/drm/i915/i915_request.c
Auto-merging drivers/gpu/drm/i915/i915_gem_gtt.c
Auto-merging drivers/gpu/drm/i915/i915_gem.c
Auto-merging drivers/gpu/drm/i915/i915_drv.h
Auto-merging drivers/gpu/drm/i915/i915_debugfs.c
Auto-merging drivers/gpu/drm/i915/gt/intel_ringbuffer.c
Auto-merging drivers/gpu/drm/i915/gt/intel_lrc.h
Auto-merging drivers/gpu/drm/i915/gt/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/gt/intel_engine_types.h
Auto-merging drivers/gpu/drm/i915/gt/intel_engine_cs.c
Applying: drm/i915: Only reset the pinned kernel contexts on resume
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/gt/intel_context_types.h
M	drivers/gpu/drm/i915/gt/intel_engine.h
M	drivers/gpu/drm/i915/gt/intel_engine_cs.c
M	drivers/gpu/drm/i915/gt/intel_lrc.c
M	drivers/gpu/drm/i915/gt/intel_lrc.h
M	drivers/gpu/drm/i915/gt/intel_ringbuffer.c
M	drivers/gpu/drm/i915/i915_drv.h
M	drivers/gpu/drm/i915/i915_gem.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/gt/intel_lrc.h
Auto-merging drivers/gpu/drm/i915/gt/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/gt/intel_engine_cs.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gt/intel_engine_cs.c
Auto-merging drivers/gpu/drm/i915/gt/intel_engine.h
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0007 drm/i915: Only reset the pinned kernel contexts on resume
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


* Re: [PATCH 23/29] drm/i915: Move i915_request_alloc into selftests/
  2019-04-08  9:17 ` [PATCH 23/29] drm/i915: Move i915_request_alloc into selftests/ Chris Wilson
@ 2019-04-12  7:05   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-12  7:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> Having transitioned GEM over to using intel_context as its primary means
> of tracking the GEM context and engine combined and using
>   i915_request_create(), we can move the older i915_request_alloc()
> helper function into selftests/
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile                 |  1 +
>   drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  9 ++--
>   drivers/gpu/drm/i915/gt/selftest_lrc.c        | 13 ++---
>   .../gpu/drm/i915/gt/selftest_workarounds.c    |  5 +-
>   drivers/gpu/drm/i915/i915_request.c           | 47 -------------------
>   drivers/gpu/drm/i915/i915_request.h           |  3 --
>   drivers/gpu/drm/i915/selftests/huge_pages.c   |  3 +-
>   drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
>   .../gpu/drm/i915/selftests/i915_gem_context.c | 13 ++---
>   .../gpu/drm/i915/selftests/i915_gem_evict.c   |  3 +-
>   drivers/gpu/drm/i915/selftests/i915_request.c |  4 +-
>   .../gpu/drm/i915/selftests/igt_gem_utils.c    | 43 +++++++++++++++++
>   .../gpu/drm/i915/selftests/igt_gem_utils.h    | 17 +++++++
>   drivers/gpu/drm/i915/selftests/igt_spinner.c  |  3 +-
>   drivers/gpu/drm/i915/selftests/mock_request.c |  3 +-
>   15 files changed, 96 insertions(+), 76 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/selftests/igt_gem_utils.c
>   create mode 100644 drivers/gpu/drm/i915/selftests/igt_gem_utils.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index dd8d923aa1c6..58643373495c 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -193,6 +193,7 @@ i915-$(CONFIG_DRM_I915_SELFTEST) += \
>   	selftests/i915_random.o \
>   	selftests/i915_selftest.o \
>   	selftests/igt_flush_test.o \
> +	selftests/igt_gem_utils.o \
>   	selftests/igt_live_test.o \
>   	selftests/igt_reset.o \
>   	selftests/igt_spinner.o
> diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> index 6004d6907e9c..c206748629dc 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
> @@ -29,6 +29,7 @@
>   #include "i915_selftest.h"
>   #include "selftests/i915_random.h"
>   #include "selftests/igt_flush_test.h"
> +#include "selftests/igt_gem_utils.h"
>   #include "selftests/igt_reset.h"
>   #include "selftests/igt_wedge_me.h"
>   
> @@ -175,7 +176,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>   	if (err)
>   		goto unpin_vma;
>   
> -	rq = i915_request_alloc(engine, h->ctx);
> +	rq = igt_request_alloc(h->ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto unpin_hws;
> @@ -455,7 +456,7 @@ static int igt_reset_nop(void *arg)
>   			for (i = 0; i < 16; i++) {
>   				struct i915_request *rq;
>   
> -				rq = i915_request_alloc(engine, ctx);
> +				rq = igt_request_alloc(ctx, engine);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
>   					break;
> @@ -554,7 +555,7 @@ static int igt_reset_nop_engine(void *arg)
>   			for (i = 0; i < 16; i++) {
>   				struct i915_request *rq;
>   
> -				rq = i915_request_alloc(engine, ctx);
> +				rq = igt_request_alloc(ctx, engine);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
>   					break;
> @@ -800,7 +801,7 @@ static int active_engine(void *data)
>   		struct i915_request *new;
>   
>   		mutex_lock(&engine->i915->drm.struct_mutex);
> -		new = i915_request_alloc(engine, ctx[idx]);
> +		new = igt_request_alloc(ctx[idx], engine);
>   		if (IS_ERR(new)) {
>   			mutex_unlock(&engine->i915->drm.struct_mutex);
>   			err = PTR_ERR(new);
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index cd0551f97c2f..84538f69185b 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -10,6 +10,7 @@
>   #include "i915_selftest.h"
>   #include "selftests/i915_random.h"
>   #include "selftests/igt_flush_test.h"
> +#include "selftests/igt_gem_utils.h"
>   #include "selftests/igt_live_test.h"
>   #include "selftests/igt_spinner.h"
>   #include "selftests/mock_context.h"
> @@ -148,7 +149,7 @@ static int live_busywait_preempt(void *arg)
>   		 * fails, we hang instead.
>   		 */
>   
> -		lo = i915_request_alloc(engine, ctx_lo);
> +		lo = igt_request_alloc(ctx_lo, engine);
>   		if (IS_ERR(lo)) {
>   			err = PTR_ERR(lo);
>   			goto err_vma;
> @@ -192,7 +193,7 @@ static int live_busywait_preempt(void *arg)
>   			goto err_vma;
>   		}
>   
> -		hi = i915_request_alloc(engine, ctx_hi);
> +		hi = igt_request_alloc(ctx_hi, engine);
>   		if (IS_ERR(hi)) {
>   			err = PTR_ERR(hi);
>   			goto err_vma;
> @@ -857,13 +858,13 @@ static int live_chain_preempt(void *arg)
>   			i915_request_add(rq);
>   
>   			for (i = 0; i < count; i++) {
> -				rq = i915_request_alloc(engine, lo.ctx);
> +				rq = igt_request_alloc(lo.ctx, engine);
>   				if (IS_ERR(rq))
>   					goto err_wedged;
>   				i915_request_add(rq);
>   			}
>   
> -			rq = i915_request_alloc(engine, hi.ctx);
> +			rq = igt_request_alloc(hi.ctx, engine);
>   			if (IS_ERR(rq))
>   				goto err_wedged;
>   			i915_request_add(rq);
> @@ -882,7 +883,7 @@ static int live_chain_preempt(void *arg)
>   			}
>   			igt_spinner_end(&lo.spin);
>   
> -			rq = i915_request_alloc(engine, lo.ctx);
> +			rq = igt_request_alloc(lo.ctx, engine);
>   			if (IS_ERR(rq))
>   				goto err_wedged;
>   			i915_request_add(rq);
> @@ -1087,7 +1088,7 @@ static int smoke_submit(struct preempt_smoke *smoke,
>   
>   	ctx->sched.priority = prio;
>   
> -	rq = i915_request_alloc(smoke->engine, ctx);
> +	rq = igt_request_alloc(ctx, smoke->engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto unpin;
> diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> index 028942cac4b4..02511b532b41 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
> @@ -8,6 +8,7 @@
>   #include "intel_reset.h"
>   
>   #include "selftests/igt_flush_test.h"
> +#include "selftests/igt_gem_utils.h"
>   #include "selftests/igt_reset.h"
>   #include "selftests/igt_spinner.h"
>   #include "selftests/igt_wedge_me.h"
> @@ -102,7 +103,7 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
>   	if (err)
>   		goto err_obj;
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_pin;
> @@ -554,7 +555,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
>   		i915_gem_object_unpin_map(batch->obj);
>   		i915_gem_chipset_flush(ctx->i915);
>   
> -		rq = i915_request_alloc(engine, ctx);
> +		rq = igt_request_alloc(ctx, engine);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
>   			goto out_batch;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 5e52c1623c17..67360107d61d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -738,53 +738,6 @@ i915_request_create(struct intel_context *ce)
>   	return rq;
>   }
>   
> -/**
> - * i915_request_alloc - allocate a request structure
> - *
> - * @engine: engine that we wish to issue the request on.
> - * @ctx: context that the request will be associated with.
> - *
> - * Returns a pointer to the allocated request if successful,
> - * or an error code if not.
> - */
> -struct i915_request *
> -i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> -{
> -	struct drm_i915_private *i915 = engine->i915;
> -	struct intel_context *ce;
> -	struct i915_request *rq;
> -	int err;
> -
> -	/*
> -	 * Preempt contexts are reserved for exclusive use to inject a
> -	 * preemption context switch. They are never to be used for any trivial
> -	 * request!
> -	 */
> -	GEM_BUG_ON(ctx == i915->preempt_context);
> -
> -	/*
> -	 * Pinning the contexts may generate requests in order to acquire
> -	 * GGTT space, so do this first before we reserve a seqno for
> -	 * ourselves.
> -	 */
> -	ce = i915_gem_context_get_engine(ctx, engine->id);
> -	if (IS_ERR(ce))
> -		return ERR_CAST(ce);
> -
> -	err = intel_context_pin(ce);
> -	if (err) {
> -		rq = ERR_PTR(err);
> -		goto err_put;
> -	}
> -
> -	rq = i915_request_create(ce);
> -	intel_context_unpin(ce);
> -
> -err_put:
> -	intel_context_put(ce);
> -	return rq;
> -}
> -
>   static int
>   emit_semaphore_wait(struct i915_request *to,
>   		    struct i915_request *from,
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index c0b7539daaf5..ea7a37ad63e2 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -245,9 +245,6 @@ i915_request_create(struct intel_context *ce);
>   
>   struct i915_request *__i915_request_commit(struct i915_request *request);
>   
> -struct i915_request * __must_check
> -i915_request_alloc(struct intel_engine_cs *engine,
> -		   struct i915_gem_context *ctx);
>   void i915_request_retire_upto(struct i915_request *rq);
>   
>   static inline struct i915_request *
> diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
> index 90721b54e7ae..1e1f83326a96 100644
> --- a/drivers/gpu/drm/i915/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
> @@ -26,6 +26,7 @@
>   
>   #include <linux/prime_numbers.h>
>   
> +#include "igt_gem_utils.h"
>   #include "mock_drm.h"
>   #include "i915_random.h"
>   
> @@ -980,7 +981,7 @@ static int gpu_write(struct i915_vma *vma,
>   	if (IS_ERR(batch))
>   		return PTR_ERR(batch);
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_batch;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
> index 7460b77b34e0..3f4033c0e8e2 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
> @@ -8,8 +8,9 @@
>   
>   #include "../i915_selftest.h"
>   
> -#include "mock_context.h"
> +#include "igt_gem_utils.h"
>   #include "igt_flush_test.h"
> +#include "mock_context.h"
>   
>   static int switch_to_context(struct drm_i915_private *i915,
>   			     struct i915_gem_context *ctx)
> @@ -20,7 +21,7 @@ static int switch_to_context(struct drm_i915_private *i915,
>   	for_each_engine(engine, i915, id) {
>   		struct i915_request *rq;
>   
> -		rq = i915_request_alloc(engine, ctx);
> +		rq = igt_request_alloc(ctx, engine);
>   		if (IS_ERR(rq))
>   			return PTR_ERR(rq);
>   
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index deedd1898fe5..b62f005e4d50 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -29,6 +29,7 @@
>   
>   #include "i915_random.h"
>   #include "igt_flush_test.h"
> +#include "igt_gem_utils.h"
>   #include "igt_live_test.h"
>   #include "igt_reset.h"
>   #include "igt_spinner.h"
> @@ -91,7 +92,7 @@ static int live_nop_switch(void *arg)
>   
>   		times[0] = ktime_get_raw();
>   		for (n = 0; n < nctx; n++) {
> -			rq = i915_request_alloc(engine, ctx[n]);
> +			rq = igt_request_alloc(ctx[n], engine);
>   			if (IS_ERR(rq)) {
>   				err = PTR_ERR(rq);
>   				goto out_unlock;
> @@ -121,7 +122,7 @@ static int live_nop_switch(void *arg)
>   			times[1] = ktime_get_raw();
>   
>   			for (n = 0; n < prime; n++) {
> -				rq = i915_request_alloc(engine, ctx[n % nctx]);
> +				rq = igt_request_alloc(ctx[n % nctx], engine);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
>   					goto out_unlock;
> @@ -301,7 +302,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
>   		goto err_vma;
>   	}
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_batch;
> @@ -1350,7 +1351,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
>   	if (err)
>   		goto err_unpin;
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_unpin;
> @@ -1445,7 +1446,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
>   	if (err)
>   		goto err_unpin;
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_unpin;
> @@ -1669,7 +1670,7 @@ static int mock_context_barrier(void *arg)
>   		goto out;
>   	}
>   
> -	rq = i915_request_alloc(i915->engine[RCS0], ctx);
> +	rq = igt_request_alloc(ctx, i915->engine[RCS0]);
>   	if (IS_ERR(rq)) {
>   		pr_err("Request allocation failed!\n");
>   		goto out;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> index 89766688e420..4fc6e5445dd1 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> @@ -24,6 +24,7 @@
>   
>   #include "../i915_selftest.h"
>   
> +#include "igt_gem_utils.h"
>   #include "lib_sw_fence.h"
>   #include "mock_context.h"
>   #include "mock_drm.h"
> @@ -460,7 +461,7 @@ static int igt_evict_contexts(void *arg)
>   
>   			/* We will need some GGTT space for the rq's context */
>   			igt_evict_ctl.fail_if_busy = true;
> -			rq = i915_request_alloc(engine, ctx);
> +			rq = igt_request_alloc(ctx, engine);
>   			igt_evict_ctl.fail_if_busy = false;
>   
>   			if (IS_ERR(rq)) {
> diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
> index 098d7b3aa131..b60591531e4a 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> @@ -267,7 +267,7 @@ static struct i915_request *
>   __live_request_alloc(struct i915_gem_context *ctx,
>   		     struct intel_engine_cs *engine)
>   {
> -	return i915_request_alloc(engine, ctx);
> +	return igt_request_alloc(ctx, engine);
>   }
>   
>   static int __igt_breadcrumbs_smoketest(void *arg)
> @@ -1074,7 +1074,7 @@ max_batches(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
>   	if (HAS_EXECLISTS(ctx->i915))
>   		return INT_MAX;
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		ret = PTR_ERR(rq);
>   	} else {
> diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
> new file mode 100644
> index 000000000000..23299aa80ac4
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.c
> @@ -0,0 +1,43 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2018 Intel Corporation
> + */
> +
> +#include "igt_gem_utils.h"
> +
> +#include "gt/intel_context.h"
> +
> +#include "../i915_gem_context.h"
> +#include "../i915_gem_pm.h"
> +#include "../i915_request.h"
> +
> +struct i915_request *
> +igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
> +{
> +	struct intel_context *ce;
> +	struct i915_request *rq;
> +	int err;
> +
> +	/*
> +	 * Pinning the contexts may generate requests in order to acquire
> +	 * GGTT space, so do this first before we reserve a seqno for
> +	 * ourselves.
> +	 */
> +	ce = i915_gem_context_get_engine(ctx, engine->id);
> +	if (IS_ERR(ce))
> +		return ERR_CAST(ce);
> +
> +	err = intel_context_pin(ce);
> +	if (err) {
> +		rq = ERR_PTR(err);
> +		goto err_put;
> +	}
> +
> +	rq = i915_request_create(ce);
> +	intel_context_unpin(ce);
> +
> +err_put:
> +	intel_context_put(ce);
> +	return rq;
> +}
> diff --git a/drivers/gpu/drm/i915/selftests/igt_gem_utils.h b/drivers/gpu/drm/i915/selftests/igt_gem_utils.h
> new file mode 100644
> index 000000000000..0f17251cf75d
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/selftests/igt_gem_utils.h
> @@ -0,0 +1,17 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2018 Intel Corporation
> + */
> +
> +#ifndef __IGT_GEM_UTILS_H__
> +#define __IGT_GEM_UTILS_H__
> +
> +struct i915_request;
> +struct i915_gem_context;
> +struct intel_engine_cs;
> +
> +struct i915_request *
> +igt_request_alloc(struct i915_gem_context *ctx, struct intel_engine_cs *engine);
> +
> +#endif /* __IGT_GEM_UTILS_H__ */
> diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> index 16890dfe74c0..ece8a8a0d3b0 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> @@ -4,6 +4,7 @@
>    * Copyright © 2018 Intel Corporation
>    */
>   
> +#include "igt_gem_utils.h"
>   #include "igt_spinner.h"
>   
>   int igt_spinner_init(struct igt_spinner *spin, struct drm_i915_private *i915)
> @@ -114,7 +115,7 @@ igt_spinner_create_request(struct igt_spinner *spin,
>   	if (err)
>   		goto unpin_vma;
>   
> -	rq = i915_request_alloc(engine, ctx);
> +	rq = igt_request_alloc(ctx, engine);
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto unpin_hws;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_request.c b/drivers/gpu/drm/i915/selftests/mock_request.c
> index f739ba63057f..b99f7576153c 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_request.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_request.c
> @@ -24,6 +24,7 @@
>   
>   #include "gt/mock_engine.h"
>   
> +#include "igt_gem_utils.h"
>   #include "mock_request.h"
>   
>   struct i915_request *
> @@ -34,7 +35,7 @@ mock_request(struct intel_engine_cs *engine,
>   	struct i915_request *request;
>   
>   	/* NB the i915->requests slab cache is enlarged to fit mock_request */
> -	request = i915_request_alloc(engine, context);
> +	request = igt_request_alloc(context, engine);
>   	if (IS_ERR(request))
>   		return NULL;
>   
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
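
For comparison, the end state after this patch splits request creation into
two paths, both already visible in the diffs of this thread: driver code
resolves the intel_context explicitly, while selftests keep the
(ctx, engine) convenience wrapper. The driver-side shape below is a summary
taken from the i915_request_alloc() body removed above, not new code:

/* Driver path: resolve the intel_context, then build the request on it. */
static struct i915_request *
driver_request(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
{
        struct intel_context *ce;
        struct i915_request *rq;

        ce = i915_gem_context_get_engine(ctx, engine->id);
        if (IS_ERR(ce))
                return ERR_CAST(ce);

        rq = intel_context_create_request(ce);
        intel_context_put(ce);
        return rq;
}

/* Selftest path: rq = igt_request_alloc(ctx, engine), as added above. */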

* Re: [PATCH 24/29] drm/i915: Allow multiple user handles to the same VM
  2019-04-08  9:17 ` [PATCH 24/29] drm/i915: Allow multiple user handles to the same VM Chris Wilson
@ 2019-04-12  7:21   ` Tvrtko Ursulin
  0 siblings, 0 replies; 71+ messages in thread
From: Tvrtko Ursulin @ 2019-04-12  7:21 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 08/04/2019 10:17, Chris Wilson wrote:
> It was noted that we made the same mistake for VM_ID as for object
> handles, whereby we ensured that we only allocated a single handle for
> one ppgtt. This has the unfortunate consequence for userspace that they
> need to reference count the handles to avoid destroying an active ID. If
> we allow multiple handles to the same ppgtt, userspace can freely
> unreference any handle they own without fear of destroying the same
> handle in use elsewhere.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c | 26 ++++++++-----------------
>   drivers/gpu/drm/i915/i915_gem_gtt.h     |  2 --
>   2 files changed, 8 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index c05f949ed893..2a30cab7a65d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -837,8 +837,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
>   	if (err < 0)
>   		goto err_unlock;
>   
> -	GEM_BUG_ON(err == 0); /* reserved for default/unassigned ppgtt */
> -	ppgtt->user_handle = err;
> +	GEM_BUG_ON(err == 0); /* reserved for invalid/unassigned ppgtt */
>   
>   	mutex_unlock(&file_priv->vm_idr_lock);
>   
> @@ -876,10 +875,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
>   		return err;
>   
>   	ppgtt = idr_remove(&file_priv->vm_idr, id);
> -	if (ppgtt) {
> -		GEM_BUG_ON(ppgtt->user_handle != id);
> -		ppgtt->user_handle = 0;
> -	}
>   
>   	mutex_unlock(&file_priv->vm_idr_lock);
>   	if (!ppgtt)
> @@ -1001,18 +996,15 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
>   	if (ret)
>   		goto err_put;
>   
> -	if (!ppgtt->user_handle) {
> -		ret = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
> -		GEM_BUG_ON(!ret);
> -		if (ret < 0)
> -			goto err_unlock;
> +	ret = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
> +	GEM_BUG_ON(!ret);
> +	if (ret < 0)
> +		goto err_unlock;
>   
> -		ppgtt->user_handle = ret;
> -		i915_ppgtt_get(ppgtt);
> -	}
> +	i915_ppgtt_get(ppgtt);
>   
>   	args->size = 0;
> -	args->value = ppgtt->user_handle;
> +	args->value = ret;
>   
>   	ret = 0;
>   err_unlock:
> @@ -1103,10 +1095,8 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
>   		return err;
>   
>   	ppgtt = idr_find(&file_priv->vm_idr, args->value);
> -	if (ppgtt) {
> -		GEM_BUG_ON(ppgtt->user_handle != args->value);
> +	if (ppgtt)
>   		i915_ppgtt_get(ppgtt);
> -	}
>   	mutex_unlock(&file_priv->vm_idr_lock);
>   	if (!ppgtt)
>   		return -ENOENT;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index c8d96e91f3dc..4832bb5c5fc0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -396,8 +396,6 @@ struct i915_hw_ppgtt {
>   		struct i915_page_directory_pointer pdp;	/* GEN8+ */
>   		struct i915_page_directory pd;		/* GEN6-7 */
>   	};
> -
> -	u32 user_handle;
>   };
>   
>   struct gen6_hw_ppgtt {
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
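
At the idr level the behavioural change is small but worth spelling out:
with user_handle gone, nothing deduplicates handles, so each GET ioctl on
the same ppgtt allocates a fresh id holding its own reference. A rough
illustration, reusing only the calls visible in the diff (the wrapper
function itself is hypothetical):

/*
 * Sketch: two lookups of one ppgtt now yield two independent handles,
 * each pinning a reference; destroying id1 leaves id2 fully usable.
 */
static void sketch_two_handles(struct drm_i915_file_private *file_priv,
                               struct i915_hw_ppgtt *ppgtt)
{
        int id1, id2;

        mutex_lock(&file_priv->vm_idr_lock);

        id1 = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
        i915_ppgtt_get(ppgtt);

        id2 = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
        i915_ppgtt_get(ppgtt);

        mutex_unlock(&file_priv->vm_idr_lock);

        /*
         * i915_gem_vm_destroy_ioctl() on id1 does idr_remove() plus one
         * ppgtt put, touching only the reference owned by id1; id2 and
         * its reference are unaffected.
         */
        (void)id1;
        (void)id2;
}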

end of thread

Thread overview: 71+ messages
2019-04-08  9:17 [PATCH 01/29] drm/i915: Mark up ips for RCU protection Chris Wilson
2019-04-08  9:17 ` [PATCH 02/29] drm/i915/guc: Replace WARN with a DRM_ERROR Chris Wilson
2019-04-08 14:26   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 03/29] drm/i915: Use static allocation for i915_globals_park() Chris Wilson
2019-04-08 14:31   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 04/29] drm/i915: Consolidate the timeline->barrier Chris Wilson
2019-04-08 14:42   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 05/29] drm/i915: Store the default sseu setup on the engine Chris Wilson
2019-04-08 14:54   ` Tvrtko Ursulin
2019-04-08 15:57     ` Chris Wilson
2019-04-08 16:04       ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 06/29] drm/i915: Move GraphicsTechnology files under gt/ Chris Wilson
2019-04-08  9:17 ` [PATCH 07/29] drm/i915: Only reset the pinned kernel contexts on resume Chris Wilson
2019-04-10  9:39   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 08/29] drm/i915: Introduce struct intel_wakeref Chris Wilson
2019-04-10  9:49   ` Tvrtko Ursulin
2019-04-10 10:01     ` Chris Wilson
2019-04-10 10:07       ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 09/29] drm/i915: Pull the GEM powermangement coupling into its own file Chris Wilson
2019-04-08 14:56   ` Tvrtko Ursulin
2019-04-08 16:00     ` Chris Wilson
2019-04-10  9:57   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 10/29] drm/i915: Introduce context->enter() and context->exit() Chris Wilson
2019-04-10 10:05   ` Tvrtko Ursulin
2019-04-10 10:13     ` Chris Wilson
2019-04-10 11:06       ` Tvrtko Ursulin
2019-04-10 19:19         ` Chris Wilson
2019-04-08  9:17 ` [PATCH 11/29] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
2019-04-10 10:38   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 12/29] drm/i915: Invert the GEM wakeref hierarchy Chris Wilson
2019-04-08  9:17 ` [PATCH 13/29] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
2019-04-08  9:17 ` [PATCH 14/29] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
2019-04-08 15:17   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 15/29] drm/i915/guc: Replace preempt_client lookup with engine->preempt_context Chris Wilson
2019-04-08 14:57   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 16/29] drm/i915: Export intel_context_instance() Chris Wilson
2019-04-10 12:06   ` Tvrtko Ursulin
2019-04-10 19:32     ` Chris Wilson
2019-04-11 12:57       ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 17/29] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
2019-04-08 15:00   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 18/29] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
2019-04-10 12:25   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 19/29] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
2019-04-10 12:45   ` Tvrtko Ursulin
2019-04-10 12:49     ` Chris Wilson
2019-04-10 13:04       ` Chris Wilson
2019-04-10 14:53         ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 20/29] drm/i915: Split engine setup/init into two phases Chris Wilson
2019-04-10 13:30   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 21/29] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
2019-04-10 15:32   ` Tvrtko Ursulin
2019-04-10 16:18     ` Chris Wilson
2019-04-11 13:05       ` Tvrtko Ursulin
2019-04-11 13:25         ` Chris Wilson
2019-04-11 13:33   ` [PATCH] " Chris Wilson
2019-04-08  9:17 ` [PATCH 22/29] drm/i915: Remove intel_context.active_link Chris Wilson
2019-04-08  9:17 ` [PATCH 23/29] drm/i915: Move i915_request_alloc into selftests/ Chris Wilson
2019-04-12  7:05   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 24/29] drm/i915: Allow multiple user handles to the same VM Chris Wilson
2019-04-12  7:21   ` Tvrtko Ursulin
2019-04-08  9:17 ` [PATCH 25/29] drm/i915: Restore control over ppgtt for context creation ABI Chris Wilson
2019-04-08  9:17 ` [PATCH 26/29] drm/i915: Allow a context to define its set of engines Chris Wilson
2019-04-08  9:17 ` [PATCH 27/29] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
2019-04-08  9:17 ` [PATCH 28/29] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation Chris Wilson
2019-04-08  9:17 ` [PATCH 29/29] drm/i915: Load balancing across a virtual engine Chris Wilson
2019-04-08  9:59 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/29] drm/i915: Mark up ips for RCU protection Patchwork
2019-04-08 10:13 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-04-08 10:28 ` ✗ Fi.CI.BAT: failure " Patchwork
2019-04-08 10:37   ` Chris Wilson
2019-04-11 22:20 ` ✗ Fi.CI.BAT: failure for series starting with [01/29] drm/i915: Mark up ips for RCU protection (rev2) Patchwork
