intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU
@ 2020-02-20  7:50 Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Consolidate ctx->engines[] release Chris Wilson
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Chris Wilson @ 2020-02-20  7:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matthew Auld

While we know that the waiters cannot disappear as we walk our list
(only that they might be added), the same cannot be said for our
signalers as they may be completed by the HW and retired as we process
this request. Ergo we need to use rcu to protect the list iteration and
remember to mark up the list_del_rcu.

v2: Mark the deps as safe-for-rcu

Fixes: 793c22617367 ("drm/i915/gt: Protect execlists_hold/unhold from new waiters")
Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 16 ++++++++++------
 drivers/gpu/drm/i915/i915_scheduler.c |  7 ++++---
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index ba31cbe8c68e..47561dc29304 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1668,9 +1668,9 @@ last_active(const struct intel_engine_execlists *execlists)
 				     wait_link)
 
 #define for_each_signaler(p__, rq__) \
-	list_for_each_entry_lockless(p__, \
-				     &(rq__)->sched.signalers_list, \
-				     signal_link)
+	list_for_each_entry_rcu(p__, \
+				&(rq__)->sched.signalers_list, \
+				signal_link)
 
 static void defer_request(struct i915_request *rq, struct list_head * const pl)
 {
@@ -2533,11 +2533,13 @@ static bool execlists_hold(struct intel_engine_cs *engine,
 static bool hold_request(const struct i915_request *rq)
 {
 	struct i915_dependency *p;
+	bool result = false;
 
 	/*
 	 * If one of our ancestors is on hold, we must also be on hold,
 	 * otherwise we will bypass it and execute before it.
 	 */
+	rcu_read_lock();
 	for_each_signaler(p, rq) {
 		const struct i915_request *s =
 			container_of(p->signaler, typeof(*s), sched);
@@ -2545,11 +2547,13 @@ static bool hold_request(const struct i915_request *rq)
 		if (s->engine != rq->engine)
 			continue;
 
-		if (i915_request_on_hold(s))
-			return true;
+		result = i915_request_on_hold(s);
+		if (result)
+			break;
 	}
+	rcu_read_unlock();
 
-	return false;
+	return result;
 }
 
 static void __execlists_unhold(struct i915_request *rq)
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index e19a37a83397..59f70b674665 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -486,7 +486,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
 	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
 		GEM_BUG_ON(!list_empty(&dep->dfs_link));
 
-		list_del(&dep->wait_link);
+		list_del_rcu(&dep->wait_link);
 		if (dep->flags & I915_DEPENDENCY_ALLOC)
 			i915_dependency_free(dep);
 	}
@@ -497,7 +497,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
 		GEM_BUG_ON(dep->signaler != node);
 		GEM_BUG_ON(!list_empty(&dep->dfs_link));
 
-		list_del(&dep->signal_link);
+		list_del_rcu(&dep->signal_link);
 		if (dep->flags & I915_DEPENDENCY_ALLOC)
 			i915_dependency_free(dep);
 	}
@@ -526,7 +526,8 @@ static struct i915_global_scheduler global = { {
 int __init i915_global_scheduler_init(void)
 {
 	global.slab_dependencies = KMEM_CACHE(i915_dependency,
-					      SLAB_HWCACHE_ALIGN);
+					      SLAB_HWCACHE_ALIGN |
+					      SLAB_TYPESAFE_BY_RCU);
 	if (!global.slab_dependencies)
 		return -ENOMEM;
 
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-gfx] [PATCH 2/6] drm/i915/gem: Consolidate ctx->engines[] release
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
@ 2020-02-20  7:50 ` Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 3/6] drm/i915/gt: Prevent allocation on a banned context Chris Wilson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2020-02-20  7:50 UTC (permalink / raw)
  To: intel-gfx

Use the same engine_idle_release() routine for cleaning all old
ctx->engine[] state, closing any potential races with concurrent execbuf
submission.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1241
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
Reorder set-closed/engine_idle_release to avoid premature killing
Take a reference to prevent racing context free with engine cleanup
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 190 ++++++++++----------
 1 file changed, 100 insertions(+), 90 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3e82739bdbc0..99206ec45876 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -243,7 +243,6 @@ static void __free_engines(struct i915_gem_engines *e, unsigned int count)
 		if (!e->engines[count])
 			continue;
 
-		RCU_INIT_POINTER(e->engines[count]->gem_context, NULL);
 		intel_context_put(e->engines[count]);
 	}
 	kfree(e);
@@ -270,8 +269,6 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 	if (!e)
 		return ERR_PTR(-ENOMEM);
 
-	e->ctx = ctx;
-
 	for_each_engine(engine, gt, id) {
 		struct intel_context *ce;
 
@@ -305,7 +302,6 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	list_del(&ctx->link);
 	spin_unlock(&ctx->i915->gem.contexts.lock);
 
-	free_engines(rcu_access_pointer(ctx->engines));
 	mutex_destroy(&ctx->engines_mutex);
 
 	if (ctx->timeline)
@@ -492,30 +488,110 @@ static void kill_engines(struct i915_gem_engines *engines)
 static void kill_stale_engines(struct i915_gem_context *ctx)
 {
 	struct i915_gem_engines *pos, *next;
-	unsigned long flags;
 
-	spin_lock_irqsave(&ctx->stale.lock, flags);
+	spin_lock_irq(&ctx->stale.lock);
+	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 	list_for_each_entry_safe(pos, next, &ctx->stale.engines, link) {
-		if (!i915_sw_fence_await(&pos->fence))
+		if (!i915_sw_fence_await(&pos->fence)) {
+			list_del_init(&pos->link);
 			continue;
+		}
 
-		spin_unlock_irqrestore(&ctx->stale.lock, flags);
+		spin_unlock_irq(&ctx->stale.lock);
 
 		kill_engines(pos);
 
-		spin_lock_irqsave(&ctx->stale.lock, flags);
+		spin_lock_irq(&ctx->stale.lock);
+		GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
 		list_safe_reset_next(pos, next, link);
 		list_del_init(&pos->link); /* decouple from FENCE_COMPLETE */
 
 		i915_sw_fence_complete(&pos->fence);
 	}
-	spin_unlock_irqrestore(&ctx->stale.lock, flags);
+	spin_unlock_irq(&ctx->stale.lock);
 }
 
 static void kill_context(struct i915_gem_context *ctx)
 {
 	kill_stale_engines(ctx);
-	kill_engines(__context_engines_static(ctx));
+}
+
+static int engines_notify(struct i915_sw_fence *fence,
+			  enum i915_sw_fence_notify state)
+{
+	struct i915_gem_engines *engines =
+		container_of(fence, typeof(*engines), fence);
+
+	switch (state) {
+	case FENCE_COMPLETE:
+		if (!list_empty(&engines->link)) {
+			struct i915_gem_context *ctx = engines->ctx;
+			unsigned long flags;
+
+			spin_lock_irqsave(&ctx->stale.lock, flags);
+			list_del(&engines->link);
+			spin_unlock_irqrestore(&ctx->stale.lock, flags);
+		}
+		break;
+
+	case FENCE_FREE:
+		i915_gem_context_put(engines->ctx);
+		init_rcu_head(&engines->rcu);
+		call_rcu(&engines->rcu, free_engines_rcu);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static void engines_idle_release(struct i915_gem_context *ctx,
+				 struct i915_gem_engines *engines)
+{
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+
+	i915_sw_fence_init(&engines->fence, engines_notify);
+	INIT_LIST_HEAD(&engines->link);
+
+	engines->ctx = i915_gem_context_get(ctx);
+
+	for_each_gem_engine(ce, engines, it) {
+		int err = 0;
+
+		RCU_INIT_POINTER(ce->gem_context, NULL);
+
+		if (!ce->timeline) { /* XXX serialisation with execbuf? */
+			intel_context_set_banned(ce);
+			continue;
+		}
+
+		mutex_lock(&ce->timeline->mutex);
+		if (!list_empty(&ce->timeline->requests)) {
+			struct i915_request *rq;
+
+			rq = list_last_entry(&ce->timeline->requests,
+					     typeof(*rq),
+					     link);
+
+			err = i915_sw_fence_await_dma_fence(&engines->fence,
+							    &rq->fence, 0,
+							    GFP_KERNEL);
+		}
+		mutex_unlock(&ce->timeline->mutex);
+		if (err < 0)
+			goto kill;
+	}
+
+	spin_lock_irq(&engines->ctx->stale.lock);
+	if (!i915_gem_context_is_closed(engines->ctx))
+		list_add_tail(&engines->link, &engines->ctx->stale.engines);
+	spin_unlock_irq(&engines->ctx->stale.lock);
+
+kill:
+	if (list_empty(&engines->link)) /* raced, already closed */
+		kill_engines(engines);
+
+	i915_sw_fence_commit(&engines->fence);
 }
 
 static void set_closed_name(struct i915_gem_context *ctx)
@@ -539,11 +615,16 @@ static void context_close(struct i915_gem_context *ctx)
 {
 	struct i915_address_space *vm;
 
+	/* Flush any concurrent set_engines() */
+	mutex_lock(&ctx->engines_mutex);
+	engines_idle_release(ctx, rcu_replace_pointer(ctx->engines, NULL, 1));
 	i915_gem_context_set_closed(ctx);
-	set_closed_name(ctx);
+	mutex_unlock(&ctx->engines_mutex);
 
 	mutex_lock(&ctx->mutex);
 
+	set_closed_name(ctx);
+
 	vm = i915_gem_context_vm(ctx);
 	if (vm)
 		i915_vm_close(vm);
@@ -1562,77 +1643,6 @@ static const i915_user_extension_fn set_engines__extensions[] = {
 	[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond,
 };
 
-static int engines_notify(struct i915_sw_fence *fence,
-			  enum i915_sw_fence_notify state)
-{
-	struct i915_gem_engines *engines =
-		container_of(fence, typeof(*engines), fence);
-
-	switch (state) {
-	case FENCE_COMPLETE:
-		if (!list_empty(&engines->link)) {
-			struct i915_gem_context *ctx = engines->ctx;
-			unsigned long flags;
-
-			spin_lock_irqsave(&ctx->stale.lock, flags);
-			list_del(&engines->link);
-			spin_unlock_irqrestore(&ctx->stale.lock, flags);
-		}
-		break;
-
-	case FENCE_FREE:
-		init_rcu_head(&engines->rcu);
-		call_rcu(&engines->rcu, free_engines_rcu);
-		break;
-	}
-
-	return NOTIFY_DONE;
-}
-
-static void engines_idle_release(struct i915_gem_engines *engines)
-{
-	struct i915_gem_engines_iter it;
-	struct intel_context *ce;
-	unsigned long flags;
-
-	GEM_BUG_ON(!engines);
-	i915_sw_fence_init(&engines->fence, engines_notify);
-
-	INIT_LIST_HEAD(&engines->link);
-	spin_lock_irqsave(&engines->ctx->stale.lock, flags);
-	if (!i915_gem_context_is_closed(engines->ctx))
-		list_add(&engines->link, &engines->ctx->stale.engines);
-	spin_unlock_irqrestore(&engines->ctx->stale.lock, flags);
-	if (list_empty(&engines->link)) /* raced, already closed */
-		goto kill;
-
-	for_each_gem_engine(ce, engines, it) {
-		struct dma_fence *fence;
-		int err;
-
-		if (!ce->timeline)
-			continue;
-
-		fence = i915_active_fence_get(&ce->timeline->last_request);
-		if (!fence)
-			continue;
-
-		err = i915_sw_fence_await_dma_fence(&engines->fence,
-						    fence, 0,
-						    GFP_KERNEL);
-
-		dma_fence_put(fence);
-		if (err < 0)
-			goto kill;
-	}
-	goto out;
-
-kill:
-	kill_engines(engines);
-out:
-	i915_sw_fence_commit(&engines->fence);
-}
-
 static int
 set_engines(struct i915_gem_context *ctx,
 	    const struct drm_i915_gem_context_param *args)
@@ -1675,8 +1685,6 @@ set_engines(struct i915_gem_context *ctx,
 	if (!set.engines)
 		return -ENOMEM;
 
-	set.engines->ctx = ctx;
-
 	for (n = 0; n < num_engines; n++) {
 		struct i915_engine_class_instance ci;
 		struct intel_engine_cs *engine;
@@ -1729,6 +1737,11 @@ set_engines(struct i915_gem_context *ctx,
 
 replace:
 	mutex_lock(&ctx->engines_mutex);
+	if (i915_gem_context_is_closed(ctx)) {
+		mutex_unlock(&ctx->engines_mutex);
+		free_engines(set.engines);
+		return -ENOENT;
+	}
 	if (args->size)
 		i915_gem_context_set_user_engines(ctx);
 	else
@@ -1737,7 +1750,7 @@ set_engines(struct i915_gem_context *ctx,
 	mutex_unlock(&ctx->engines_mutex);
 
 	/* Keep track of old engine sets for kill_context() */
-	engines_idle_release(set.engines);
+	engines_idle_release(ctx, set.engines);
 
 	return 0;
 }
@@ -1995,8 +2008,6 @@ static int clone_engines(struct i915_gem_context *dst,
 	if (!clone)
 		goto err_unlock;
 
-	clone->ctx = dst;
-
 	for (n = 0; n < e->num_engines; n++) {
 		struct intel_engine_cs *engine;
 
@@ -2033,8 +2044,7 @@ static int clone_engines(struct i915_gem_context *dst,
 	i915_gem_context_unlock_engines(src);
 
 	/* Serialised by constructor */
-	free_engines(__context_engines_static(dst));
-	RCU_INIT_POINTER(dst->engines, clone);
+	engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1));
 	if (user_engines)
 		i915_gem_context_set_user_engines(dst);
 	else
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-gfx] [PATCH 3/6] drm/i915/gt: Prevent allocation on a banned context
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Consolidate ctx->engines[] release Chris Wilson
@ 2020-02-20  7:50 ` Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Check that the context wasn't closed during setup Chris Wilson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2020-02-20  7:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matthew Auld

If a context is banned even before we submit our first request to it,
report the failure before we attempt to allocate any resources for the
context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 8bb444cda14f..01474d3a558b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -51,6 +51,11 @@ int intel_context_alloc_state(struct intel_context *ce)
 		return -EINTR;
 
 	if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
+		if (intel_context_is_banned(ce)) {
+			err = -EIO;
+			goto unlock;
+		}
+
 		err = ce->ops->alloc(ce);
 		if (unlikely(err))
 			goto unlock;
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-gfx] [PATCH 4/6] drm/i915/gem: Check that the context wasn't closed during setup
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Consolidate ctx->engines[] release Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 3/6] drm/i915/gt: Prevent allocation on a banned context Chris Wilson
@ 2020-02-20  7:50 ` Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 5/6] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2020-02-20  7:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matthew Auld

As setup takes a long time, the user may close the context during the
construction of the execbuf. In order to make sure we correctly track
all outstanding work with non-persistent contexts, we need to serialise
the submission with the context closure and mop up any leaks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 87fa5f42c39a..b2311fe93ad8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2729,6 +2729,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		goto err_batch_unpin;
 	}
 
+	/* Check that the context wasn't destroyed before setup */
+	if (!rcu_access_pointer(eb.context->gem_context)) {
+		err = -ENOENT;
+		goto err_request;
+	}
+
 	if (in_fence) {
 		err = i915_request_await_dma_fence(eb.request, in_fence);
 		if (err < 0)
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-gfx] [PATCH 5/6] drm/i915/gt: Declare when we enabled timeslicing
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
                   ` (2 preceding siblings ...)
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Check that the context wasn't closed during setup Chris Wilson
@ 2020-02-20  7:50 ` Chris Wilson
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 6/6] drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore Chris Wilson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2020-02-20  7:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke

Let userspace know if they can trust timeslicing by including it as part
of the I915_PARAM_HAS_SCHEDULER::I915_SCHEDULER_CAP_TIMESLICING

v2: Only declare timeslicing if we can safely preempt userspace.

Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
---
 drivers/gpu/drm/i915/gt/intel_engine.h      | 3 ++-
 drivers/gpu/drm/i915/gt/intel_engine_user.c | 5 +++++
 include/uapi/drm/i915_drm.h                 | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 29c8c03c5caa..a32dc82a90d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -326,7 +326,8 @@ intel_engine_has_timeslices(const struct intel_engine_cs *engine)
 	if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
 		return false;
 
-	return intel_engine_has_semaphores(engine);
+	return (intel_engine_has_semaphores(engine) &&
+		intel_engine_has_preemption(engine));
 }
 
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
index 848decee9066..b84fdd722781 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
@@ -121,6 +121,11 @@ static void set_scheduler_caps(struct drm_i915_private *i915)
 			else
 				disabled |= BIT(map[i].sched);
 		}
+
+		if (intel_engine_has_timeslices(engine))
+			enabled |= I915_SCHEDULER_CAP_TIMESLICING;
+		else
+			disabled |= I915_SCHEDULER_CAP_TIMESLICING;
 	}
 
 	i915->caps.scheduler = enabled & ~disabled;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 829c0a48577f..f4d521f51258 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -523,6 +523,7 @@ typedef struct drm_i915_irq_wait {
 #define   I915_SCHEDULER_CAP_PREEMPTION	(1ul << 2)
 #define   I915_SCHEDULER_CAP_SEMAPHORES	(1ul << 3)
 #define   I915_SCHEDULER_CAP_ENGINE_BUSY_STATS	(1ul << 4)
+#define   I915_SCHEDULER_CAP_TIMESLICING	(1ul << 5)
 
 #define I915_PARAM_HUC_STATUS		 42
 
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-gfx] [PATCH 6/6] drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
                   ` (3 preceding siblings ...)
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 5/6] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
@ 2020-02-20  7:50 ` Chris Wilson
  2020-02-20  8:27 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/gt: Protect signaler walk with RCU Patchwork
  2020-02-20 12:47 ` [Intel-gfx] [PATCH 1/6] " Matthew Auld
  6 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2020-02-20  7:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kenneth Graunke

If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the
user batch or in our own preamble, the engine raises a
GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so
respond to a semaphore wait by yielding the timeslice, if we have
another context to yield to!

The only real complication is that the interrupt is only generated for
the start of the semaphore wait, and is asynchronous to our
process_csb() -- that is, we may not have registered the timeslice before
we see the interrupt. To ensure we don't miss a potential semaphore
blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark
the interrupt and apply it to the next timeslice regardless of whether it
was active at the time.

v2: We use semaphores in preempt-to-busy, within the timeslicing
implementation itself! Ergo, when we do insert a preemption due to an
expired timeslice, the new context may start with the missed semaphore
flagged by the retired context and be yielded, ad infinitum. To avoid
this, read the context id at the time of the semaphore interrupt and
only yield if that context is still active.

Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |  6 +++
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  9 +++++
 drivers/gpu/drm/i915/gt/intel_gt_irq.c       | 13 ++++++-
 drivers/gpu/drm/i915/gt/intel_lrc.c          | 40 +++++++++++++++++---
 drivers/gpu/drm/i915/i915_reg.h              |  1 +
 5 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index e46e55354e95..1a2e9610f1a8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1288,6 +1288,12 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 	if (engine->id == RENDER_CLASS && IS_GEN_RANGE(dev_priv, 4, 7))
 		drm_printf(m, "\tCCID: 0x%08x\n", ENGINE_READ(engine, CCID));
+	if (HAS_EXECLISTS(dev_priv)) {
+		drm_printf(m, "\tEL_STAT_HI: 0x%08x\n",
+			   ENGINE_READ(engine, RING_EXECLIST_STATUS_HI));
+		drm_printf(m, "\tEL_STAT_LO: 0x%08x\n",
+			   ENGINE_READ(engine, RING_EXECLIST_STATUS_LO));
+	}
 	drm_printf(m, "\tRING_START: 0x%08x\n",
 		   ENGINE_READ(engine, RING_START));
 	drm_printf(m, "\tRING_HEAD:  0x%08x\n",
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index b23366a81048..24cff658e6e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -156,6 +156,15 @@ struct intel_engine_execlists {
 	 */
 	struct i915_priolist default_priolist;
 
+	/**
+	 * @yield: CCID at the time of the last semaphore-wait interrupt.
+	 *
+	 * Instead of leaving a semaphore busy-spinning on an engine, we would
+	 * like to switch to another ready context, i.e. yielding the semaphore
+	 * timeslice.
+	 */
+	u32 yield;
+
 	/**
 	 * @error_interrupt: CS Master EIR
 	 *
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index f0e7fd95165a..875bd0392ffc 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -39,6 +39,13 @@ cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
 		}
 	}
 
+	if (iir & GT_WAIT_SEMAPHORE_INTERRUPT) {
+		WRITE_ONCE(engine->execlists.yield,
+			   ENGINE_READ_FW(engine, RING_EXECLIST_STATUS_HI));
+		if (del_timer(&engine->execlists.timer))
+			tasklet = true;
+	}
+
 	if (iir & GT_CONTEXT_SWITCH_INTERRUPT)
 		tasklet = true;
 
@@ -228,7 +235,8 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
 	const u32 irqs =
 		GT_CS_MASTER_ERROR_INTERRUPT |
 		GT_RENDER_USER_INTERRUPT |
-		GT_CONTEXT_SWITCH_INTERRUPT;
+		GT_CONTEXT_SWITCH_INTERRUPT |
+		GT_WAIT_SEMAPHORE_INTERRUPT;
 	struct intel_uncore *uncore = gt->uncore;
 	const u32 dmask = irqs << 16 | irqs;
 	const u32 smask = irqs << 16;
@@ -366,7 +374,8 @@ void gen8_gt_irq_postinstall(struct intel_gt *gt)
 	const u32 irqs =
 		GT_CS_MASTER_ERROR_INTERRUPT |
 		GT_RENDER_USER_INTERRUPT |
-		GT_CONTEXT_SWITCH_INTERRUPT;
+		GT_CONTEXT_SWITCH_INTERRUPT |
+		GT_WAIT_SEMAPHORE_INTERRUPT;
 	const u32 gt_interrupts[] = {
 		irqs << GEN8_RCS_IRQ_SHIFT | irqs << GEN8_BCS_IRQ_SHIFT,
 		irqs << GEN8_VCS0_IRQ_SHIFT | irqs << GEN8_VCS1_IRQ_SHIFT,
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 47561dc29304..fc6c3a922a22 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1728,7 +1728,8 @@ static void defer_active(struct intel_engine_cs *engine)
 }
 
 static bool
-need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
+need_timeslice(const struct intel_engine_cs *engine,
+	       const struct i915_request *rq)
 {
 	int hint;
 
@@ -1744,6 +1745,31 @@ need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
 	return hint >= effective_prio(rq);
 }
 
+static bool
+timeslice_yield(const struct intel_engine_execlists *el,
+		const struct i915_request *rq)
+{
+	/*
+	 * Once bitten, forever smitten!
+	 *
+	 * If the active context ever busy-waited on a semaphore,
+	 * it will be treated as a hog until the end of its timeslice.
+	 * The HW only sends an interrupt on the first miss, and we
+	 * do know if that semaphore has been signaled, or even if it
+	 * is now stuck on another semaphore. Play safe, yield if it
+	 * might be stuck -- it will be given a fresh timeslice in
+	 * the near future.
+	 */
+	return upper_32_bits(rq->context->lrc_desc) == READ_ONCE(el->yield);
+}
+
+static bool
+timeslice_expired(const struct intel_engine_execlists *el,
+		  const struct i915_request *rq)
+{
+	return timer_expired(&el->timer) || timeslice_yield(el, rq);
+}
+
 static int
 switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
 {
@@ -1759,8 +1785,7 @@ timeslice(const struct intel_engine_cs *engine)
 	return READ_ONCE(engine->props.timeslice_duration_ms);
 }
 
-static unsigned long
-active_timeslice(const struct intel_engine_cs *engine)
+static unsigned long active_timeslice(const struct intel_engine_cs *engine)
 {
 	const struct i915_request *rq = *engine->execlists.active;
 
@@ -1903,13 +1928,14 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 			last = NULL;
 		} else if (need_timeslice(engine, last) &&
-			   timer_expired(&engine->execlists.timer)) {
+			   timeslice_expired(execlists, last)) {
 			ENGINE_TRACE(engine,
-				     "expired last=%llx:%lld, prio=%d, hint=%d\n",
+				     "expired last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
 				     last->fence.context,
 				     last->fence.seqno,
 				     last->sched.attr.priority,
-				     execlists->queue_priority_hint);
+				     execlists->queue_priority_hint,
+				     yesno(timeslice_yield(execlists, last)));
 
 			ring_set_paused(engine, 1);
 			defer_active(engine);
@@ -2169,6 +2195,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 		clear_ports(port + 1, last_port - port);
 
+		WRITE_ONCE(execlists->yield, -1);
 		execlists_submit_ports(engine);
 		set_preempt_timeout(engine);
 	} else {
@@ -4413,6 +4440,7 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
 	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT << shift;
 	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
 	engine->irq_keep_mask |= GT_CS_MASTER_ERROR_INTERRUPT << shift;
+	engine->irq_keep_mask |= GT_WAIT_SEMAPHORE_INTERRUPT << shift;
 }
 
 static void rcs_submission_override(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b09c1d6dc0aa..0f1fcc863f3d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3090,6 +3090,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GT_BSD_CS_ERROR_INTERRUPT		(1 << 15)
 #define GT_BSD_USER_INTERRUPT			(1 << 12)
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
+#define GT_WAIT_SEMAPHORE_INTERRUPT		REG_BIT(11) /* bdw+ */
 #define GT_CONTEXT_SWITCH_INTERRUPT		(1 <<  8)
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
-- 
2.25.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/gt: Protect signaler walk with RCU
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
                   ` (4 preceding siblings ...)
  2020-02-20  7:50 ` [Intel-gfx] [PATCH 6/6] drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore Chris Wilson
@ 2020-02-20  8:27 ` Patchwork
  2020-02-20 12:47 ` [Intel-gfx] [PATCH 1/6] " Matthew Auld
  6 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2020-02-20  8:27 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/6] drm/i915/gt: Protect signaler walk with RCU
URL   : https://patchwork.freedesktop.org/series/73691/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7970 -> Patchwork_16642
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_16642 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_16642, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_16642:

### IGT changes ###

#### Possible regressions ####

  * igt@runner@aborted:
    - fi-byt-j1900:       NOTRUN -> [FAIL][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-byt-j1900/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_16642 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_create@basic-files:
    - fi-byt-j1900:       [PASS][2] -> [INCOMPLETE][3] ([i915#45])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7970/fi-byt-j1900/igt@gem_ctx_create@basic-files.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-byt-j1900/igt@gem_ctx_create@basic-files.html

  * igt@i915_selftest@live_sanitycheck:
    - fi-icl-u3:          [PASS][4] -> [DMESG-WARN][5] ([i915#585])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7970/fi-icl-u3/igt@i915_selftest@live_sanitycheck.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-icl-u3/igt@i915_selftest@live_sanitycheck.html

  * igt@kms_flip@basic-flip-vs-wf_vblank:
    - fi-bwr-2160:        [PASS][6] -> [FAIL][7] ([i915#34])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7970/fi-bwr-2160/igt@kms_flip@basic-flip-vs-wf_vblank.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-bwr-2160/igt@kms_flip@basic-flip-vs-wf_vblank.html

  
#### Possible fixes ####

  * igt@gem_close_race@basic-threads:
    - fi-hsw-peppy:       [TIMEOUT][8] ([fdo#112271] / [i915#1084]) -> [PASS][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7970/fi-hsw-peppy/igt@gem_close_race@basic-threads.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-hsw-peppy/igt@gem_close_race@basic-threads.html

  * igt@i915_selftest@live_execlists:
    - {fi-tgl-dsi}:       [INCOMPLETE][10] ([i915#529] / [i915#647]) -> [PASS][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7970/fi-tgl-dsi/igt@i915_selftest@live_execlists.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-tgl-dsi/igt@i915_selftest@live_execlists.html

  * igt@i915_selftest@live_gtt:
    - fi-kbl-7500u:       [TIMEOUT][12] ([fdo#112271]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7970/fi-kbl-7500u/igt@i915_selftest@live_gtt.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/fi-kbl-7500u/igt@i915_selftest@live_gtt.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#1084]: https://gitlab.freedesktop.org/drm/intel/issues/1084
  [i915#34]: https://gitlab.freedesktop.org/drm/intel/issues/34
  [i915#45]: https://gitlab.freedesktop.org/drm/intel/issues/45
  [i915#529]: https://gitlab.freedesktop.org/drm/intel/issues/529
  [i915#585]: https://gitlab.freedesktop.org/drm/intel/issues/585
  [i915#647]: https://gitlab.freedesktop.org/drm/intel/issues/647


Participating hosts (48 -> 40)
------------------------------

  Additional (3): fi-gdg-551 fi-bsw-nick fi-snb-2600 
  Missing    (11): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-cfl-8109u fi-bsw-kefka fi-skl-lmem fi-kbl-7560u fi-byt-n2820 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7970 -> Patchwork_16642

  CI-20190529: 20190529
  CI_DRM_7970: 6b8b833350142345f4b1a6af9486db7d316a7ff1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5452: c05dc6cd816feb1cc518ce777ab3fd6c81893113 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16642: 81c48211d4f0171849b2fe323a5c6bf3bdfcfae6 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

81c48211d4f0 drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore
3a98210715c5 drm/i915/gt: Declare when we enabled timeslicing
c0b4866e2c12 drm/i915/gem: Check that the context wasn't closed during setup
4af18a639c91 drm/i915/gt: Prevent allocation on a banned context
1677db7d6e97 drm/i915/gem: Consolidate ctx->engines[] release
857122388306 drm/i915/gt: Protect signaler walk with RCU

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16642/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU
  2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
                   ` (5 preceding siblings ...)
  2020-02-20  8:27 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/gt: Protect signaler walk with RCU Patchwork
@ 2020-02-20 12:47 ` Matthew Auld
  2020-02-20 12:52   ` Chris Wilson
  6 siblings, 1 reply; 10+ messages in thread
From: Matthew Auld @ 2020-02-20 12:47 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 20/02/2020 07:50, Chris Wilson wrote:
> While we know that the waiters cannot disappear as we walk our list
> (only that they might be added), the same cannot be said for our
> signalers as they may be completed by the HW and retired as we process
> this request. Ergo we need to use rcu to protect the list iteration and
> remember to mark up the list_del_rcu.
> 
> v2: Mark the deps as safe-for-rcu
> 
> Fixes: 793c22617367 ("drm/i915/gt: Protect execlists_hold/unhold from new waiters")
> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_lrc.c   | 16 ++++++++++------
>   drivers/gpu/drm/i915/i915_scheduler.c |  7 ++++---
>   2 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index ba31cbe8c68e..47561dc29304 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1668,9 +1668,9 @@ last_active(const struct intel_engine_execlists *execlists)
>   				     wait_link)
>   
>   #define for_each_signaler(p__, rq__) \
> -	list_for_each_entry_lockless(p__, \
> -				     &(rq__)->sched.signalers_list, \
> -				     signal_link)
> +	list_for_each_entry_rcu(p__, \
> +				&(rq__)->sched.signalers_list, \
> +				signal_link)
>   
>   static void defer_request(struct i915_request *rq, struct list_head * const pl)
>   {
> @@ -2533,11 +2533,13 @@ static bool execlists_hold(struct intel_engine_cs *engine,
>   static bool hold_request(const struct i915_request *rq)
>   {
>   	struct i915_dependency *p;
> +	bool result = false;
>   
>   	/*
>   	 * If one of our ancestors is on hold, we must also be on hold,
>   	 * otherwise we will bypass it and execute before it.
>   	 */
> +	rcu_read_lock();
>   	for_each_signaler(p, rq) {
>   		const struct i915_request *s =
>   			container_of(p->signaler, typeof(*s), sched);
> @@ -2545,11 +2547,13 @@ static bool hold_request(const struct i915_request *rq)
>   		if (s->engine != rq->engine)
>   			continue;
>   
> -		if (i915_request_on_hold(s))
> -			return true;
> +		result = i915_request_on_hold(s);
> +		if (result)
> +			break;
>   	}
> +	rcu_read_unlock();
>   
> -	return false;
> +	return result;
>   }
>   
>   static void __execlists_unhold(struct i915_request *rq)
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> index e19a37a83397..59f70b674665 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> @@ -486,7 +486,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
>   	list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
>   		GEM_BUG_ON(!list_empty(&dep->dfs_link));
>   
> -		list_del(&dep->wait_link);
> +		list_del_rcu(&dep->wait_link);
>   		if (dep->flags & I915_DEPENDENCY_ALLOC)
>   			i915_dependency_free(dep);
>   	}
> @@ -497,7 +497,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
>   		GEM_BUG_ON(dep->signaler != node);
>   		GEM_BUG_ON(!list_empty(&dep->dfs_link));
>   
> -		list_del(&dep->signal_link);
> +		list_del_rcu(&dep->signal_link);
>   		if (dep->flags & I915_DEPENDENCY_ALLOC)
>   			i915_dependency_free(dep);
>   	}
> @@ -526,7 +526,8 @@ static struct i915_global_scheduler global = { {
>   int __init i915_global_scheduler_init(void)
>   {
>   	global.slab_dependencies = KMEM_CACHE(i915_dependency,
> -					      SLAB_HWCACHE_ALIGN);
> +					      SLAB_HWCACHE_ALIGN |
> +					      SLAB_TYPESAFE_BY_RCU);

So, the claim is that we should be fine if the node is re-used and then 
initialised, even though there might exist a minuscule window where 
hold_request might still be able to see it, somehow?
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU
  2020-02-20 12:47 ` [Intel-gfx] [PATCH 1/6] " Matthew Auld
@ 2020-02-20 12:52   ` Chris Wilson
  2020-02-20 13:16     ` Matthew Auld
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2020-02-20 12:52 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2020-02-20 12:47:28)
> On 20/02/2020 07:50, Chris Wilson wrote:
> > While we know that the waiters cannot disappear as we walk our list
> > (only that they might be added), the same cannot be said for our
> > signalers as they may be completed by the HW and retired as we process
> > this request. Ergo we need to use rcu to protect the list iteration and
> > remember to mark up the list_del_rcu.
> > 
> > v2: Mark the deps as safe-for-rcu
> > 
> > Fixes: 793c22617367 ("drm/i915/gt: Protect execlists_hold/unhold from new waiters")
> > Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c   | 16 ++++++++++------
> >   drivers/gpu/drm/i915/i915_scheduler.c |  7 ++++---
> >   2 files changed, 14 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index ba31cbe8c68e..47561dc29304 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -1668,9 +1668,9 @@ last_active(const struct intel_engine_execlists *execlists)
> >                                    wait_link)
> >   
> >   #define for_each_signaler(p__, rq__) \
> > -     list_for_each_entry_lockless(p__, \
> > -                                  &(rq__)->sched.signalers_list, \
> > -                                  signal_link)
> > +     list_for_each_entry_rcu(p__, \
> > +                             &(rq__)->sched.signalers_list, \
> > +                             signal_link)
> >   
> >   static void defer_request(struct i915_request *rq, struct list_head * const pl)
> >   {
> > @@ -2533,11 +2533,13 @@ static bool execlists_hold(struct intel_engine_cs *engine,
> >   static bool hold_request(const struct i915_request *rq)
> >   {
> >       struct i915_dependency *p;
> > +     bool result = false;
> >   
> >       /*
> >        * If one of our ancestors is on hold, we must also be on hold,
> >        * otherwise we will bypass it and execute before it.
> >        */
> > +     rcu_read_lock();
> >       for_each_signaler(p, rq) {
> >               const struct i915_request *s =
> >                       container_of(p->signaler, typeof(*s), sched);
> > @@ -2545,11 +2547,13 @@ static bool hold_request(const struct i915_request *rq)
> >               if (s->engine != rq->engine)
> >                       continue;
> >   
> > -             if (i915_request_on_hold(s))
> > -                     return true;
> > +             result = i915_request_on_hold(s);
> > +             if (result)
> > +                     break;
> >       }
> > +     rcu_read_unlock();
> >   
> > -     return false;
> > +     return result;
> >   }
> >   
> >   static void __execlists_unhold(struct i915_request *rq)
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> > index e19a37a83397..59f70b674665 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -486,7 +486,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
> >       list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
> >               GEM_BUG_ON(!list_empty(&dep->dfs_link));
> >   
> > -             list_del(&dep->wait_link);
> > +             list_del_rcu(&dep->wait_link);
> >               if (dep->flags & I915_DEPENDENCY_ALLOC)
> >                       i915_dependency_free(dep);
> >       }
> > @@ -497,7 +497,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
> >               GEM_BUG_ON(dep->signaler != node);
> >               GEM_BUG_ON(!list_empty(&dep->dfs_link));
> >   
> > -             list_del(&dep->signal_link);
> > +             list_del_rcu(&dep->signal_link);
> >               if (dep->flags & I915_DEPENDENCY_ALLOC)
> >                       i915_dependency_free(dep);
> >       }
> > @@ -526,7 +526,8 @@ static struct i915_global_scheduler global = { {
> >   int __init i915_global_scheduler_init(void)
> >   {
> >       global.slab_dependencies = KMEM_CACHE(i915_dependency,
> > -                                           SLAB_HWCACHE_ALIGN);
> > +                                           SLAB_HWCACHE_ALIGN |
> > +                                           SLAB_TYPESAFE_BY_RCU);
> 
> So, the claim is that we should be fine if the node is re-used and then 
> initialised, even though there might exist a minuscule window where 
> hold_request might still be able to see it, somehow?

Yes. That is my claim. The saving grace here is that for the on-hold
transitions we must go through the engine->active.lock which we are
holding. Ergo hold_request() is safe.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU
  2020-02-20 12:52   ` Chris Wilson
@ 2020-02-20 13:16     ` Matthew Auld
  0 siblings, 0 replies; 10+ messages in thread
From: Matthew Auld @ 2020-02-20 13:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 20/02/2020 12:52, Chris Wilson wrote:
> Quoting Matthew Auld (2020-02-20 12:47:28)
>> On 20/02/2020 07:50, Chris Wilson wrote:
>>> While we know that the waiters cannot disappear as we walk our list
>>> (only that they might be added), the same cannot be said for our
>>> signalers as they may be completed by the HW and retired as we process
>>> this request. Ergo we need to use rcu to protect the list iteration and
>>> remember to mark up the list_del_rcu.
>>>
>>> v2: Mark the deps as safe-for-rcu
>>>
>>> Fixes: 793c22617367 ("drm/i915/gt: Protect execlists_hold/unhold from new waiters")
>>> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_lrc.c   | 16 ++++++++++------
>>>    drivers/gpu/drm/i915/i915_scheduler.c |  7 ++++---
>>>    2 files changed, 14 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index ba31cbe8c68e..47561dc29304 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -1668,9 +1668,9 @@ last_active(const struct intel_engine_execlists *execlists)
>>>                                     wait_link)
>>>    
>>>    #define for_each_signaler(p__, rq__) \
>>> -     list_for_each_entry_lockless(p__, \
>>> -                                  &(rq__)->sched.signalers_list, \
>>> -                                  signal_link)
>>> +     list_for_each_entry_rcu(p__, \
>>> +                             &(rq__)->sched.signalers_list, \
>>> +                             signal_link)
>>>    
>>>    static void defer_request(struct i915_request *rq, struct list_head * const pl)
>>>    {
>>> @@ -2533,11 +2533,13 @@ static bool execlists_hold(struct intel_engine_cs *engine,
>>>    static bool hold_request(const struct i915_request *rq)
>>>    {
>>>        struct i915_dependency *p;
>>> +     bool result = false;
>>>    
>>>        /*
>>>         * If one of our ancestors is on hold, we must also be on hold,
>>>         * otherwise we will bypass it and execute before it.
>>>         */
>>> +     rcu_read_lock();
>>>        for_each_signaler(p, rq) {
>>>                const struct i915_request *s =
>>>                        container_of(p->signaler, typeof(*s), sched);
>>> @@ -2545,11 +2547,13 @@ static bool hold_request(const struct i915_request *rq)
>>>                if (s->engine != rq->engine)
>>>                        continue;
>>>    
>>> -             if (i915_request_on_hold(s))
>>> -                     return true;
>>> +             result = i915_request_on_hold(s);
>>> +             if (result)
>>> +                     break;
>>>        }
>>> +     rcu_read_unlock();
>>>    
>>> -     return false;
>>> +     return result;
>>>    }
>>>    
>>>    static void __execlists_unhold(struct i915_request *rq)
>>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
>>> index e19a37a83397..59f70b674665 100644
>>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
>>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
>>> @@ -486,7 +486,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
>>>        list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
>>>                GEM_BUG_ON(!list_empty(&dep->dfs_link));
>>>    
>>> -             list_del(&dep->wait_link);
>>> +             list_del_rcu(&dep->wait_link);
>>>                if (dep->flags & I915_DEPENDENCY_ALLOC)
>>>                        i915_dependency_free(dep);
>>>        }
>>> @@ -497,7 +497,7 @@ void i915_sched_node_fini(struct i915_sched_node *node)
>>>                GEM_BUG_ON(dep->signaler != node);
>>>                GEM_BUG_ON(!list_empty(&dep->dfs_link));
>>>    
>>> -             list_del(&dep->signal_link);
>>> +             list_del_rcu(&dep->signal_link);
>>>                if (dep->flags & I915_DEPENDENCY_ALLOC)
>>>                        i915_dependency_free(dep);
>>>        }
>>> @@ -526,7 +526,8 @@ static struct i915_global_scheduler global = { {
>>>    int __init i915_global_scheduler_init(void)
>>>    {
>>>        global.slab_dependencies = KMEM_CACHE(i915_dependency,
>>> -                                           SLAB_HWCACHE_ALIGN);
>>> +                                           SLAB_HWCACHE_ALIGN |
>>> +                                           SLAB_TYPESAFE_BY_RCU);
>>
>> So, the claim is that we should be fine if the node is re-used and then
>> initialised, even though there might exist a minuscule window where
>> hold_request might still be able to see it, somehow?
> 
> Yes. That is my claim. The saving grace here is that for the on-hold
> transitions we must go through the engine->active.lock which we are
> holding. Ergo hold_request() is safe.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> -Chris
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-02-20 13:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-20  7:50 [Intel-gfx] [PATCH 1/6] drm/i915/gt: Protect signaler walk with RCU Chris Wilson
2020-02-20  7:50 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Consolidate ctx->engines[] release Chris Wilson
2020-02-20  7:50 ` [Intel-gfx] [PATCH 3/6] drm/i915/gt: Prevent allocation on a banned context Chris Wilson
2020-02-20  7:50 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Check that the context wasn't closed during setup Chris Wilson
2020-02-20  7:50 ` [Intel-gfx] [PATCH 5/6] drm/i915/gt: Declare when we enabled timeslicing Chris Wilson
2020-02-20  7:50 ` [Intel-gfx] [PATCH 6/6] drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore Chris Wilson
2020-02-20  8:27 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/gt: Protect signaler walk with RCU Patchwork
2020-02-20 12:47 ` [Intel-gfx] [PATCH 1/6] " Matthew Auld
2020-02-20 12:52   ` Chris Wilson
2020-02-20 13:16     ` Matthew Auld

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).