All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 01/32] drm/i915: Seal races between async GPU cancellation, retirement and signaling
Date: Tue, 23 Apr 2019 13:59:47 +0100	[thread overview]
Message-ID: <7c6ad611-8373-7ce8-cf3f-9a4ec2f4fffa@linux.intel.com> (raw)
In-Reply-To: <20190417075657.19456-1-chris@chris-wilson.co.uk>


On 17/04/2019 08:56, Chris Wilson wrote:
> Currently there is an underlying assumption that i915_request_unsubmit()
> is synchronous wrt the GPU -- that is the request is no longer in flight
> as we remove it. In the near future that may change, and this may upset
> our signaling as we can process an interrupt for that request while it
> is no longer in flight.
> 
> CPU0					CPU1
> intel_engine_breadcrumbs_irq
> (queue request completion)
> 					i915_request_cancel_signaling
> ...					...
> 					i915_request_enable_signaling
> dma_fence_signal
> 
> Hence in the time it took us to drop the lock to signal the request, a
> preemption event may have occurred and re-queued the request. In the
> process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
> so reused the rq->signal_link that was in use on CPU0, leading to bad
> pointer chasing in intel_engine_breadcrumbs_irq.
> 
> A related issue was that if someone started listening for a signal on a
> completed but no longer in-flight request, we missed the opportunity to
> immediately signal that request.
> 
> Furthermore, as intel_contexts may be immediately released during
> request retirement, in order to be entirely sure that
> intel_engine_breadcrumbs_irq may no longer dereference the intel_context
> (ce->signals and ce->signal_link), we must wait for irq spinlock.
> 
> In order to prevent the race, we use a bit in the fence.flags to signal
> the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
> For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
> quickly signals to any outside observer that the fence is indeed signaled.
> 
> Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/dma-buf/dma-fence.c              |  1 +
>   drivers/gpu/drm/i915/i915_request.c      |  1 +
>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 52 ++++++++++++++----------
>   3 files changed, 33 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 3aa8733f832a..9bf06042619a 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -29,6 +29,7 @@
>   
>   EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
>   EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
> +EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
>   
>   static DEFINE_SPINLOCK(dma_fence_stub_lock);
>   static struct dma_fence dma_fence_stub;
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index b836721d3b13..e0efc334463b 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -432,6 +432,7 @@ void __i915_request_submit(struct i915_request *request)
>   	set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
>   
>   	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
> +	    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
>   	    !i915_request_enable_breadcrumb(request))
>   		intel_engine_queue_breadcrumbs(engine);
>   
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 3cbffd400b1b..e19f84b006cc 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -23,6 +23,7 @@
>    */
>   
>   #include <linux/kthread.h>
> +#include <trace/events/dma_fence.h>
>   #include <uapi/linux/sched/types.h>
>   
>   #include "i915_drv.h"
> @@ -83,6 +84,7 @@ static inline bool __request_completed(const struct i915_request *rq)
>   void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   {
>   	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> +	const ktime_t timestamp = ktime_get();
>   	struct intel_context *ce, *cn;
>   	struct list_head *pos, *next;
>   	LIST_HEAD(signal);
> @@ -104,6 +106,11 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   
>   			GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,
>   					     &rq->fence.flags));
> +			clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
> +
> +			if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> +					     &rq->fence.flags))
> +				continue;

 From here to below is intimate coupling with the dma_fence_signal 
implementation, via open-coding it (with some optimizations as well).

I am thinking about this solution.. here we put:

	if (!__dma_fence_maybe_signal(&rq->fence))
		continue;

Add the low-level helper to dma-fence.c and export it.

And below..

>   
>   			/*
>   			 * Queue for execution after dropping the signaling
> @@ -111,14 +118,6 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   			 * more signalers to the same context or engine.
>   			 */
>   			i915_request_get(rq);
> -
> -			/*
> -			 * We may race with direct invocation of
> -			 * dma_fence_signal(), e.g. i915_request_retire(),
> -			 * so we need to acquire our reference to the request
> -			 * before we cancel the breadcrumb.
> -			 */
> -			clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
>   			list_add_tail(&rq->signal_link, &signal);
>   		}
>   
> @@ -140,8 +139,21 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
>   	list_for_each_safe(pos, next, &signal) {
>   		struct i915_request *rq =
>   			list_entry(pos, typeof(*rq), signal_link);
> +		struct dma_fence_cb *cur, *tmp;
> +
> +		trace_dma_fence_signaled(&rq->fence);
> +
> +		rq->fence.timestamp = timestamp;
> +		set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &rq->fence.flags);
> +
> +		spin_lock(&rq->lock);
> +		list_for_each_entry_safe(cur, tmp, &rq->fence.cb_list, node) {
> +			INIT_LIST_HEAD(&cur->node);
> +			cur->func(&rq->fence, cur);
> +		}
> +		INIT_LIST_HEAD(&rq->fence.cb_list);
> +		spin_unlock(&rq->lock);

..we do:

   __dma_fence_complete/force/finish_signal(&rq->fence, timestamp));

Also export etc, instead of the whole above block.

This way it is both self-documenting in our code and we remove the 
intimate coupling with dma-fence implementation details.

No need to export the tracepoint then either.

(You can have a prequel patch to optimise the list_del_init in 
dma_fence_signal.)

Thoughts?

Regards,

Tvrtko

>   
> -		dma_fence_signal(&rq->fence);
>   		i915_request_put(rq);
>   	}
>   }
> @@ -243,19 +255,17 @@ void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
>   
>   bool i915_request_enable_breadcrumb(struct i915_request *rq)
>   {
> -	struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
> -
> -	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags));
> -
> -	if (!test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags))
> -		return true;
> +	lockdep_assert_held(&rq->lock);
> +	lockdep_assert_irqs_disabled();
>   
> -	spin_lock(&b->irq_lock);
> -	if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags) &&
> -	    !__request_completed(rq)) {
> +	if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) {
> +		struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
>   		struct intel_context *ce = rq->hw_context;
>   		struct list_head *pos;
>   
> +		spin_lock(&b->irq_lock);
> +		GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags));
> +
>   		__intel_breadcrumbs_arm_irq(b);
>   
>   		/*
> @@ -284,8 +294,8 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
>   			list_move_tail(&ce->signal_link, &b->signalers);
>   
>   		set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
> +		spin_unlock(&b->irq_lock);
>   	}
> -	spin_unlock(&b->irq_lock);
>   
>   	return !__request_completed(rq);
>   }
> @@ -294,8 +304,8 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq)
>   {
>   	struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
>   
> -	if (!test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
> -		return;
> +	lockdep_assert_held(&rq->lock);
> +	lockdep_assert_irqs_disabled();
>   
>   	spin_lock(&b->irq_lock);
>   	if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) {
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

      parent reply	other threads:[~2019-04-23 12:59 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-17  7:56 [PATCH 01/32] drm/i915: Seal races between async GPU cancellation, retirement and signaling Chris Wilson
2019-04-17  7:56 ` [PATCH 02/32] drm/i915: Verify workarounds immediately after application Chris Wilson
2019-04-17  7:56 ` [PATCH 03/32] drm/i915: Verify the engine workarounds stick on application Chris Wilson
2019-04-17  7:56 ` [PATCH 04/32] drm/i915: Make workaround verification *optional* Chris Wilson
2019-04-17  9:37   ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 05/32] drm/i915/selftests: Verify whitelist of context registers Chris Wilson
2019-04-17  7:56 ` [PATCH 06/32] drm/i915: Store the default sseu setup on the engine Chris Wilson
2019-04-17  9:40   ` Tvrtko Ursulin
2019-04-24  9:45     ` Chris Wilson
2019-04-17  7:56 ` [PATCH 07/32] drm/i915: Move GraphicsTechnology files under gt/ Chris Wilson
2019-04-17  9:42   ` Tvrtko Ursulin
2019-04-18 12:04   ` Joonas Lahtinen
2019-04-23  8:57     ` Joonas Lahtinen
2019-04-23  9:40       ` Jani Nikula
2019-04-23 16:46         ` Rodrigo Vivi
2019-04-17  7:56 ` [PATCH 08/32] drm/i915: Introduce struct intel_wakeref Chris Wilson
2019-04-17  9:45   ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 09/32] drm/i915: Pull the GEM powermangement coupling into its own file Chris Wilson
2019-04-17  7:56 ` [PATCH 10/32] drm/i915: Introduce context->enter() and context->exit() Chris Wilson
2019-04-17  7:56 ` [PATCH 11/32] drm/i915: Pass intel_context to i915_request_create() Chris Wilson
2019-04-17  7:56 ` [PATCH 12/32] drm/i915: Invert the GEM wakeref hierarchy Chris Wilson
2019-04-18 12:42   ` Tvrtko Ursulin
2019-04-18 13:07     ` Chris Wilson
2019-04-18 13:22       ` Chris Wilson
2019-04-23 13:02   ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 13/32] drm/i915/gvt: Pin the per-engine GVT shadow contexts Chris Wilson
2019-04-17  7:56 ` [PATCH 14/32] drm/i915: Explicitly pin the logical context for execbuf Chris Wilson
2019-04-17  7:56 ` [PATCH 15/32] drm/i915: Export intel_context_instance() Chris Wilson
2019-04-17  7:56 ` [PATCH 16/32] drm/i915/selftests: Use the real kernel context for sseu isolation tests Chris Wilson
2019-04-17  7:56 ` [PATCH 17/32] drm/i915/selftests: Pass around intel_context for sseu Chris Wilson
2019-04-17  7:56 ` [PATCH 18/32] drm/i915: Pass intel_context to intel_context_pin_lock() Chris Wilson
2019-04-17  7:56 ` [PATCH 19/32] drm/i915: Split engine setup/init into two phases Chris Wilson
2019-04-17  7:56 ` [PATCH 20/32] drm/i915: Switch back to an array of logical per-engine HW contexts Chris Wilson
2019-04-17  7:56 ` [PATCH 21/32] drm/i915: Remove intel_context.active_link Chris Wilson
2019-04-17  9:47   ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 22/32] drm/i915: Move i915_request_alloc into selftests/ Chris Wilson
2019-04-17  7:56 ` [PATCH 23/32] drm/i915: Allow multiple user handles to the same VM Chris Wilson
2019-04-17  7:56 ` [PATCH 24/32] drm/i915: Restore control over ppgtt for context creation ABI Chris Wilson
2019-04-17  7:56 ` [PATCH 25/32] drm/i915: Allow a context to define its set of engines Chris Wilson
2019-04-17  9:50   ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 26/32] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation Chris Wilson
2019-04-17  7:56 ` [PATCH 27/32] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
2019-04-17  9:50   ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 28/32] drm/i915: Load balancing across a virtual engine Chris Wilson
2019-04-17 11:26   ` Tvrtko Ursulin
2019-04-17 13:51     ` Chris Wilson
2019-04-17  7:56 ` [PATCH 29/32] drm/i915: Apply an execution_mask to the virtual_engine Chris Wilson
2019-04-17 11:43   ` Tvrtko Ursulin
2019-04-17 11:57     ` Chris Wilson
2019-04-17 12:35       ` Tvrtko Ursulin
2019-04-17 12:46         ` Chris Wilson
2019-04-17 13:32           ` Tvrtko Ursulin
2019-04-18  7:24             ` Chris Wilson
2019-04-17  7:56 ` [PATCH 30/32] drm/i915: Extend execution fence to support a callback Chris Wilson
2019-04-17  7:56 ` [PATCH 31/32] drm/i915/execlists: Virtual engine bonding Chris Wilson
2019-04-18  6:47   ` Tvrtko Ursulin
2019-04-18  6:57     ` Chris Wilson
2019-04-18  8:57       ` Tvrtko Ursulin
2019-04-18  9:13         ` Chris Wilson
2019-04-18  9:50           ` Tvrtko Ursulin
2019-04-18  9:59             ` Chris Wilson
2019-04-18 10:24               ` Tvrtko Ursulin
2019-04-17  7:56 ` [PATCH 32/32] drm/i915: Allow specification of parallel execbuf Chris Wilson
2019-04-17  8:46 ` [PATCH 01/32] drm/i915: Seal races between async GPU cancellation, retirement and signaling Chris Wilson
2019-04-17 11:33 ` ✗ Fi.CI.BAT: failure for series starting with [01/32] " Patchwork
2019-04-18 10:32 ` [PATCH 01/32] " Tvrtko Ursulin
2019-04-18 10:40   ` Chris Wilson
2019-04-23 12:59 ` Tvrtko Ursulin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7c6ad611-8373-7ce8-cf3f-9a4ec2f4fffa@linux.intel.com \
    --to=tvrtko.ursulin@linux.intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.