All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 04/33] drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh
Date: Mon, 8 Aug 2016 11:33:56 +0200	[thread overview]
Message-ID: <20160808093356.GZ6232@phenom.ffwll.local> (raw)
In-Reply-To: <1470581141-14432-5-git-send-email-chris@chris-wilson.co.uk>

On Sun, Aug 07, 2016 at 03:45:12PM +0100, Chris Wilson wrote:
> The bottom-half we use for processing the breadcrumb interrupt is a
> task, which is an RCU protected struct. When accessing this struct, we
> need to be holding the RCU read lock to prevent it disappearing beneath
> us. We can use the RCU annotation to mark our irq_seqno_bh pointer as
> being under RCU guard and then use the RCU accessors to both provide
> correct ordering of access through the pointer.
> 
> Most notably, this fixes the access from hard irq context to use the RCU
> read lock, which both Daniel and Tvrtko complained about.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

I'll leave sparse-checking this to 0day and runtime lockdep checking to CI
;-)

> ---
>  drivers/gpu/drm/i915/i915_drv.h          |  2 +-
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 22 +++++++++-------------
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 --
>  drivers/gpu/drm/i915/intel_ringbuffer.h  | 21 ++++++++++++++-------
>  4 files changed, 24 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index feec00f769e1..3d546b5c2e4c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3848,7 +3848,7 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
>  	 * is woken.
>  	 */
>  	if (engine->irq_seqno_barrier &&
> -	    READ_ONCE(engine->breadcrumbs.irq_seqno_bh) == current &&
> +	    rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh) == current &&
>  	    cmpxchg_relaxed(&engine->breadcrumbs.irq_posted, 1, 0)) {
>  		struct task_struct *tsk;
>  
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 8ecb3b6538fc..7552bd039565 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -60,10 +60,8 @@ static void intel_breadcrumbs_fake_irq(unsigned long data)
>  	 * every jiffie in order to kick the oldest waiter to do the
>  	 * coherent seqno check.
>  	 */
> -	rcu_read_lock();
>  	if (intel_engine_wakeup(engine))
>  		mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
> -	rcu_read_unlock();
>  }
>  
>  static void irq_enable(struct intel_engine_cs *engine)
> @@ -232,7 +230,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  	}
>  	rb_link_node(&wait->node, parent, p);
>  	rb_insert_color(&wait->node, &b->waiters);
> -	GEM_BUG_ON(!first && !b->irq_seqno_bh);
> +	GEM_BUG_ON(!first && !rcu_access_pointer(b->irq_seqno_bh));

Nit: reading through rcu docs I think the suggested accessor here is
rcu_dereference_protected for write-side access. That one allows the
compiler full freedom for reordering. otoh it's a bit more noise-y and meh
about optional debug checks anyway. So with or without that change:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  
>  	if (completed) {
>  		struct rb_node *next = rb_next(completed);
> @@ -242,7 +240,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  			GEM_BUG_ON(first);
>  			b->timeout = wait_timeout();
>  			b->first_wait = to_wait(next);
> -			smp_store_mb(b->irq_seqno_bh, b->first_wait->tsk);
> +			rcu_assign_pointer(b->irq_seqno_bh, b->first_wait->tsk);
>  			/* As there is a delay between reading the current
>  			 * seqno, processing the completed tasks and selecting
>  			 * the next waiter, we may have missed the interrupt
> @@ -269,7 +267,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
>  		b->timeout = wait_timeout();
>  		b->first_wait = wait;
> -		smp_store_mb(b->irq_seqno_bh, wait->tsk);
> +		rcu_assign_pointer(b->irq_seqno_bh, wait->tsk);
>  		/* After assigning ourselves as the new bottom-half, we must
>  		 * perform a cursory check to prevent a missed interrupt.
>  		 * Either we miss the interrupt whilst programming the hardware,
> @@ -280,7 +278,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  		 */
>  		__intel_breadcrumbs_enable_irq(b);
>  	}
> -	GEM_BUG_ON(!b->irq_seqno_bh);
> +	GEM_BUG_ON(!rcu_access_pointer(b->irq_seqno_bh));
>  	GEM_BUG_ON(!b->first_wait);
>  	GEM_BUG_ON(rb_first(&b->waiters) != &b->first_wait->node);
>  
> @@ -335,7 +333,7 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
>  		const int priority = wakeup_priority(b, wait->tsk);
>  		struct rb_node *next;
>  
> -		GEM_BUG_ON(b->irq_seqno_bh != wait->tsk);
> +		GEM_BUG_ON(rcu_access_pointer(b->irq_seqno_bh) != wait->tsk);
>  
>  		/* We are the current bottom-half. Find the next candidate,
>  		 * the first waiter in the queue on the remaining oldest
> @@ -379,13 +377,13 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
>  			 */
>  			b->timeout = wait_timeout();
>  			b->first_wait = to_wait(next);
> -			smp_store_mb(b->irq_seqno_bh, b->first_wait->tsk);
> +			rcu_assign_pointer(b->irq_seqno_bh, b->first_wait->tsk);
>  			if (b->first_wait->seqno != wait->seqno)
>  				__intel_breadcrumbs_enable_irq(b);
> -			wake_up_process(b->irq_seqno_bh);
> +			wake_up_process(b->first_wait->tsk);
>  		} else {
>  			b->first_wait = NULL;
> -			WRITE_ONCE(b->irq_seqno_bh, NULL);
> +			rcu_assign_pointer(b->irq_seqno_bh, NULL);
>  			__intel_breadcrumbs_disable_irq(b);
>  		}
>  	} else {
> @@ -399,7 +397,7 @@ out_unlock:
>  	GEM_BUG_ON(b->first_wait == wait);
>  	GEM_BUG_ON(rb_first(&b->waiters) !=
>  		   (b->first_wait ? &b->first_wait->node : NULL));
> -	GEM_BUG_ON(!b->irq_seqno_bh ^ RB_EMPTY_ROOT(&b->waiters));
> +	GEM_BUG_ON(!rcu_access_pointer(b->irq_seqno_bh) ^ RB_EMPTY_ROOT(&b->waiters));
>  	spin_unlock(&b->lock);
>  }
>  
> @@ -596,11 +594,9 @@ unsigned int intel_kick_waiters(struct drm_i915_private *i915)
>  	 * RCU lock, i.e. as we call wake_up_process() we must be holding the
>  	 * rcu_read_lock().
>  	 */
> -	rcu_read_lock();
>  	for_each_engine(engine, i915)
>  		if (unlikely(intel_engine_wakeup(engine)))
>  			mask |= intel_engine_flag(engine);
> -	rcu_read_unlock();
>  
>  	return mask;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e08a1e1b04e4..16b726fe33eb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2410,9 +2410,7 @@ void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
>  	/* After manually advancing the seqno, fake the interrupt in case
>  	 * there are any waiters for that seqno.
>  	 */
> -	rcu_read_lock();
>  	intel_engine_wakeup(engine);
> -	rcu_read_unlock();
>  }
>  
>  static void gen6_bsd_submit_request(struct drm_i915_gem_request *request)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 4aed4586b0b6..66dc93469076 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -171,7 +171,7 @@ struct intel_engine_cs {
>  	 * the overhead of waking that client is much preferred.
>  	 */
>  	struct intel_breadcrumbs {
> -		struct task_struct *irq_seqno_bh; /* bh for user interrupts */
> +		struct task_struct __rcu *irq_seqno_bh; /* bh for interrupts */
>  		bool irq_posted;
>  
>  		spinlock_t lock; /* protects the lists of requests */
> @@ -541,23 +541,30 @@ void intel_engine_enable_signaling(struct drm_i915_gem_request *request);
>  
>  static inline bool intel_engine_has_waiter(struct intel_engine_cs *engine)
>  {
> -	return READ_ONCE(engine->breadcrumbs.irq_seqno_bh);
> +	return rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh);
>  }
>  
>  static inline bool intel_engine_wakeup(struct intel_engine_cs *engine)
>  {
>  	bool wakeup = false;
> -	struct task_struct *tsk = READ_ONCE(engine->breadcrumbs.irq_seqno_bh);
> +
>  	/* Note that for this not to dangerously chase a dangling pointer,
> -	 * the caller is responsible for ensure that the task remain valid for
> -	 * wake_up_process() i.e. that the RCU grace period cannot expire.
> +	 * we must hold the rcu_read_lock here.
>  	 *
>  	 * Also note that tsk is likely to be in !TASK_RUNNING state so an
>  	 * early test for tsk->state != TASK_RUNNING before wake_up_process()
>  	 * is unlikely to be beneficial.
>  	 */
> -	if (tsk)
> -		wakeup = wake_up_process(tsk);
> +	if (rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh)) {
> +		struct task_struct *tsk;
> +
> +		rcu_read_lock();
> +		tsk = rcu_dereference(engine->breadcrumbs.irq_seqno_bh);
> +		if (tsk)
> +			wakeup = wake_up_process(tsk);
> +		rcu_read_unlock();
> +	}
> +
>  	return wakeup;
>  }
>  
> -- 
> 2.8.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-08-08  9:34 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-07 14:45 First class VMA, take 2 Chris Wilson
2016-08-07 14:45 ` [PATCH 01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance Chris Wilson
2016-08-08  9:12   ` Daniel Vetter
2016-08-08  9:30     ` Chris Wilson
2016-08-08  9:45       ` Chris Wilson
2016-08-09  6:36         ` Joonas Lahtinen
2016-08-09  7:14           ` Chris Wilson
2016-08-09  8:48             ` Joonas Lahtinen
2016-08-09  9:05               ` Chris Wilson
2016-08-10 10:12                 ` Daniel Vetter
2016-08-10 10:13                   ` Daniel Vetter
2016-08-10 11:00                     ` Joonas Lahtinen
2016-08-12  9:50                       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 02/33] drm/i915: Do not overwrite the request with zero on reallocation Chris Wilson
2016-08-08  9:25   ` Daniel Vetter
2016-08-08  9:56     ` Chris Wilson
2016-08-09  6:32       ` Daniel Vetter
2016-08-07 14:45 ` [PATCH 03/33] drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs Chris Wilson
2016-08-09 14:08   ` [PATCH v2] " Chris Wilson
2016-08-09 14:10   ` [PATCH v3] " Chris Wilson
2016-08-09 15:24     ` Mika Kuoppala
2016-08-07 14:45 ` [PATCH 04/33] drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh Chris Wilson
2016-08-08  9:33   ` Daniel Vetter [this message]
2016-08-12  9:56   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 05/33] drm/i915: Reduce amount of duplicate buffer information captured on error Chris Wilson
2016-08-10  7:04   ` Joonas Lahtinen
2016-08-10  7:15     ` Chris Wilson
2016-08-10  8:07       ` Joonas Lahtinen
2016-08-10  8:36         ` Chris Wilson
2016-08-10 10:51           ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 06/33] drm/i915: Stop the machine whilst capturing the GPU crash dump Chris Wilson
2016-08-07 14:45 ` [PATCH 07/33] drm/i915: Store the active context object on all engines upon error Chris Wilson
2016-08-09  9:02   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 08/33] drm/i915: Move setting of request->batch into its single callsite Chris Wilson
2016-08-09 15:53   ` Mika Kuoppala
2016-08-09 16:04     ` Chris Wilson
2016-08-10  7:19   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 09/33] drm/i915: Mark unmappable GGTT entries as PIN_HIGH Chris Wilson
2016-08-08  9:09   ` Joonas Lahtinen
2016-08-09 11:05   ` Tvrtko Ursulin
2016-08-09 11:13     ` Chris Wilson
2016-08-09 11:20       ` Chris Wilson
2016-08-07 14:45 ` [PATCH 10/33] drm/i915: Remove inactive/active list from debugfs Chris Wilson
2016-08-09 10:29   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 11/33] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins Chris Wilson
2016-08-09 10:39   ` Joonas Lahtinen
2016-08-09 10:46     ` Chris Wilson
2016-08-09 11:32       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 12/33] drm/i915: Reduce i915_gem_objects to only show object information Chris Wilson
2016-08-10  7:29   ` Joonas Lahtinen
2016-08-10  7:38     ` Chris Wilson
2016-08-10  8:10       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 13/33] drm/i915: Remove redundant WARN_ON from __i915_add_request() Chris Wilson
2016-08-08  9:03   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 14/33] drm/i915: Create a VMA for an object Chris Wilson
2016-08-08  9:01   ` Joonas Lahtinen
2016-08-08  9:09     ` Chris Wilson
2016-08-10 10:58       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 15/33] drm/i915: Track pinned vma inside guc Chris Wilson
2016-08-11 16:19   ` Dave Gordon
2016-08-11 16:41     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 16/33] drm/i915: Convert fence computations to use vma directly Chris Wilson
2016-08-09 10:27   ` Joonas Lahtinen
2016-08-09 10:33     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 17/33] drm/i915: Use VMA directly for checking tiling parameters Chris Wilson
2016-08-09  6:18   ` Joonas Lahtinen
2016-08-09  8:03     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 18/33] drm/i915: Use VMA as the primary object for context state Chris Wilson
2016-08-10  8:03   ` Joonas Lahtinen
2016-08-10  8:25     ` Chris Wilson
2016-08-10 10:54       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 19/33] drm/i915: Only clflush the context object when binding Chris Wilson
2016-08-10  8:41   ` Joonas Lahtinen
2016-08-10  9:02     ` Chris Wilson
2016-08-10 10:50       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 20/33] drm/i915: Use VMA for ringbuffer tracking Chris Wilson
2016-08-11  9:32   ` Joonas Lahtinen
2016-08-11  9:58     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 21/33] drm/i915: Use VMA for scratch page tracking Chris Wilson
2016-08-08  8:00   ` [PATCH 1/3] " Chris Wilson
2016-08-08  8:00     ` [PATCH 2/3] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c Chris Wilson
2016-08-08  9:24       ` Matthew Auld
2016-08-08  8:00     ` [PATCH 3/3] drm/i915: Move common seqno reset " Chris Wilson
2016-08-08  9:40       ` Matthew Auld
2016-08-08 10:15         ` Chris Wilson
2016-08-08 15:34           ` Matthew Auld
2016-08-11 10:06   ` [PATCH 21/33] drm/i915: Use VMA for scratch page tracking Joonas Lahtinen
2016-08-11 10:22     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 22/33] drm/i915/overlay: Use VMA as the primary tracker for images Chris Wilson
2016-08-11 10:17   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 23/33] drm/i915: Use VMA as the primary tracker for semaphore page Chris Wilson
2016-08-11 10:42   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 24/33] drm/i915: Use VMA for render state page tracking Chris Wilson
2016-08-11 10:46   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 25/33] drm/i915: Use VMA for wa_ctx tracking Chris Wilson
2016-08-11 10:53   ` Joonas Lahtinen
2016-08-11 11:02     ` Chris Wilson
2016-08-11 12:41       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 26/33] drm/i915: Track pinned VMA Chris Wilson
2016-08-11 12:18   ` Joonas Lahtinen
2016-08-11 12:37     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 27/33] drm/i915: Print the batchbuffer offset next to BBADDR in error state Chris Wilson
2016-08-11 12:24   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 28/33] drm/i915: Move per-request pid from request to ctx Chris Wilson
2016-08-11 12:32   ` Joonas Lahtinen
2016-08-11 12:41     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 29/33] drm/i915: Only record active and pending requests upon a GPU hang Chris Wilson
2016-08-11 12:36   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 30/33] drm/i915: Record the RING_MODE register for post-mortem debugging Chris Wilson
2016-08-08 11:35   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 31/33] drm/i915: Always use the GTT for error capture Chris Wilson
2016-08-07 14:45 ` [PATCH 32/33] drm/i915: Consolidate error object printing Chris Wilson
2016-08-09 11:44   ` Joonas Lahtinen
2016-08-09 11:53     ` Chris Wilson
2016-08-10 10:55       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 33/33] drm/i915: Compress GPU objects in error state Chris Wilson
2016-08-10 10:32   ` Joonas Lahtinen
2016-08-10 10:52     ` Chris Wilson
2016-08-10 11:26       ` Joonas Lahtinen
2016-08-07 15:16 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance Patchwork
2016-08-08  9:46 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev4) Patchwork
2016-08-08 10:34 ` ✗ Fi.CI.BAT: " Patchwork
2016-08-09 14:10 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev5) Patchwork
2016-08-09 14:20 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev6) Patchwork
2016-08-10  6:43 ` Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160808093356.GZ6232@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.