All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org, "Rogozhkin,
	Dmitry V" <dmitry.v.rogozhkin@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Eero Tamminen <eero.t.tamminen@intel.com>,
	"Rantala, Valtteri" <valtteri.rantala@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH 03/32] drm/i915: Only spin whilst waiting on the current request
Date: Fri, 18 Dec 2015 17:12:21 +0100	[thread overview]
Message-ID: <20151218161221.GC30437@phenom.ffwll.local> (raw)
In-Reply-To: <1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk>

On Fri, Dec 11, 2015 at 11:32:59AM +0000, Chris Wilson wrote:
> Limit busywaiting only to the request currently being processed by the
> GPU. If the request is not currently being processed by the GPU, there
> is a very low likelihood of it being completed within the 2 microsecond
> spin timeout and so we will just be wasting CPU cycles.
> 
> v2: Check for logical inversion when rebasing - we were incorrectly
> checking for this request being active, and instead busywaiting for
> when the GPU was not yet processing the request of interest.
> 
> v3: Try another colour for the seqno names.
> v4: Another colour for the function names.
> 
> v5: Remove the forced coherency when checking for the active request. On
> reflection and plenty of recent experimentation, the issue is not a
> cache coherency problem - but an irq/seqno ordering problem (timing issue).
> Here, we do not need the w/a to force ordering of the read with an
> interrupt.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> Cc: stable@vger.kernel.org

Merged these 3 patches, thanks.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 27 +++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem.c |  8 +++++++-
>  2 files changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5edd39352e97..8c4303b664d9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2182,8 +2182,17 @@ struct drm_i915_gem_request {
>  	struct drm_i915_private *i915;
>  	struct intel_engine_cs *ring;
>  
> -	/** GEM sequence number associated with this request. */
> -	uint32_t seqno;
> +	 /** GEM sequence number associated with the previous request,
> +	  * when the HWS breadcrumb is equal to this the GPU is processing
> +	  * this request.
> +	  */
> +	u32 previous_seqno;
> +
> +	 /** GEM sequence number associated with this request,
> +	  * when the HWS breadcrumb is equal or greater than this the GPU
> +	  * has finished processing this request.
> +	  */
> +	u32 seqno;
>  
>  	/** Position in the ringbuffer of the start of the request */
>  	u32 head;
> @@ -2958,15 +2967,17 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>  	return (int32_t)(seq1 - seq2) >= 0;
>  }
>  
> +static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
> +					   bool lazy_coherency)
> +{
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> +	return i915_seqno_passed(seqno, req->previous_seqno);
> +}
> +
>  static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
>  					      bool lazy_coherency)
>  {
> -	u32 seqno;
> -
> -	BUG_ON(req == NULL);
> -
> -	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> -
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
>  	return i915_seqno_passed(seqno, req->seqno);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 46a84c447d8f..29d98ddbbc80 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1193,9 +1193,13 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  	 * takes to sleep on a request, on the order of a microsecond.
>  	 */
>  
> -	if (i915_gem_request_get_ring(req)->irq_refcount)
> +	if (req->ring->irq_refcount)
>  		return -EBUSY;
>  
> +	/* Only spin if we know the GPU is processing this request */
> +	if (!i915_gem_request_started(req, true))
> +		return -EAGAIN;
> +
>  	timeout = local_clock_us(&cpu) + 5;
>  	while (!need_resched()) {
>  		if (i915_gem_request_completed(req, true))
> @@ -1209,6 +1213,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  
>  		cpu_relax_lowlatency();
>  	}
> +
>  	if (i915_gem_request_completed(req, false))
>  		return 0;
>  
> @@ -2600,6 +2605,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>  	request->batch_obj = obj;
>  
>  	request->emitted_jiffies = jiffies;
> +	request->previous_seqno = ring->last_submitted_seqno;
>  	ring->last_submitted_seqno = request->seqno;
>  	list_add_tail(&request->list, &ring->request_list);
>  
> -- 
> 2.6.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	intel-gfx@lists.freedesktop.org, stable@vger.kernel.org,
	"Rantala, Valtteri" <valtteri.rantala@intel.com>,
	Eero Tamminen <eero.t.tamminen@intel.com>
Subject: Re: [PATCH 03/32] drm/i915: Only spin whilst waiting on the current request
Date: Fri, 18 Dec 2015 17:12:21 +0100	[thread overview]
Message-ID: <20151218161221.GC30437@phenom.ffwll.local> (raw)
In-Reply-To: <1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk>

On Fri, Dec 11, 2015 at 11:32:59AM +0000, Chris Wilson wrote:
> Limit busywaiting only to the request currently being processed by the
> GPU. If the request is not currently being processed by the GPU, there
> is a very low likelihood of it being completed within the 2 microsecond
> spin timeout and so we will just be wasting CPU cycles.
> 
> v2: Check for logical inversion when rebasing - we were incorrectly
> checking for this request being active, and instead busywaiting for
> when the GPU was not yet processing the request of interest.
> 
> v3: Try another colour for the seqno names.
> v4: Another colour for the function names.
> 
> v5: Remove the forced coherency when checking for the active request. On
> reflection and plenty of recent experimentation, the issue is not a
> cache coherency problem - but an irq/seqno ordering problem (timing issue).
> Here, we do not need the w/a to force ordering of the read with an
> interrupt.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> Cc: stable@vger.kernel.org

Merged these 3 patches, thanks.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 27 +++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem.c |  8 +++++++-
>  2 files changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5edd39352e97..8c4303b664d9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2182,8 +2182,17 @@ struct drm_i915_gem_request {
>  	struct drm_i915_private *i915;
>  	struct intel_engine_cs *ring;
>  
> -	/** GEM sequence number associated with this request. */
> -	uint32_t seqno;
> +	 /** GEM sequence number associated with the previous request,
> +	  * when the HWS breadcrumb is equal to this the GPU is processing
> +	  * this request.
> +	  */
> +	u32 previous_seqno;
> +
> +	 /** GEM sequence number associated with this request,
> +	  * when the HWS breadcrumb is equal or greater than this the GPU
> +	  * has finished processing this request.
> +	  */
> +	u32 seqno;
>  
>  	/** Position in the ringbuffer of the start of the request */
>  	u32 head;
> @@ -2958,15 +2967,17 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>  	return (int32_t)(seq1 - seq2) >= 0;
>  }
>  
> +static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
> +					   bool lazy_coherency)
> +{
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> +	return i915_seqno_passed(seqno, req->previous_seqno);
> +}
> +
>  static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
>  					      bool lazy_coherency)
>  {
> -	u32 seqno;
> -
> -	BUG_ON(req == NULL);
> -
> -	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> -
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
>  	return i915_seqno_passed(seqno, req->seqno);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 46a84c447d8f..29d98ddbbc80 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1193,9 +1193,13 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  	 * takes to sleep on a request, on the order of a microsecond.
>  	 */
>  
> -	if (i915_gem_request_get_ring(req)->irq_refcount)
> +	if (req->ring->irq_refcount)
>  		return -EBUSY;
>  
> +	/* Only spin if we know the GPU is processing this request */
> +	if (!i915_gem_request_started(req, true))
> +		return -EAGAIN;
> +
>  	timeout = local_clock_us(&cpu) + 5;
>  	while (!need_resched()) {
>  		if (i915_gem_request_completed(req, true))
> @@ -1209,6 +1213,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  
>  		cpu_relax_lowlatency();
>  	}
> +
>  	if (i915_gem_request_completed(req, false))
>  		return 0;
>  
> @@ -2600,6 +2605,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>  	request->batch_obj = obj;
>  
>  	request->emitted_jiffies = jiffies;
> +	request->previous_seqno = ring->last_submitted_seqno;
>  	ring->last_submitted_seqno = request->seqno;
>  	list_add_tail(&request->list, &ring->request_list);
>  
> -- 
> 2.6.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2015-12-18 16:12 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-11 11:32 Slaughter the thundering i915_wait_request, v3? Chris Wilson
2015-12-11 11:32 ` [PATCH 01/32] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
2015-12-11 11:32 ` [PATCH 02/32] drm/i915: Limit the busy wait on requests to 5us not 10ms! Chris Wilson
2015-12-11 11:32 ` [PATCH 03/32] drm/i915: Only spin whilst waiting on the current request Chris Wilson
2015-12-18 16:12   ` Daniel Vetter [this message]
2015-12-18 16:12     ` Daniel Vetter
2015-12-11 11:33 ` [PATCH 04/32] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
2015-12-16  9:31   ` Daniel Vetter
2015-12-16  9:33   ` Daniel Vetter
2015-12-16  9:36     ` Daniel Vetter
2015-12-16 10:26     ` Chris Wilson
2015-12-11 11:33 ` [PATCH 05/32] drm/i915: Simplify checking of GPU reset_counter in display pageflips Chris Wilson
2015-12-16  9:31   ` Daniel Vetter
2015-12-11 11:33 ` [PATCH 06/32] drm/i915: Tighten reset_counter for reset status Chris Wilson
2015-12-16  9:35   ` Daniel Vetter
2015-12-11 11:33 ` [PATCH 07/32] drm/i915: Store the reset counter when constructing a request Chris Wilson
2015-12-16  9:44   ` Daniel Vetter
2015-12-16 10:19     ` Chris Wilson
2016-01-04 15:58       ` Dave Gordon
2016-01-04 16:10         ` Chris Wilson
2016-01-04 17:57           ` Dave Gordon
2015-12-11 11:33 ` [PATCH 08/32] drm/i915: Simplify reset_counter handling during atomic modesetting Chris Wilson
2015-12-16  9:46   ` Daniel Vetter
2015-12-11 11:33 ` [PATCH 09/32] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
2015-12-16  9:52   ` Daniel Vetter
2015-12-16 11:06     ` Chris Wilson
2015-12-16 12:53       ` Daniel Vetter
2015-12-11 11:33 ` [PATCH 10/32] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
2015-12-16  9:53   ` Daniel Vetter
2015-12-16 10:06     ` Chris Wilson
2015-12-11 11:33 ` [PATCH 11/32] drm/i915: Delay queuing hangcheck to wait-request Chris Wilson
2015-12-11 11:33 ` [PATCH 12/32] drm/i915: Remove the dedicated hangcheck workqueue Chris Wilson
2015-12-11 11:33 ` [PATCH 13/32] drm/i915: Make queueing the hangcheck work inline Chris Wilson
2015-12-11 11:33 ` [PATCH 14/32] drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ Chris Wilson
2016-01-05 12:45   ` Dave Gordon
2015-12-11 11:33 ` [PATCH 15/32] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
2015-12-14 12:21   ` Tvrtko Ursulin
2015-12-14 13:18     ` Chris Wilson
2015-12-18 10:01     ` [PATCH] " Chris Wilson
2015-12-21 11:23       ` [PATCH v16] " Chris Wilson
2015-12-11 11:33 ` [PATCH 16/32] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
2015-12-11 11:33 ` [PATCH 17/32] drm/i915: Remove the lazy_coherency parameter from request-completed? Chris Wilson
2015-12-14 14:59   ` Tvrtko Ursulin
2015-12-14 15:11     ` Chris Wilson
2016-01-04 11:16       ` Dave Gordon
2016-01-04 11:26         ` Chris Wilson
2016-01-04 13:02           ` Dave Gordon
2016-01-04 13:11             ` Chris Wilson
2016-01-04 14:09             ` Dave Gordon
2016-01-04 14:20               ` Chris Wilson
2016-01-04 17:28                 ` Dave Gordon
2015-12-11 11:33 ` [PATCH 18/32] drm/i915: Use HWS for seqno tracking everywhere Chris Wilson
2016-01-04 18:11   ` Dave Gordon
2016-01-04 19:37     ` Chris Wilson
2015-12-11 11:33 ` [PATCH 19/32] drm/i915: Check the CPU cached value of seqno after waking the waiter Chris Wilson
2015-12-11 11:33 ` [PATCH 20/32] drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor Chris Wilson
2015-12-11 11:33 ` [PATCH 21/32] drm/i915: Broadwell execlists needs exactly the same seqno w/a as legacy Chris Wilson
2016-01-04 21:34   ` Jesse Barnes
2016-01-05 10:20     ` Chris Wilson
2015-12-11 11:33 ` [PATCH 22/32] drm/i915: Stop setting wraparound seqno on initialisation Chris Wilson
2015-12-11 11:33 ` [PATCH 23/32] drm/i915: Only query timestamp when measuring elapsed time Chris Wilson
2015-12-11 11:33 ` [PATCH 24/32] drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno Chris Wilson
2015-12-11 11:33 ` [PATCH 25/32] drm/i915: Convert trace-irq to the breadcrumb waiter Chris Wilson
2015-12-12 15:20   ` [PATCH v2] " Chris Wilson
2015-12-12 15:34     ` [PATCH 1/3] drm/i915: Move GEM request routines to i915_gem_request.c Chris Wilson
2015-12-12 15:34       ` [PATCH 2/3] drm/i915: Move releasing of the GEM request from free to retire/cancel Chris Wilson
2015-12-12 15:34       ` [PATCH 3/3] drm/i915: Derive GEM requests from dma-fence Chris Wilson
2016-01-04 12:17         ` Dave Gordon
2016-01-04 12:22           ` Chris Wilson
2015-12-11 11:33 ` [PATCH 26/32] drm/i915: Move the get/put irq locking into the caller Chris Wilson
2015-12-11 11:33 ` [PATCH 27/32] drm/i915: Harden detection of missed interrupts Chris Wilson
2015-12-11 11:33 ` [PATCH 28/32] drm/i915: Remove debug noise on detecting fault-injection " Chris Wilson
2015-12-11 11:33 ` [PATCH 29/32] drm/i915: Only start retire worker when idle Chris Wilson
2015-12-15  9:26   ` [PATCH] " Chris Wilson
2015-12-11 11:33 ` [PATCH 30/32] drm/i915: Restore waitboost credit to the synchronous waiter Chris Wilson
2015-12-11 11:33 ` [PATCH 31/32] drm/i915: Add background commentary to "waitboosting" Chris Wilson
2015-12-11 11:33 ` [PATCH 32/32] drm/i915: Flush the RPS bottom-half when the GPU idles Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151218161221.GC30437@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dmitry.v.rogozhkin@intel.com \
    --cc=eero.t.tamminen@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=stable@vger.kernel.org \
    --cc=tvrtko.ursulin@linux.intel.com \
    --cc=valtteri.rantala@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.