From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org,
	"Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Eero Tamminen <eero.t.tamminen@intel.com>,
	"Rantala, Valtteri" <valtteri.rantala@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH 03/32] drm/i915: Only spin whilst waiting on the current request
Date: Fri, 18 Dec 2015 17:12:21 +0100
Message-ID: <20151218161221.GC30437@phenom.ffwll.local>
In-Reply-To: <1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk>

On Fri, Dec 11, 2015 at 11:32:59AM +0000, Chris Wilson wrote:
> Limit busywaiting only to the request currently being processed by the
> GPU. If the request is not currently being processed by the GPU, there
> is a very low likelihood of it being completed within the 2 microsecond
> spin timeout and so we will just be wasting CPU cycles.
> 
> v2: Check for logical inversion when rebasing - we were incorrectly
> checking for this request being active, and instead busywaiting for
> when the GPU was not yet processing the request of interest.
> 
> v3: Try another colour for the seqno names.
> v4: Another colour for the function names.
> 
> v5: Remove the forced coherency when checking for the active request. On
> reflection and plenty of recent experimentation, the issue is not a
> cache coherency problem - but an irq/seqno ordering problem (timing issue).
> Here, we do not need the w/a to force ordering of the read with an
> interrupt.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> Cc: stable@vger.kernel.org

Merged these 3 patches, thanks.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 27 +++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem.c |  8 +++++++-
>  2 files changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5edd39352e97..8c4303b664d9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2182,8 +2182,17 @@ struct drm_i915_gem_request {
>  	struct drm_i915_private *i915;
>  	struct intel_engine_cs *ring;
>  
> -	/** GEM sequence number associated with this request. */
> -	uint32_t seqno;
> +	/** GEM sequence number associated with the previous request,
> +	 * when the HWS breadcrumb is equal to this the GPU is processing
> +	 * this request.
> +	 */
> +	u32 previous_seqno;
> +
> +	/** GEM sequence number associated with this request,
> +	 * when the HWS breadcrumb is equal or greater than this the GPU
> +	 * has finished processing this request.
> +	 */
> +	u32 seqno;
>  
>  	/** Position in the ringbuffer of the start of the request */
>  	u32 head;
> @@ -2958,15 +2967,17 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>  	return (int32_t)(seq1 - seq2) >= 0;
>  }
>  
> +static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
> +					   bool lazy_coherency)
> +{
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> +	return i915_seqno_passed(seqno, req->previous_seqno);
> +}
> +
>  static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
>  					      bool lazy_coherency)
>  {
> -	u32 seqno;
> -
> -	BUG_ON(req == NULL);
> -
> -	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> -
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
>  	return i915_seqno_passed(seqno, req->seqno);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 46a84c447d8f..29d98ddbbc80 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1193,9 +1193,13 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  	 * takes to sleep on a request, on the order of a microsecond.
>  	 */
>  
> -	if (i915_gem_request_get_ring(req)->irq_refcount)
> +	if (req->ring->irq_refcount)
>  		return -EBUSY;
>  
> +	/* Only spin if we know the GPU is processing this request */
> +	if (!i915_gem_request_started(req, true))
> +		return -EAGAIN;
> +
>  	timeout = local_clock_us(&cpu) + 5;
>  	while (!need_resched()) {
>  		if (i915_gem_request_completed(req, true))
> @@ -1209,6 +1213,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  
>  		cpu_relax_lowlatency();
>  	}
> +
>  	if (i915_gem_request_completed(req, false))
>  		return 0;
>  
> @@ -2600,6 +2605,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>  	request->batch_obj = obj;
>  
>  	request->emitted_jiffies = jiffies;
> +	request->previous_seqno = ring->last_submitted_seqno;
>  	ring->last_submitted_seqno = request->seqno;
>  	list_add_tail(&request->list, &ring->request_list);
>  
> -- 
> 2.6.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
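
For readers following the quoted patch, here is a minimal user-space sketch of the "started but not yet completed" test that gates the spin. It is an illustration under stated assumptions, not the driver code: the HWS breadcrumb is passed in as a plain integer rather than read through get_seqno(), and the names model_request, seqno_passed, request_started, request_completed and worth_spinning are hypothetical. Only the signed-difference comparison is taken directly from i915_seqno_passed() in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two seqno fields the patch adds/keeps. */
struct model_request {
	uint32_t previous_seqno; /* seqno of the request submitted just before */
	uint32_t seqno;          /* seqno of this request */
};

/*
 * True when seq1 is at or after seq2. The signed difference keeps the
 * comparison valid across 32-bit wraparound, as in the quoted patch.
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* The GPU has finished the previous request, so it is working on this one. */
static bool request_started(const struct model_request *req, uint32_t breadcrumb)
{
	return seqno_passed(breadcrumb, req->previous_seqno);
}

/* The GPU has written this request's own seqno (or a later one). */
static bool request_completed(const struct model_request *req, uint32_t breadcrumb)
{
	return seqno_passed(breadcrumb, req->seqno);
}

/* Spinning only makes sense while the request is in flight on the GPU. */
static bool worth_spinning(const struct model_request *req, uint32_t breadcrumb)
{
	return request_started(req, breadcrumb) &&
	       !request_completed(req, breadcrumb);
}
```

With previous_seqno = 0xfffffffe and seqno = 1 (values picked only to straddle the wraparound), a breadcrumb of 0xfffffffd means the request has not started, 0xfffffffe means it is in flight, and 1 means it has completed. The sketch leaves out the irq_refcount and need_resched() checks and the timeout that the patched __i915_spin_request() also applies.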