* [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking @ 2017-03-27 3:28 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 3:28 UTC (permalink / raw) To: intel-gfx; +Cc: Chris Wilson, Mika Kuoppala, # v4 . 10+ If the request->wa_tail is 0 (because it landed exactly on the end of the ringbuffer), when we reconstruct request->tail following a reset we fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is never able to catch up with RING_TAIL and the GPU spins endlessly. If the ring contains a couple of breadcrumbs, even our hangcheck is unable to catch the busy-looping as the ACTHD and seqno continually advance. Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> # v4.10+ --- drivers/gpu/drm/i915/intel_lrc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3fdabba0a32d..c8dd848d2ebe 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, /* Reset WaIdleLiteRestore:bdw,skl as well */ request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32); + request->tail &= request->ring->size - 1; GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); } -- 2.11.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking @ 2017-03-27 3:28 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 3:28 UTC (permalink / raw) To: intel-gfx; +Cc: Mika Kuoppala, # v4 . 10+ If the request->wa_tail is 0 (because it landed exactly on the end of the ringbuffer), when we reconstruct request->tail following a reset we fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is never able to catch up with RING_TAIL and the GPU spins endlessly. If the ring contains a couple of breadcrumbs, even our hangcheck is unable to catch the busy-looping as the ACTHD and seqno continually advance. Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> # v4.10+ --- drivers/gpu/drm/i915/intel_lrc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3fdabba0a32d..c8dd848d2ebe 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, /* Reset WaIdleLiteRestore:bdw,skl as well */ request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32); + request->tail &= request->ring->size - 1; GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); } -- 2.11.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] drm/i915: Assert that the request->tail fits within the ring 2017-03-27 3:28 ` Chris Wilson (?) @ 2017-03-27 3:28 ` Chris Wilson -1 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 3:28 UTC (permalink / raw) To: intel-gfx In addition to being qword-aligned, the RING_TAIL offset must be within the ring! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/intel_lrc.c | 4 ++++ drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++ 2 files changed, 7 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index c8dd848d2ebe..3b84fbb7da5d 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -327,6 +327,7 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq) u32 *reg_state = ce->lrc_reg_state; GEM_BUG_ON(!IS_ALIGNED(rq->tail, 8)); + GEM_BUG_ON(rq->tail >= rq->ring->size); reg_state[CTX_RING_TAIL+1] = rq->tail; /* True 32b PPGTT with dynamic page allocation: update PDP @@ -1304,6 +1305,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32); request->tail &= request->ring->size - 1; GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); + GEM_BUG_ON(request->tail >= request->ring->size); } static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req) @@ -1515,6 +1517,7 @@ static void gen8_emit_breadcrumb(struct drm_i915_gem_request *request, u32 *cs) *cs++ = MI_NOOP; request->tail = intel_ring_offset(request, cs); GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); + GEM_BUG_ON(request->tail >= request->ring->size); gen8_emit_wa_tail(request, cs); } @@ -1543,6 +1546,7 @@ static void gen8_emit_breadcrumb_render(struct drm_i915_gem_request *request, *cs++ = MI_NOOP; request->tail = intel_ring_offset(request, cs); GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); + GEM_BUG_ON(request->tail >= request->ring->size); gen8_emit_wa_tail(request, cs); } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 4729ac7ac122..47921dcbedb3 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -775,6 +775,7 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request) i915_gem_request_submit(request); GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); + GEM_BUG_ON(request->tail >= request->ring->size); I915_WRITE_TAIL(request->engine, request->tail); } @@ -787,6 +788,7 @@ static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs) req->tail = intel_ring_offset(req, cs); GEM_BUG_ON(!IS_ALIGNED(req->tail, 8)); + GEM_BUG_ON(req->tail >= req->ring->size); } static const int i9xx_emit_breadcrumb_sz = 4; @@ -826,6 +828,7 @@ static void gen8_render_emit_breadcrumb(struct drm_i915_gem_request *req, req->tail = intel_ring_offset(req, cs); GEM_BUG_ON(!IS_ALIGNED(req->tail, 8)); + GEM_BUG_ON(req->tail >= req->ring->size); } static const int gen8_render_emit_breadcrumb_sz = 8; -- 2.11.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking 2017-03-27 3:28 ` Chris Wilson (?) (?) @ 2017-03-27 3:45 ` Patchwork -1 siblings, 0 replies; 12+ messages in thread From: Patchwork @ 2017-03-27 3:45 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx == Series Details == Series: series starting with [1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking URL : https://patchwork.freedesktop.org/series/21907/ State : success == Summary == Series 21907v1 Series without cover letter https://patchwork.freedesktop.org/api/1.0/series/21907/revisions/1/mbox/ Test gem_exec_fence: Subgroup await-hang-default: pass -> INCOMPLETE (fi-ilk-650) fdo#100144 Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-a: dmesg-warn -> PASS (fi-byt-n2820) fdo#100126 fdo#100144 https://bugs.freedesktop.org/show_bug.cgi?id=100144 fdo#100126 https://bugs.freedesktop.org/show_bug.cgi?id=100126 fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time: 455s fi-bdw-gvtdvm total:278 pass:256 dwarn:8 dfail:0 fail:0 skip:14 time: 456s fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39 time: 580s fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time: 540s fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time: 567s fi-byt-j1900 total:278 pass:251 dwarn:0 dfail:0 fail:0 skip:27 time: 507s fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31 time: 511s fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 438s fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 440s fi-ilk-650 total:48 pass:27 dwarn:0 dfail:0 fail:0 skip:20 time: 0s fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 515s fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 496s fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 485s fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 487s fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time: 603s fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time: 495s fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 532s fi-skl-gvtdvm total:278 pass:265 dwarn:0 dfail:0 fail:0 skip:13 time: 464s fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time: 554s fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time: 414s 295dfed68b4cc63cdc1747a915e0099a32bf0955 drm-tip: 2017y-03m-26d-01h-26m-45s UTC integration manifest 06fca80 drm/i915: Assert that the request->tail fits within the ring 5022d1b drm/i915/execlists: Wrap tail pointer after reset tweaking == Logs == For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4309/ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking 2017-03-27 3:28 ` Chris Wilson @ 2017-03-27 10:44 ` Mika Kuoppala -1 siblings, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2017-03-27 10:44 UTC (permalink / raw) To: Chris Wilson, intel-gfx; +Cc: # v4 . 10+ Chris Wilson <chris@chris-wilson.co.uk> writes: > If the request->wa_tail is 0 (because it landed exactly on the end of > the ringbuffer), when we reconstruct request->tail following a reset we > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is > never able to catch up with RING_TAIL and the GPU spins endlessly. If > the ring contains a couple of breadcrumbs, even our hangcheck is unable > to catch the busy-looping as the ACTHD and seqno continually advance. Tail is past ring size (on hw) and the ring contents has seqno writes. So we will replay the ring contents over and over and seqno advances and wraps back to the first breadcrumbs in ring? > Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang") > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: <stable@vger.kernel.org> # v4.10+ > --- > drivers/gpu/drm/i915/intel_lrc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 3fdabba0a32d..c8dd848d2ebe 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, > > /* Reset WaIdleLiteRestore:bdw,skl as well */ > request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32); > + request->tail &= request->ring->size - 1; > GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); > } > > -- > 2.11.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking @ 2017-03-27 10:44 ` Mika Kuoppala 0 siblings, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2017-03-27 10:44 UTC (permalink / raw) To: Chris Wilson, intel-gfx; +Cc: # v4 . 10+ Chris Wilson <chris@chris-wilson.co.uk> writes: > If the request->wa_tail is 0 (because it landed exactly on the end of > the ringbuffer), when we reconstruct request->tail following a reset we > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is > never able to catch up with RING_TAIL and the GPU spins endlessly. If > the ring contains a couple of breadcrumbs, even our hangcheck is unable > to catch the busy-looping as the ACTHD and seqno continually advance. Tail is past ring size (on hw) and the ring contents has seqno writes. So we will replay the ring contents over and over and seqno advances and wraps back to the first breadcrumbs in ring? > Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang") > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: <stable@vger.kernel.org> # v4.10+ > --- > drivers/gpu/drm/i915/intel_lrc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 3fdabba0a32d..c8dd848d2ebe 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine, > > /* Reset WaIdleLiteRestore:bdw,skl as well */ > request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32); > + request->tail &= request->ring->size - 1; > GEM_BUG_ON(!IS_ALIGNED(request->tail, 8)); > } > > -- > 2.11.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking 2017-03-27 10:44 ` Mika Kuoppala @ 2017-03-27 11:00 ` Chris Wilson -1 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 11:00 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+ On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > If the request->wa_tail is 0 (because it landed exactly on the end of > > the ringbuffer), when we reconstruct request->tail following a reset we > > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is > > never able to catch up with RING_TAIL and the GPU spins endlessly. If > > the ring contains a couple of breadcrumbs, even our hangcheck is unable > > to catch the busy-looping as the ACTHD and seqno continually advance. > > Tail is past ring size (on hw) and the ring contents has seqno writes. > So we will replay the ring contents over and over and seqno advances > and wraps back to the first breadcrumbs in ring? Yup. It was most confusing to watch. The execlist_port[] was static, RING_START was static, yet the seqno kept changing. I felt like I was hallucinating. That or insomnia. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking @ 2017-03-27 11:00 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 11:00 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+ On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > If the request->wa_tail is 0 (because it landed exactly on the end of > > the ringbuffer), when we reconstruct request->tail following a reset we > > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is > > never able to catch up with RING_TAIL and the GPU spins endlessly. If > > the ring contains a couple of breadcrumbs, even our hangcheck is unable > > to catch the busy-looping as the ACTHD and seqno continually advance. > > Tail is past ring size (on hw) and the ring contents has seqno writes. > So we will replay the ring contents over and over and seqno advances > and wraps back to the first breadcrumbs in ring? Yup. It was most confusing to watch. The execlist_port[] was static, RING_START was static, yet the seqno kept changing. I felt like I was hallucinating. That or insomnia. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking 2017-03-27 11:00 ` Chris Wilson @ 2017-03-27 11:07 ` Mika Kuoppala -1 siblings, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2017-03-27 11:07 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx, # v4 . 10+ Chris Wilson <chris@chris-wilson.co.uk> writes: > On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote: >> Chris Wilson <chris@chris-wilson.co.uk> writes: >> >> > If the request->wa_tail is 0 (because it landed exactly on the end of >> > the ringbuffer), when we reconstruct request->tail following a reset we >> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is >> > never able to catch up with RING_TAIL and the GPU spins endlessly. If >> > the ring contains a couple of breadcrumbs, even our hangcheck is unable >> > to catch the busy-looping as the ACTHD and seqno continually advance. >> >> Tail is past ring size (on hw) and the ring contents has seqno writes. >> So we will replay the ring contents over and over and seqno advances >> and wraps back to the first breadcrumbs in ring? > > Yup. It was most confusing to watch. The execlist_port[] was static, > RING_START was static, yet the seqno kept changing. I felt like I was > hallucinating. That or insomnia. /o\ When we reset_common_ring() it is always after a hw reset. So the 'last' in sense of hardware's lrc contexts doesn't mean much. So can we actually get rid of the tail trickery as for first request after reset, as the lite restore can't happen and should not matter? -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking @ 2017-03-27 11:07 ` Mika Kuoppala 0 siblings, 0 replies; 12+ messages in thread From: Mika Kuoppala @ 2017-03-27 11:07 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx, # v4 . 10+ Chris Wilson <chris@chris-wilson.co.uk> writes: > On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote: >> Chris Wilson <chris@chris-wilson.co.uk> writes: >> >> > If the request->wa_tail is 0 (because it landed exactly on the end of >> > the ringbuffer), when we reconstruct request->tail following a reset we >> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is >> > never able to catch up with RING_TAIL and the GPU spins endlessly. If >> > the ring contains a couple of breadcrumbs, even our hangcheck is unable >> > to catch the busy-looping as the ACTHD and seqno continually advance. >> >> Tail is past ring size (on hw) and the ring contents has seqno writes. >> So we will replay the ring contents over and over and seqno advances >> and wraps back to the first breadcrumbs in ring? > > Yup. It was most confusing to watch. The execlist_port[] was static, > RING_START was static, yet the seqno kept changing. I felt like I was > hallucinating. That or insomnia. /o\ When we reset_common_ring() it is always after a hw reset. So the 'last' in sense of hardware's lrc contexts doesn't mean much. So can we actually get rid of the tail trickery as for first request after reset, as the lite restore can't happen and should not matter? -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking 2017-03-27 11:07 ` Mika Kuoppala @ 2017-03-27 11:19 ` Chris Wilson -1 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 11:19 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+ On Mon, Mar 27, 2017 at 02:07:09PM +0300, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote: > >> Chris Wilson <chris@chris-wilson.co.uk> writes: > >> > >> > If the request->wa_tail is 0 (because it landed exactly on the end of > >> > the ringbuffer), when we reconstruct request->tail following a reset we > >> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is > >> > never able to catch up with RING_TAIL and the GPU spins endlessly. If > >> > the ring contains a couple of breadcrumbs, even our hangcheck is unable > >> > to catch the busy-looping as the ACTHD and seqno continually advance. > >> > >> Tail is past ring size (on hw) and the ring contents has seqno writes. > >> So we will replay the ring contents over and over and seqno advances > >> and wraps back to the first breadcrumbs in ring? > > > > Yup. It was most confusing to watch. The execlist_port[] was static, > > RING_START was static, yet the seqno kept changing. I felt like I was > > hallucinating. That or insomnia. > > /o\ > > When we reset_common_ring() it is always after a hw reset. So the > 'last' in sense of hardware's lrc contexts doesn't mean much. > > So can we actually get rid of the tail trickery as for first > request after reset, as the lite restore can't happen and > should not matter? So move handling the rare case into the latency sensitive hotpath? ;) The complaint I feel is that we don't have a great interface, otoh this manipulation is currently a one-off. -Chris -- Chris Wilson, Intel Open Source Technology Centre ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking @ 2017-03-27 11:19 ` Chris Wilson 0 siblings, 0 replies; 12+ messages in thread From: Chris Wilson @ 2017-03-27 11:19 UTC (permalink / raw) To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+ On Mon, Mar 27, 2017 at 02:07:09PM +0300, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote: > >> Chris Wilson <chris@chris-wilson.co.uk> writes: > >> > >> > If the request->wa_tail is 0 (because it landed exactly on the end of > >> > the ringbuffer), when we reconstruct request->tail following a reset we > >> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is > >> > never able to catch up with RING_TAIL and the GPU spins endlessly. If > >> > the ring contains a couple of breadcrumbs, even our hangcheck is unable > >> > to catch the busy-looping as the ACTHD and seqno continually advance. > >> > >> Tail is past ring size (on hw) and the ring contents has seqno writes. > >> So we will replay the ring contents over and over and seqno advances > >> and wraps back to the first breadcrumbs in ring? > > > > Yup. It was most confusing to watch. The execlist_port[] was static, > > RING_START was static, yet the seqno kept changing. I felt like I was > > hallucinating. That or insomnia. > > /o\ > > When we reset_common_ring() it is always after a hw reset. So the > 'last' in sense of hardware's lrc contexts doesn't mean much. > > So can we actually get rid of the tail trickery as for first > request after reset, as the lite restore can't happen and > should not matter? So move handling the rare case into the latency sensitive hotpath? ;) The complaint I feel is that we don't have a great interface, otoh this manipulation is currently a one-off. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-03-27 11:20 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-27 3:28 [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking Chris Wilson 2017-03-27 3:28 ` Chris Wilson 2017-03-27 3:28 ` [PATCH 2/2] drm/i915: Assert that the request->tail fits within the ring Chris Wilson 2017-03-27 3:45 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking Patchwork 2017-03-27 10:44 ` [Intel-gfx] [PATCH 1/2] " Mika Kuoppala 2017-03-27 10:44 ` Mika Kuoppala 2017-03-27 11:00 ` [Intel-gfx] " Chris Wilson 2017-03-27 11:00 ` Chris Wilson 2017-03-27 11:07 ` [Intel-gfx] " Mika Kuoppala 2017-03-27 11:07 ` Mika Kuoppala 2017-03-27 11:19 ` [Intel-gfx] " Chris Wilson 2017-03-27 11:19 ` Chris Wilson
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.