* [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
@ 2017-03-27 3:28 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 3:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Chris Wilson, Mika Kuoppala, # v4 . 10+
If the request->wa_tail is 0 (because it landed exactly on the end of
the ringbuffer), when we reconstruct request->tail following a reset we
fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
never able to catch up with RING_TAIL and the GPU spins endlessly. If
the ring contains a couple of breadcrumbs, even our hangcheck is unable
to catch the busy-looping as the ACTHD and seqno continually advance.
Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: <stable@vger.kernel.org> # v4.10+
---
drivers/gpu/drm/i915/intel_lrc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3fdabba0a32d..c8dd848d2ebe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
/* Reset WaIdleLiteRestore:bdw,skl as well */
request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
+ request->tail &= request->ring->size - 1;
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
}
--
2.11.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
@ 2017-03-27 3:28 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 3:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Mika Kuoppala, # v4 . 10+
If the request->wa_tail is 0 (because it landed exactly on the end of
the ringbuffer), when we reconstruct request->tail following a reset we
fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
never able to catch up with RING_TAIL and the GPU spins endlessly. If
the ring contains a couple of breadcrumbs, even our hangcheck is unable
to catch the busy-looping as the ACTHD and seqno continually advance.
Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: <stable@vger.kernel.org> # v4.10+
---
drivers/gpu/drm/i915/intel_lrc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3fdabba0a32d..c8dd848d2ebe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
/* Reset WaIdleLiteRestore:bdw,skl as well */
request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
+ request->tail &= request->ring->size - 1;
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
}
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] drm/i915: Assert that the request->tail fits within the ring
2017-03-27 3:28 ` Chris Wilson
(?)
@ 2017-03-27 3:28 ` Chris Wilson
-1 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 3:28 UTC (permalink / raw)
To: intel-gfx
In addition to being qword-aligned, the RING_TAIL offset must be within
the ring!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/intel_lrc.c | 4 ++++
drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
2 files changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c8dd848d2ebe..3b84fbb7da5d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -327,6 +327,7 @@ static u64 execlists_update_context(struct drm_i915_gem_request *rq)
u32 *reg_state = ce->lrc_reg_state;
GEM_BUG_ON(!IS_ALIGNED(rq->tail, 8));
+ GEM_BUG_ON(rq->tail >= rq->ring->size);
reg_state[CTX_RING_TAIL+1] = rq->tail;
/* True 32b PPGTT with dynamic page allocation: update PDP
@@ -1304,6 +1305,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
request->tail &= request->ring->size - 1;
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
+ GEM_BUG_ON(request->tail >= request->ring->size);
}
static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req)
@@ -1515,6 +1517,7 @@ static void gen8_emit_breadcrumb(struct drm_i915_gem_request *request, u32 *cs)
*cs++ = MI_NOOP;
request->tail = intel_ring_offset(request, cs);
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
+ GEM_BUG_ON(request->tail >= request->ring->size);
gen8_emit_wa_tail(request, cs);
}
@@ -1543,6 +1546,7 @@ static void gen8_emit_breadcrumb_render(struct drm_i915_gem_request *request,
*cs++ = MI_NOOP;
request->tail = intel_ring_offset(request, cs);
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
+ GEM_BUG_ON(request->tail >= request->ring->size);
gen8_emit_wa_tail(request, cs);
}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4729ac7ac122..47921dcbedb3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -775,6 +775,7 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
i915_gem_request_submit(request);
GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
+ GEM_BUG_ON(request->tail >= request->ring->size);
I915_WRITE_TAIL(request->engine, request->tail);
}
@@ -787,6 +788,7 @@ static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
req->tail = intel_ring_offset(req, cs);
GEM_BUG_ON(!IS_ALIGNED(req->tail, 8));
+ GEM_BUG_ON(req->tail >= req->ring->size);
}
static const int i9xx_emit_breadcrumb_sz = 4;
@@ -826,6 +828,7 @@ static void gen8_render_emit_breadcrumb(struct drm_i915_gem_request *req,
req->tail = intel_ring_offset(req, cs);
GEM_BUG_ON(!IS_ALIGNED(req->tail, 8));
+ GEM_BUG_ON(req->tail >= req->ring->size);
}
static const int gen8_render_emit_breadcrumb_sz = 8;
--
2.11.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 12+ messages in thread
* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
2017-03-27 3:28 ` Chris Wilson
(?)
(?)
@ 2017-03-27 3:45 ` Patchwork
-1 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-03-27 3:45 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
URL : https://patchwork.freedesktop.org/series/21907/
State : success
== Summary ==
Series 21907v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/21907/revisions/1/mbox/
Test gem_exec_fence:
Subgroup await-hang-default:
pass -> INCOMPLETE (fi-ilk-650) fdo#100144
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
dmesg-warn -> PASS (fi-byt-n2820) fdo#100126
fdo#100144 https://bugs.freedesktop.org/show_bug.cgi?id=100144
fdo#100126 https://bugs.freedesktop.org/show_bug.cgi?id=100126
fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time: 455s
fi-bdw-gvtdvm total:278 pass:256 dwarn:8 dfail:0 fail:0 skip:14 time: 456s
fi-bsw-n3050 total:278 pass:239 dwarn:0 dfail:0 fail:0 skip:39 time: 580s
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time: 540s
fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time: 567s
fi-byt-j1900 total:278 pass:251 dwarn:0 dfail:0 fail:0 skip:27 time: 507s
fi-byt-n2820 total:278 pass:247 dwarn:0 dfail:0 fail:0 skip:31 time: 511s
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 438s
fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time: 440s
fi-ilk-650 total:48 pass:27 dwarn:0 dfail:0 fail:0 skip:20 time: 0s
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 515s
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 496s
fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time: 485s
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 487s
fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time: 603s
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time: 495s
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time: 532s
fi-skl-gvtdvm total:278 pass:265 dwarn:0 dfail:0 fail:0 skip:13 time: 464s
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time: 554s
fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time: 414s
295dfed68b4cc63cdc1747a915e0099a32bf0955 drm-tip: 2017y-03m-26d-01h-26m-45s UTC integration manifest
06fca80 drm/i915: Assert that the request->tail fits within the ring
5022d1b drm/i915/execlists: Wrap tail pointer after reset tweaking
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4309/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
2017-03-27 3:28 ` Chris Wilson
@ 2017-03-27 10:44 ` Mika Kuoppala
-1 siblings, 0 replies; 12+ messages in thread
From: Mika Kuoppala @ 2017-03-27 10:44 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: # v4 . 10+
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If the request->wa_tail is 0 (because it landed exactly on the end of
> the ringbuffer), when we reconstruct request->tail following a reset we
> fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> never able to catch up with RING_TAIL and the GPU spins endlessly. If
> the ring contains a couple of breadcrumbs, even our hangcheck is unable
> to catch the busy-looping as the ACTHD and seqno continually advance.
Tail is past ring size (on hw) and the ring contents has seqno writes.
So we will replay the ring contents over and over and seqno advances
and wraps back to the first breadcrumbs in ring?
> Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: <stable@vger.kernel.org> # v4.10+
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3fdabba0a32d..c8dd848d2ebe 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>
> /* Reset WaIdleLiteRestore:bdw,skl as well */
> request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
> + request->tail &= request->ring->size - 1;
> GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> }
>
> --
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
@ 2017-03-27 10:44 ` Mika Kuoppala
0 siblings, 0 replies; 12+ messages in thread
From: Mika Kuoppala @ 2017-03-27 10:44 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: # v4 . 10+
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If the request->wa_tail is 0 (because it landed exactly on the end of
> the ringbuffer), when we reconstruct request->tail following a reset we
> fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> never able to catch up with RING_TAIL and the GPU spins endlessly. If
> the ring contains a couple of breadcrumbs, even our hangcheck is unable
> to catch the busy-looping as the ACTHD and seqno continually advance.
Tail is past ring size (on hw) and the ring contents has seqno writes.
So we will replay the ring contents over and over and seqno advances
and wraps back to the first breadcrumbs in ring?
> Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: <stable@vger.kernel.org> # v4.10+
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3fdabba0a32d..c8dd848d2ebe 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1302,6 +1302,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>
> /* Reset WaIdleLiteRestore:bdw,skl as well */
> request->tail = request->wa_tail - WA_TAIL_DWORDS * sizeof(u32);
> + request->tail &= request->ring->size - 1;
> GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
> }
>
> --
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
2017-03-27 10:44 ` Mika Kuoppala
@ 2017-03-27 11:00 ` Chris Wilson
-1 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 11:00 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+
On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > If the request->wa_tail is 0 (because it landed exactly on the end of
> > the ringbuffer), when we reconstruct request->tail following a reset we
> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
> > to catch the busy-looping as the ACTHD and seqno continually advance.
>
> Tail is past ring size (on hw) and the ring contents has seqno writes.
> So we will replay the ring contents over and over and seqno advances
> and wraps back to the first breadcrumbs in ring?
Yup. It was most confusing to watch. The execlist_port[] was static,
RING_START was static, yet the seqno kept changing. I felt like I was
hallucinating. That or insomnia.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
@ 2017-03-27 11:00 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 11:00 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+
On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > If the request->wa_tail is 0 (because it landed exactly on the end of
> > the ringbuffer), when we reconstruct request->tail following a reset we
> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
> > to catch the busy-looping as the ACTHD and seqno continually advance.
>
> Tail is past ring size (on hw) and the ring contents has seqno writes.
> So we will replay the ring contents over and over and seqno advances
> and wraps back to the first breadcrumbs in ring?
Yup. It was most confusing to watch. The execlist_port[] was static,
RING_START was static, yet the seqno kept changing. I felt like I was
hallucinating. That or insomnia.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
2017-03-27 11:00 ` Chris Wilson
@ 2017-03-27 11:07 ` Mika Kuoppala
-1 siblings, 0 replies; 12+ messages in thread
From: Mika Kuoppala @ 2017-03-27 11:07 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, # v4 . 10+
Chris Wilson <chris@chris-wilson.co.uk> writes:
> On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>> > If the request->wa_tail is 0 (because it landed exactly on the end of
>> > the ringbuffer), when we reconstruct request->tail following a reset we
>> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
>> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
>> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
>> > to catch the busy-looping as the ACTHD and seqno continually advance.
>>
>> Tail is past ring size (on hw) and the ring contents has seqno writes.
>> So we will replay the ring contents over and over and seqno advances
>> and wraps back to the first breadcrumbs in ring?
>
> Yup. It was most confusing to watch. The execlist_port[] was static,
> RING_START was static, yet the seqno kept changing. I felt like I was
> hallucinating. That or insomnia.
/o\
When we reset_common_ring() it is always after a hw reset. So the
'last' in sense of hardware's lrc contexts doesn't mean much.
So can we actually get rid of the tail trickery as for first
request after reset, as the lite restore can't happen and
should not matter?
-Mika
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
@ 2017-03-27 11:07 ` Mika Kuoppala
0 siblings, 0 replies; 12+ messages in thread
From: Mika Kuoppala @ 2017-03-27 11:07 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, # v4 . 10+
Chris Wilson <chris@chris-wilson.co.uk> writes:
> On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>> > If the request->wa_tail is 0 (because it landed exactly on the end of
>> > the ringbuffer), when we reconstruct request->tail following a reset we
>> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
>> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
>> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
>> > to catch the busy-looping as the ACTHD and seqno continually advance.
>>
>> Tail is past ring size (on hw) and the ring contents has seqno writes.
>> So we will replay the ring contents over and over and seqno advances
>> and wraps back to the first breadcrumbs in ring?
>
> Yup. It was most confusing to watch. The execlist_port[] was static,
> RING_START was static, yet the seqno kept changing. I felt like I was
> hallucinating. That or insomnia.
/o\
When we reset_common_ring() it is always after a hw reset. So the
'last' in sense of hardware's lrc contexts doesn't mean much.
So can we actually get rid of the tail trickery as for first
request after reset, as the lite restore can't happen and
should not matter?
-Mika
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Intel-gfx] [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
2017-03-27 11:07 ` Mika Kuoppala
@ 2017-03-27 11:19 ` Chris Wilson
-1 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 11:19 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+
On Mon, Mar 27, 2017 at 02:07:09PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >>
> >> > If the request->wa_tail is 0 (because it landed exactly on the end of
> >> > the ringbuffer), when we reconstruct request->tail following a reset we
> >> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> >> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
> >> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
> >> > to catch the busy-looping as the ACTHD and seqno continually advance.
> >>
> >> Tail is past ring size (on hw) and the ring contents has seqno writes.
> >> So we will replay the ring contents over and over and seqno advances
> >> and wraps back to the first breadcrumbs in ring?
> >
> > Yup. It was most confusing to watch. The execlist_port[] was static,
> > RING_START was static, yet the seqno kept changing. I felt like I was
> > hallucinating. That or insomnia.
>
> /o\
>
> When we reset_common_ring() it is always after a hw reset. So the
> 'last' in sense of hardware's lrc contexts doesn't mean much.
>
> So can we actually get rid of the tail trickery as for first
> request after reset, as the lite restore can't happen and
> should not matter?
So move handling the rare case into the latency sensitive hotpath? ;)
The complaint I feel is that we don't have a great interface, otoh this
manipulation is currently a one-off.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking
@ 2017-03-27 11:19 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-03-27 11:19 UTC (permalink / raw)
To: Mika Kuoppala; +Cc: intel-gfx, # v4 . 10+
On Mon, Mar 27, 2017 at 02:07:09PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > On Mon, Mar 27, 2017 at 01:44:00PM +0300, Mika Kuoppala wrote:
> >> Chris Wilson <chris@chris-wilson.co.uk> writes:
> >>
> >> > If the request->wa_tail is 0 (because it landed exactly on the end of
> >> > the ringbuffer), when we reconstruct request->tail following a reset we
> >> > fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is
> >> > never able to catch up with RING_TAIL and the GPU spins endlessly. If
> >> > the ring contains a couple of breadcrumbs, even our hangcheck is unable
> >> > to catch the busy-looping as the ACTHD and seqno continually advance.
> >>
> >> Tail is past ring size (on hw) and the ring contents has seqno writes.
> >> So we will replay the ring contents over and over and seqno advances
> >> and wraps back to the first breadcrumbs in ring?
> >
> > Yup. It was most confusing to watch. The execlist_port[] was static,
> > RING_START was static, yet the seqno kept changing. I felt like I was
> > hallucinating. That or insomnia.
>
> /o\
>
> When we reset_common_ring() it is always after a hw reset. So the
> 'last' in sense of hardware's lrc contexts doesn't mean much.
>
> So can we actually get rid of the tail trickery as for first
> request after reset, as the lite restore can't happen and
> should not matter?
So move handling the rare case into the latency sensitive hotpath? ;)
The complaint I feel is that we don't have a great interface, otoh this
manipulation is currently a one-off.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-03-27 11:20 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-27 3:28 [PATCH 1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking Chris Wilson
2017-03-27 3:28 ` Chris Wilson
2017-03-27 3:28 ` [PATCH 2/2] drm/i915: Assert that the request->tail fits within the ring Chris Wilson
2017-03-27 3:45 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/execlists: Wrap tail pointer after reset tweaking Patchwork
2017-03-27 10:44 ` [Intel-gfx] [PATCH 1/2] " Mika Kuoppala
2017-03-27 10:44 ` Mika Kuoppala
2017-03-27 11:00 ` [Intel-gfx] " Chris Wilson
2017-03-27 11:00 ` Chris Wilson
2017-03-27 11:07 ` [Intel-gfx] " Mika Kuoppala
2017-03-27 11:07 ` Mika Kuoppala
2017-03-27 11:19 ` [Intel-gfx] " Chris Wilson
2017-03-27 11:19 ` Chris Wilson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.