* [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
@ 2017-06-27 15:25 Chris Wilson
2017-06-27 15:45 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-06-28 8:59 ` [PATCH] " Tvrtko Ursulin
0 siblings, 2 replies; 6+ messages in thread
From: Chris Wilson @ 2017-06-27 15:25 UTC (permalink / raw)
To: intel-gfx
Due to the slight asynchronicity in handling the execlists interrupts
(i.e. we defer the work to a handler that may consume more than one
interrupt event), when the engine is idle we may still have an irq
tasklet queued (especially when it has been deferred to a ksoftirqd).
At the beginning of the tasklet, we assert that we do hold a device
wakeref for the access we are about to perform. This assumes that when
we idle and release the GT wakeref, all execlists work has been
completed (since the elsp tracking says the hw is idle). However, there
may still be a tasklet queued, so as we mark the engine idle, also
cancel any pending tasklet.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3b46c1f7b88b..49e875c46c96 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1328,6 +1328,7 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		intel_engine_disarm_breadcrumbs(engine);
 		i915_gem_batch_pool_fini(&engine->batch_pool);
+		tasklet_kill(&engine->irq_tasklet);
 		engine->no_priolist = false;
 	}
 }
--
2.13.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* ✓ Fi.CI.BAT: success for drm/i915: Cancel pending execlists irq handler upon idling
2017-06-27 15:25 [PATCH] drm/i915: Cancel pending execlists irq handler upon idling Chris Wilson
@ 2017-06-27 15:45 ` Patchwork
2017-06-28 8:59 ` [PATCH] " Tvrtko Ursulin
1 sibling, 0 replies; 6+ messages in thread
From: Patchwork @ 2017-06-27 15:45 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Cancel pending execlists irq handler upon idling
URL : https://patchwork.freedesktop.org/series/26441/
State : success
== Summary ==
Series 26441v1 drm/i915: Cancel pending execlists irq handler upon idling
https://patchwork.freedesktop.org/api/1.0/series/26441/revisions/1/mbox/
Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                fail       -> PASS       (fi-snb-2600) fdo#100007
Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-kbl-r) fdo#100125
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-legacy:
                fail       -> PASS       (fi-snb-2600) fdo#100215
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                pass       -> DMESG-WARN (fi-pnv-d510) fdo#101597 +1
fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#101597 https://bugs.freedesktop.org/show_bug.cgi?id=101597
fi-bdw-5557u total:279 pass:268 dwarn:0 dfail:0 fail:0 skip:11 time:443s
fi-bdw-gvtdvm total:279 pass:257 dwarn:8 dfail:0 fail:0 skip:14 time:433s
fi-blb-e6850 total:279 pass:224 dwarn:1 dfail:0 fail:0 skip:54 time:353s
fi-bsw-n3050 total:279 pass:242 dwarn:1 dfail:0 fail:0 skip:36 time:524s
fi-bxt-j4205 total:279 pass:260 dwarn:0 dfail:0 fail:0 skip:19 time:508s
fi-byt-j1900 total:279 pass:253 dwarn:2 dfail:0 fail:0 skip:24 time:499s
fi-byt-n2820 total:279 pass:249 dwarn:2 dfail:0 fail:0 skip:28 time:486s
fi-glk-2a total:279 pass:260 dwarn:0 dfail:0 fail:0 skip:19 time:600s
fi-hsw-4770 total:279 pass:263 dwarn:0 dfail:0 fail:0 skip:16 time:435s
fi-hsw-4770r total:279 pass:263 dwarn:0 dfail:0 fail:0 skip:16 time:419s
fi-ilk-650 total:279 pass:229 dwarn:0 dfail:0 fail:0 skip:50 time:419s
fi-ivb-3520m total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:494s
fi-ivb-3770 total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:470s
fi-kbl-7500u total:279 pass:261 dwarn:0 dfail:0 fail:0 skip:18 time:471s
fi-kbl-7560u total:279 pass:269 dwarn:0 dfail:0 fail:0 skip:10 time:571s
fi-kbl-r total:279 pass:260 dwarn:1 dfail:0 fail:0 skip:18 time:585s
fi-pnv-d510 total:279 pass:222 dwarn:2 dfail:0 fail:0 skip:55 time:556s
fi-skl-6260u total:279 pass:269 dwarn:0 dfail:0 fail:0 skip:10 time:451s
fi-skl-6700hq total:279 pass:223 dwarn:1 dfail:0 fail:30 skip:24 time:340s
fi-skl-6700k total:279 pass:257 dwarn:4 dfail:0 fail:0 skip:18 time:464s
fi-skl-6770hq total:279 pass:269 dwarn:0 dfail:0 fail:0 skip:10 time:489s
fi-skl-gvtdvm total:279 pass:266 dwarn:0 dfail:0 fail:0 skip:13 time:433s
fi-snb-2520m total:279 pass:251 dwarn:0 dfail:0 fail:0 skip:28 time:536s
fi-snb-2600 total:279 pass:250 dwarn:0 dfail:0 fail:0 skip:29 time:402s
bf26e1dbbba24a7697559f1131d4be99747b7646 drm-tip: 2017y-06m-27d-13h-59m-07s UTC integration manifest
1e7aaf7 drm/i915: Cancel pending execlists irq handler upon idling
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_5051/
* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
2017-06-27 15:25 [PATCH] drm/i915: Cancel pending execlists irq handler upon idling Chris Wilson
2017-06-27 15:45 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-06-28 8:59 ` Tvrtko Ursulin
2017-06-28 10:01 ` Chris Wilson
1 sibling, 1 reply; 6+ messages in thread
From: Tvrtko Ursulin @ 2017-06-28 8:59 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 27/06/2017 16:25, Chris Wilson wrote:
> Due to the slight asynchronicity in handling the execlists interrupts
> (i.e. we defer the work to a handler that may consume more than one
> interrupt event), when the engine is idle we may still have an irq
> tasklet queued (especially when it has been deferred to a ksoftirqd).
> At the beginning of the tasklet, we assert that we do hold a device
> wakeref for the access we are about to perform. This assumes that when
> we idle and release the GT wakeref, all execlists work has been
> completed (since the elsp tracking says the hw is idle). However, there
> may still be a tasklet queued, so as we mark the engine idle, also
> cancel any pending tasklet.
We check the irq_posted bit which should correspond with a pending
tasklet (intel_engines_are_idle/intel_engine_is_idle), before
transitioning to idle so I don't understand this.
Regards,
Tvrtko
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 3b46c1f7b88b..49e875c46c96 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1328,6 +1328,7 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
>  	for_each_engine(engine, i915, id) {
>  		intel_engine_disarm_breadcrumbs(engine);
>  		i915_gem_batch_pool_fini(&engine->batch_pool);
> +		tasklet_kill(&engine->irq_tasklet);
>  		engine->no_priolist = false;
>  	}
>  }
>
* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
2017-06-28 8:59 ` [PATCH] " Tvrtko Ursulin
@ 2017-06-28 10:01 ` Chris Wilson
2017-06-28 10:15 ` Tvrtko Ursulin
0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2017-06-28 10:01 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2017-06-28 09:59:04)
>
> On 27/06/2017 16:25, Chris Wilson wrote:
> > Due to the slight asynchronicity in handling the execlists interrupts
> > (i.e. we defer the work to a handler that may consume more than one
> > interrupt event), when the engine is idle we may still have an irq
> > tasklet queued (especially when it has been deferred to a ksoftirqd).
> > At the beginning of the tasklet, we assert that we do hold a device
> > wakeref for the access we are about to perform. This assumes that when
> > we idle and release the GT wakeref, all execlists work has been
> > completed (since the elsp tracking says the hw is idle). However, there
> > may still be a tasklet queued, so as we mark the engine idle, also
> > cancel any pending tasklet.
>
> We check the irq_posted bit which should correspond with a pending
> tasklet (intel_engines_are_idle/intel_engine_is_idle), before
> transitioning to idle so I don't understand this.
Exactly, we've processed the interrupt in the current irq handler, but
due to the ordering (which is essential to ensure that we don't miss an
interrupt, i.e. the strong ordering is via the tasklet atomic ops so
that each interrupt is always followed by a tasklet, at least if we do
have elsp[]!) we can queue a second tasklet execution despite it already
being handled concurrently.
Run long enough and this rare event will then coincide with idling.
-Chris
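
The interleaving described above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `model_engine` and every `model_*` function are hypothetical names invented for illustration, the two booleans stand in for the irq_posted bit and the tasklet's scheduled state, and the sequence is run single-threaded rather than concurrently. It is not the i915 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; not the i915 structures. */
struct model_engine {
	bool irq_posted;     /* set by the interrupt handler, consumed by the tasklet */
	bool tasklet_queued; /* stands in for the tasklet's "scheduled" state bit */
};

/* Interrupt handler: post the event first, then queue the tasklet. */
static void model_irq(struct model_engine *e)
{
	e->irq_posted = true;
	e->tasklet_queued = true; /* no effect if it was already queued */
}

/*
 * Softirq core: the queued state is cleared before the body runs, so an
 * interrupt arriving from here on queues a further execution.
 */
static void model_tasklet_begin(struct model_engine *e)
{
	assert(e->tasklet_queued);
	e->tasklet_queued = false;
}

/* Tasklet body: consume whatever has been posted so far. */
static void model_tasklet_body(struct model_engine *e)
{
	e->irq_posted = false;
}

/* Idle check in this model: only the posted bit is inspected. */
static bool model_engine_is_idle(const struct model_engine *e)
{
	return !e->irq_posted;
}

/*
 * Replays the interleaving and reports whether a tasklet is still queued
 * at the moment the idle check passes.
 */
static bool model_race_leaves_tasklet_queued(void)
{
	struct model_engine e = { false, false };

	model_irq(&e);           /* first interrupt */
	model_tasklet_begin(&e); /* tasklet dequeued, body about to run */
	model_irq(&e);           /* second interrupt lands before the body runs */
	model_tasklet_body(&e);  /* one pass consumes the posted bit for both */

	return model_engine_is_idle(&e) && e.tasklet_queued;
}
```

In this model the idle check passes while one more tasklet execution remains queued, which is the state the patch addresses.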
* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
2017-06-28 10:01 ` Chris Wilson
@ 2017-06-28 10:15 ` Tvrtko Ursulin
2017-06-28 10:29 ` Chris Wilson
0 siblings, 1 reply; 6+ messages in thread
From: Tvrtko Ursulin @ 2017-06-28 10:15 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 28/06/2017 11:01, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-06-28 09:59:04)
>>
>> On 27/06/2017 16:25, Chris Wilson wrote:
>>> Due to the slight asynchronicity in handling the execlists interrupts
>>> (i.e. we defer the work to a handler that may consume more than one
>>> interrupt event), when the engine is idle we may still have an irq
>>> tasklet queued (especially when it has been deferred to a ksoftirqd).
>>> At the beginning of the tasklet, we assert that we do hold a device
>>> wakeref for the access we are about to perform. This assumes that when
>>> we idle and release the GT wakeref, all execlists work has been
>>> completed (since the elsp tracking says the hw is idle). However, there
>>> may still be a tasklet queued, so as we mark the engine idle, also
>>> cancel any pending tasklet.
>>
>> We check the irq_posted bit which should correspond with a pending
>> tasklet (intel_engines_are_idle/intel_engine_is_idle), before
>> transitioning to idle so I don't understand this.
>
> Exactly, we've processed the interrupt in the current irq handler, but
> due to the ordering (which is essential to ensure that we don't miss an
> interrupt, i.e. the strong ordering is via the tasklet atomic ops so
> that each interrupt is always followed by a tasklet, at least if we do
> have elsp[]!) we can queue a second tasklet execution despite it already
> being handled concurrently.
>
> Run long enough and this rare event will then coincide with idling.
Ah rings a bell now. That would mean the irq_posted bit being consumed
by the current tasklet, and then the next one coming along as designed.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
2017-06-28 10:15 ` Tvrtko Ursulin
@ 2017-06-28 10:29 ` Chris Wilson
0 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2017-06-28 10:29 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2017-06-28 11:15:36)
>
> On 28/06/2017 11:01, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-06-28 09:59:04)
> >>
> >> On 27/06/2017 16:25, Chris Wilson wrote:
> >>> Due to the slight asynchronicity in handling the execlists interrupts
> >>> (i.e. we defer the work to a handler that may consume more than one
> >>> interrupt event), when the engine is idle we may still have an irq
> >>> tasklet queued (especially when it has been deferred to a ksoftirqd).
> >>> At the beginning of the tasklet, we assert that we do hold a device
> >>> wakeref for the access we are about to perform. This assumes that when
> >>> we idle and release the GT wakeref, all execlists work has been
> >>> completed (since the elsp tracking says the hw is idle). However, there
> >>> may still be a tasklet queued, so as we mark the engine idle, also
> >>> cancel any pending tasklet.
> >>
> >> We check the irq_posted bit which should correspond with a pending
> >> tasklet (intel_engines_are_idle/intel_engine_is_idle), before
> >> transitioning to idle so I don't understand this.
> >
> > Exactly, we've processed the interrupt in the current irq handler, but
> > due to the ordering (which is essential to ensure that we don't miss an
> > interrupt, i.e. the strong ordering is via the tasklet atomic ops so
> > that each interrupt is always followed by a tasklet, at least if we do
> > have elsp[]!) we can queue a second tasklet execution despite it already
> > being handled concurrently.
> >
> > Run long enough and this rare event will then coincide with idling.
>
> Ah rings a bell now. That would mean the irq_posted bit being consumed
> by the current tasklet, and then the next one coming along as designed.
Phew, I didn't have to draw the ascii flow chart. Now, for the final
execution that triggers the bug, we shouldn't actually touch hw (the irq
posted bit should be clear, and the execlists queue should also be
empty), so we could argue that the BUG_ON is in error. However, I think
it is a useful BUG_ON to document the intent of the wakeref being held
on our behalf, and cancelling the extra work seems sensible.
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Ta, consider it pushed. Not sure if CI has seen it yet, so no bugzilla
to close.
-Chris
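
The effect of the added call on the idle path can be sketched in the same userspace style. The names `model_gt` and `model_*` are hypothetical, and the sketch assumes that tasklet_kill() returns only once the tasklet is neither queued nor running; that is modelled here by letting the queued execution finish while the wakeref is still held. The counter marks the point where the driver's wakeref assertion would fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the idle path; not the i915 structures. */
struct model_gt {
	bool wakeref_held;
	bool tasklet_queued;
	int runs_without_wakeref; /* executions that would trip the wakeref assertion */
};

/* Run the tasklet if one is queued, recording whether the wakeref was held. */
static void model_run_tasklet(struct model_gt *gt)
{
	if (!gt->tasklet_queued)
		return;
	gt->tasklet_queued = false;
	if (!gt->wakeref_held)
		gt->runs_without_wakeref++;
}

/* Stand-in for tasklet_kill(): on return nothing is queued or running. */
static void model_tasklet_kill(struct model_gt *gt)
{
	model_run_tasklet(gt);
}

/*
 * Idle the model with one spurious tasklet still queued, with or without
 * the extra kill, and return how many executions ran without a wakeref.
 */
static int model_idle(bool kill_on_idle)
{
	struct model_gt gt = { true, true, 0 };

	if (kill_on_idle)
		model_tasklet_kill(&gt); /* the step the patch adds */
	gt.wakeref_held = false;         /* wakeref released once idle */
	model_run_tasklet(&gt);          /* the deferred execution arrives late */

	assert(!gt.tasklet_queued);
	return gt.runs_without_wakeref;
}
```

Without the kill, the late execution runs after the wakeref has been dropped; with it, nothing is left to run by then.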